In this installment I'm going to cover where we are at, the good and the bad, and set it up for where we're going.
The ETL window is still at 14 hours but over the last 9 months we have at least kept it from it's steady rise. We recognized a need a few years back to concentrate on capturing and processing only the changed (delta) rows from Banner. At the time we were still on Oracle 9i so some of our options were limited. We could not use a logical standby because Banner has LONG and RAW datatypes and in 9i those were not supported with a logical standby. We couldn't use triggers because of our prime directive as stated in the previous post. If we avoided the tables with LONG and RAW datatypes we thought we might be able to use Oracle Streams. It was new under 9i but we put up a test Streams environment and played with it and it looked promising. We did the requisite paper work to submit the idea to upper management in both departments, got a pilot approved and started to plan the pilot. We had hoped to bring in an Oracle consultant for the Streams pilot but that all fell through and we ended up doing it ourselves by digging into the manuals. We were heading for 10g so that made it even better because 10g Streams could handle all the Banner datatypes. Along the way were told by some experts on a conference call to wait for 10gR2 due to some bugginess in 10gR1 for Streams in particular. This slowed us own a bit as well. Finally in May of 2006 the pilot began in earnest.
Streams did not deliver the delta of course, but it replicated the data and we put triggers in our target environment to capture the delta. We could have used Change Data Capture under 10g which uses the same basic Streams technology to capture just the delta but we needed the full row for the transformations so we chose to continue with Streams. We also decided early on that the biggest benefit to Streams under 10g was the ability to perform the capture downstream on a different host from the Banner host so as not to add any load to our Banner production database. Being efficient we also chose to try to re-use the same logs shipped to one of our physical standbys rather than add another destination to the prod DB. That decision, while saving disk space and some overhead on the source server to ship the logs to another destination, proved to be a bad decision for several reasons. First that we could find no one else who was doing downstream capture who could help us, second since we did not get a consultant and had to follow the manual we were off the map with our log re-use decision. The manual gives you very precise steps for setting things up in a downstream mode but in an effort to avoid outages and impact to the Banner database we decided to populate our Streams tables from our physical standby. We would stop the standby, put it in read-only mode and have the ETL developers copy the data from it to our staging database which was our Streams target database. We fought a ton of problems here. Some ETL maps had logic in them and did not copy all the rows. Streams did not like that when it could not find a row it expected to be on the target and it caused us a ton of grief. Another problem was the instantiation SCN. When you set up the Capture rules for the table you populated it runs a command on the source DB, the real source not the standby, and that SCN and where we had stopped the standby were not the same.
We had decided early on to bite off this problem one small chunk of tables at a time. To start streaming the data, then turn off the extract and change the transform logic to process the delta rows captured by our triggers in our staging database off the data arriving via Streams. Well, the best way to put this is that we continued to shoot ourselves in the feet with really dumb errors from typos that caused the Streams setup to have missing pieces, to ETL maps that weren't checked in, to errors in the reworked transforms, to you name it. Our decision to take this in small chunks versus a big bang implementation of Streams was also a bad idea.
So there are two key items for anyone considering Oracle Streams. If you're going to use a downstream capture then follow the stinking manual and hang the efficiencies of re-usable logs. Second, determine all the tables you want to Stream up front and set them up in Streams in one fell swoop during an outage on your primary database. The ETL or whatever that will use the data on the targets can do their bits a little at a time but Streams doesn't play well here. The reason is that Streams enforces transactional integrity. So, if you configure 5 tables to start with, which is what we did, and then when you add the second set of 4 tables and make a typo like we did, Streams errors out on one of the new tables but because that table was in a transaction with two of the first 5 tables those tables are also affected. So Streams held up the entire transaction which means that our simple typo mistake cascaded backward to tables that had been running well in Streams before. Another time the ETL truncated and overwrote our Streams table which did not make Streams real happy nor those of us on the project. If we had configured all the tables up front we would not have had as many cascading problems pop up as we did.
Even with so many problems and bad decisions there is some good in all of this. First as I said we have stopped the rising ETL times and secondly Streams has proven to be very reliable. The problems we have faced are self-inflicted. Also as a side benefit this project has caused the ETL folks to look at their processes in depth and in doing so they found some problems and also learned some things as well. This will pay off in the future as some bigger changes are coming. The next installment will cover where we're going from here now that we've learned a lot from our mistakes.
Recent comments
1 day 18 hours ago
1 week 5 days ago
2 weeks 9 min ago
2 weeks 21 hours ago
3 weeks 3 days ago
3 weeks 6 days ago
5 weeks 5 days ago
5 weeks 6 days ago
6 weeks 16 hours ago
6 weeks 19 hours ago