Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

June 3, 2008

And the worst data migration practices Oscar goes to…

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 5:57 pm

Siebel! Yes, the CRM software Oracle has owned for some time:
http://www.oracle.com/applications/crm/siebel/index.html

I think this application is one of the best examples of Kyte's "database independent" black-box applications, which usually means designed to work on every database platform but born to perform and scale badly on all of them. The reason is simple: it does not use any features of the database software you have already purchased, such as partitioning, parallel query, or even constraints.

Oh, but Siebel loves indexes. Believe it or not, they indexed nearly every column of their heavy-write OLTP tables, so I think they earned this Oscar properly :)

Last week my help was requested for a Siebel data migration. After several minutes of monitoring, I thought of leaving the building immediately. The migration utility inside Siebel did not allow us to define hints such as APPEND, did not support the parallel query option, and did not allow indexes on huge tables to be dropped, so the huge UPDATEs and INSERTs produced an enormous amount of redo. And the root-cause conclusion was of course easy: the I/O capability of the system was bad :)
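
If you want to see this for yourself, the redo volume a conventional-path loading session generates can be watched from the session itself; a minimal sketch using the standard V$ views:

    -- How much redo has my session generated so far?
    SELECT n.name, s.value
      FROM v$mystat s, v$statname n
     WHERE s.statistic# = n.statistic#
       AND n.name = 'redo size';

Run it before and after a batch; the difference is the number of bytes of redo written on your behalf. Direct-path inserts into NOLOGGING segments would bring this figure close to zero.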

The only things I was able to help with were taking the availability risk and keeping the target database in NOARCHIVELOG mode, creating huge (10 GB each) single-membered redo log files, re-collecting appropriate optimizer statistics on the source, and increasing the buffer cache on both ends up to the physical limits of the machines. Before these actions the group was unhappy with 500,000 rows per hour of processing; afterwards it was 3,000,000 rows per hour and everybody was thankful. But I believe this 20-days-and-nights data migration could have finished within hours if only Siebel were an Oracle-database-friendly application, meaning if it allowed dropping indexes before the migration starts and rebuilding them after it ends, and allowed CREATE TABLE AS SELECT in PARALLEL with NOLOGGING instead of these huge UPDATEs and INSERTs. I have seen terabytes of data moved around within hours with these simple data migration best practices.
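
For the record, the instance-level part of that tuning maps to a handful of statements like the ones below. This is a sketch only: the file paths, sizes and schema name are from this case, and source_db is a hypothetical database link standing in for the source system:

    -- Accept the availability risk: no archived redo during the migration
    SHUTDOWN IMMEDIATE
    STARTUP MOUNT
    ALTER DATABASE NOARCHIVELOG;
    ALTER DATABASE OPEN;

    -- Huge single-membered redo log groups to cut log switch waits
    ALTER DATABASE ADD LOGFILE GROUP 4 ('/u01/oradata/migdb/redo04.log') SIZE 10G;
    ALTER DATABASE ADD LOGFILE GROUP 5 ('/u01/oradata/migdb/redo05.log') SIZE 10G;

    -- Fresh optimizer statistics on the source schema
    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'SIEBEL', cascade => TRUE);

    -- And what a database-friendly tool could do instead of the huge
    -- UPDATEs and INSERTs: a direct-path, parallel, minimally logged copy
    CREATE TABLE s_contact_stg PARALLEL 8 NOLOGGING AS
      SELECT /*+ PARALLEL(c 8) */ * FROM s_contact@source_db c;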

So let's hope that some day Oracle will force the tools it buys to offer an Oracle-friendly version as well, and let its customers utilize what they have already purchased.

PS: If you want to continue reading about my data migration experiences, please check these out:

Direct Path Load Operations versus Integrity Constraints, Triggers and Instance Recovery
http://tinyurl.com/6xwxyt

I am back :)
http://tinyurl.com/6gk2wo

Some Best Practices of Upgrading Oracle Database
http://tinyurl.com/5ds6hv

4 Comments

  1. Unfortunately you have been misinformed.

    For initial data loads, you are allowed to drop non-unique, unused indexes on Siebel base tables and reinstate them (in parallel/nologging for speed) after the load is complete.
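
    In plain SQL that is something like the following; the index name and columns are illustrative, not a real repository index:

        -- Before the initial load: drop a non-unique index the load does not need
        DROP INDEX siebel.s_contact_x1;

        -- ... run the EIM load ...

        -- Afterwards: reinstate it quickly, then restore the normal attributes
        CREATE INDEX siebel.s_contact_x1
          ON siebel.s_contact (last_name, fst_name)
          PARALLEL 8 NOLOGGING;
        ALTER INDEX siebel.s_contact_x1 NOPARALLEL;
        ALTER INDEX siebel.s_contact_x1 LOGGING;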

    EIM is a fairly deterministic ETL engine, and as long as you specify USE INDEX HINTS = TRUE there often isn't much scope for additional tuning of the queries.

    Always ensure ONLY BASE COLUMNS is explicitly specified to limit the amount of EIM processing.

    EIM does support /*+ APPEND */ mode – via the IFB parameter ORACLE INSERT APPEND = TRUE.

    Normally EIM scales pretty well, and the limiting factor should be the number of CPUs on the database server rather than the I/O subsystem.

    You can run multiple EIM tasks in parallel, each operating on a different batch range. I normally use batches of 10,000 records.
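
    Pulling those parameters together, a minimal .ifb sketch could look like this; the process, table and column names are only illustrative:

        [Siebel Interface Manager]
        USER NAME = SADMIN
        PASSWORD = *****
        PROCESS = Import Contacts

        [Import Contacts]
        TYPE = IMPORT
        BATCH = 1-100
        TABLE = EIM_CONTACT
        ONLY BASE TABLES = S_CONTACT
        ONLY BASE COLUMNS = S_CONTACT.LAST_NAME, S_CONTACT.FST_NAME
        USE INDEX HINTS = TRUE
        ORACLE INSERT APPEND = TRUE

    With batches of 10,000 records, BATCH = 1-100 covers a million rows; start several EIM tasks, each pointed at a different batch range.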

    There is a Siebel myth that EIM interface tables cannot be loaded with more than 250,000 records. This is (obviously) not true; we routinely load millions (usually all records) for each entity. Use direct-path SQL*Loader to load EIM interface tables, not row-at-a-time PL/SQL :-)
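
    A direct-path control file for that is short. The data file and column list below are hypothetical, while ROW_ID, IF_ROW_BATCH_NUM and IF_ROW_STAT are the standard EIM housekeeping columns:

        OPTIONS (DIRECT=TRUE, ROWS=50000)
        LOAD DATA
        INFILE 'contacts.dat'
        APPEND
        INTO TABLE EIM_CONTACT
        FIELDS TERMINATED BY '|'
        (ROW_ID,
         IF_ROW_BATCH_NUM,
         IF_ROW_STAT CONSTANT 'FOR_IMPORT',
         LAST_NAME,
         FST_NAME)

    Then run something like: sqlldr userid=siebel control=eim_contact.ctl log=eim_contact.log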

    Normally it's a good idea to separate EIM tables from Siebel base tables onto different LVs, disks and controllers.
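
    For example (tablespace name and path hypothetical), the interface tables can get their own tablespace on separate spindles:

        CREATE TABLESPACE eim_data
          DATAFILE '/u05/oradata/siebdb/eim_data01.dbf' SIZE 10G;
        ALTER TABLE eim_contact MOVE TABLESPACE eim_data;
        -- moving a table invalidates its indexes, so rebuild them
        ALTER INDEX eim_contact_u1 REBUILD TABLESPACE eim_data;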

    The optimal number of EIM tasks varies. Throughput for narrow tables with few dependencies is much higher than for complex objects (lots of columns, indexes and FK relationships) like Contact and Account.

    3 million records an hour isn't bad, but we have seen throughput approaching 22 million records per hour.

    HTH

    Andy
    ex-Siebel, now Oracle (Advanced Customer Services)

    Comment by nbrightside — June 4, 2008 @ 7:22 pm

  2. Andy, I am so glad you commented; I will be discussing all the items in your comment with our data migration team. But if possible, can you also share the documentation behind this information, for reference?

    Thank you! :)

    Comment by H.Tonguç Yılmaz — June 5, 2008 @ 4:34 am

  3. Most of this should be covered in the EIM Administration Guide and Performance Tuning volumes in Siebel Bookshelf.

    What won't be explicitly covered is dropping indexes on base tables. However, for an initial data migration into a scratch database, this is allowed.

    Siebel doesn't support dropping any of the 15,000 indexes, even if they are 'unused'. However, that restriction is for conventional OLTP usage once the application is deployed.

    There is more latitude when you are migrating data.

    Tip: Run 1,000 records through each EIM task, monitoring index usage to identify the unused indexes that are candidates to be dropped. Never, ever drop a unique index. Grab the DDL so you can easily reinstate the indexes when the load is complete. Don't use the Siebel DDLSYNC utility, which would do the job but is serial and slow. However, *do* run a DBCHCK afterwards to ensure that all required indexes have been re-created.
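
    The monitoring part of that tip is plain Oracle functionality; a sketch with an illustrative index name, and note that V$OBJECT_USAGE only lists indexes owned by the connected schema:

        -- Flag candidate indexes before pushing the 1,000-record sample through
        ALTER INDEX s_contact_m1 MONITORING USAGE;

        -- After the sample EIM run: anything still unused is a drop candidate
        SELECT index_name, used FROM v$object_usage;

        -- Capture the DDL first so the index can be reinstated after the load
        SELECT dbms_metadata.get_ddl('INDEX', 'S_CONTACT_M1') FROM dual;

        DROP INDEX s_contact_m1;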

    [ If you need to talk, feel free to email 'andycowl at gmail dot com' ]

    Comment by nbrightside — June 5, 2008 @ 6:37 pm

  4. There is a lot of documentation on Siebel upgrades; just search on oracle.com. In fact, the Siebel upgrade is automated and "database independent" out of the box, and at the same time it is highly customizable. Siebel Expert Services used to do it all the time. There are utilities like Upgrade Tuner that enable parallelism, exclude zero-row SQLs and do many other tricks. There are ways of dropping and inactivating indexes too. It was designed this way: each customer's data shape is different, so there is no one-size-fits-all solution available out of the box. For best results, customized upgrade tuning is necessary; you just need to know what to do and how to do it. This is why customers hire experts, right?

    Comment by paulblokhin — December 9, 2009 @ 7:53 pm

