Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

January 30, 2008

Oracle OLAP features performance testing Part 2

Filed under: Oracle Business Intelligence,Oracle Performance — H.Tonguç Yılmaz @ 2:06 pm

In Part 1 I shared our testing environment setup, read performance tests and their results and some important resources for data warehousing with Oracle. Before continuing with the write cost of the options in this post, let me mention two things; first I am sorry that I can not share all details of the table structures and queries etc. in these posts, so please accept these posts as non-technical. Second, I think in both worlds, OLAP or OLTP doesn’t matter, first thing to do is to govern your end users needs and wants, if you let them govern you then the result will be most probably a very complex(flexible they will call:) model which will inevitably end up with of course lots of challenging performance issues. When it comes to OLAP this complexity at the transformation step of your ETL may cost you a lot, especially if we are talking about some XX TBs of daily change.

During this post I will be using “BI” for Bitmap Index, “MV” for Materialized View and “BJI” for Bitmap Join Index.

a. Cost of Bitmap Indexes

We planned seven BIs on our main fact table for star transformation tests with different kinds of queries having different filtering levels. During below tests there were no MV logs or BJIs in the test schema;

Listing 192a – Cost of Bitmap Indexes

-- a simple local bitmap index creation template
CREATE BITMAP INDEX BI.. NOLOGGING PARALLEL COMPUTE STATISTICS LOCAL ;

-- %2 and %10 sampled data inserted to simulate ETL cost
create table FACT_SMP2 nologging parallel pctused 99 pctfree 1 as
select /*+ PARALLEL */ * from FACT SAMPLE(2);

insert /*+ append parallel(a) */ into FACT a
select /*+ PARALLEL(b) */ * from FACT_SMP2 b ;
commit ;

b. Cost of Materialized Views

For our reporting needs we created three different kinds of summaries. And during refresh tests;
- all BIs were dropped,
- all MVs were created on a single partition of the main fact because of the test system’s limited resources,
- because of our ETL character we tested %2(2,642,727 rows) and %10(13,210,698 rows) sampling inserts before MVs incremental refreshes

Listing 192b – Cost of Materialized Views

alter session set query_rewrite_enabled=true;
alter session set query_rewrite_integrity=enforced;

-- a simple template of MV log for incremental refresh of MVs
DROP MATERIALIZED VIEW LOG ON FACT ;
CREATE MATERIALIZED VIEW LOG ON FACT
WITH ROWID, SEQUENCE(all_fact_columns_joined_with_dimensions, ..)
INCLUDING NEW VALUES;

-- a simple template to create a MV
DROP MATERIALIZED VIEW MV ;
CREATE MATERIALIZED VIEW MV
PCTFREE 1 PCTUSED 99 INITRANS 7 MAXTRANS 255
STORAGE ( INITIAL 50M NEXT 50M MINEXTENTS 5 MAXEXTENTS 512 PCTINCREASE 0 FREELISTS 7)
NOLOGGING PARALLEL
BUILD DEFERRED
REFRESH FORCE ON DEMAND
WITH ROWID
ENABLE QUERY REWRITE AS
SELECT ...

-- a simple template to FULL refresh of MV and gather statistics
exec dbms_mview.refresh('MV', method => 'C', parallelism => 4, atomic_refresh=>false);
exec dbms_stats.gather_table_stats( USER, tabname=>'MV', degree=>4, estimate_percent=>dbms_stats.auto_sample_size,
cascade=>TRUE, method_opt=>'FOR ALL COLUMNS SIZE AUTO', granularity=>'ALL');
commit ;

-- a simple template to INCREMENTAL refresh of MV and gather statistics
exec dbms_mview.refresh('MV', method => 'F', parallelism => 4);
exec dbms_stats.gather_table_stats( USER, tabname=>'MV', degree=>4, estimate_percent=>dbms_stats.auto_sample_size,
cascade=>TRUE, method_opt=>'FOR ALL COLUMNS SIZE AUTO', granularity=>'ALL');
commit ;

MVs bring important performance costs in order to be kept fresh and compared to star transformation with BIs MVs read performance advantage, this cost may be something to be avoided.

c. Cost of Bitmap Join Indexes

BJIs advantages and disadvantages are summarized like below in the documentation;

Advantages –
Good performance for join queries, space efficient
Especially useful for large dimension tables in star schemas

Disadvantages –
Maintenance costs are higher, building or refreshing a bitmap join index requires a join
More indexes are required, up to one index per dimension-table column rather than one index per dimension table is required

During our tests for reporting needs appropriate BJIs were nearly impossible to create. Additionally to create a BJI you need to have a primary or unique key constraint on the referenced table.

-- BJI requires a PRIMARY or UNIQUE key constraint
CREATE UNIQUE INDEX UI.. ON REFERENCED_DIMENSION(..) NOLOGGING PARALLEL COMPUTE STATISTICS ;   

-- will use above index by default
ALTER TABLE REFERENCED_DIMENSION
ADD CONSTRAINT PK_REFERENCED_DIMENSION
PRIMARY KEY (..) NOLOGGING PARALLEL ;

-- a simple BJI creation template
CREATE BITMAP INDEX BJI..
ON FACT(columns_will_be_used_on_referenced_dimension, ..)
FROM FACT, REFERENCED_DIMENSION
WHERE ..
NOLOGGING PARALLEL COMPUTE STATISTICS LOCAL ;

BJI 1 creation time and size
	Elapsed: 04:56:25.04
	15,625 GB

BJI 2(an alternative for BJI 1) creation time and size
	Elapsed: 04:59:28.05
	15,625 GB

For this series the question was when to choose which Oracle OLAP option;
- Star Transformation with Bitmap Indexes or
- Summary Management with Materialized Views or
- Bitmap Join Indexes or
- of course full and parallel partitioned table scans with hash joins

I may easily say after ending the first round of the performance tests on the test system it will all depend on your own reporting needs(read) and ETL character(write). So you need to test both read and write ends and decide for each condition’s best fit. Usually you will end up with a mixture of these solutions I guess. And remember that some of the performance gains may be so attractive that these results might guide you to modify your model, so it is always better to have this kind of performance test results during the development phase.

Last words goes to 10G Release 2(10.2.0.3) ; you have my word on that all mentioned options are very stable, so they really worth your testing efforts on this release and truncate all your past experiences :)

This one is for old C Lovers

Filed under: Oracle Other — H.Tonguç Yılmaz @ 6:46 am

I was reading Bilal Hatipoğlu‘s

Whence C? Why C? Whither C?

Differences between C and C++

posts and I felt how I loved and I am missing programming with C.

Nowadays PL/SQL is my favorite development environment but one can never forget his first love right. Also I believe C is still the only platform you may choose when you decide to implement a million dolar software :)

I will be attending a presentation by Ogün Heper on 11g Oracle Call Interface(OCI ) New Features next weekend, so I plan to blog a little more later.

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 76 other followers