Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

September 26, 2010

Getting started with the Oracle Communications Data Model option

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 5:42 am

Oracle has released OCDM as a new option. In short, the package comes with an industry-standard Telco data model together with prebuilt OWB intra-ETL routines, OLAP cubes, OBI EE dashboard reports, and data mining reports. So when you want to deliver a project with this option, what remains is to build the ETL that connects the source systems and to customize in line with your needs; OCDM is not closed to customization, you only take on the maintenance responsibility for those customizations in future releases.

Still, the license cost makes me think, because per CPU this solution lands at a minimum of roughly $120K even with the embedded licensing model (DB EE + Partitioning + OLAP + Mining + OCDM + OBI EE). So in a small 32-processor DWH environment, even if the cost drops to a fifth, the license amount alone climbs above $500K (and once you add hardware and consulting on top, finding customers will be genuinely hard outside Europe, in my opinion).

For potential Telco customers who want to acquire what we might group as "green field" DWH-BI capabilities, but cannot quite predict what exactly they will end up with, the prebuilt reports and mining outputs inside the product will make selling considerably easier. In addition, Oracle may provide significant support to its BI partners for co-selling and enriching the product around this new option. AVEA has reportedly been one of the important references in the EMEA region during the product's development. Finally, and this will sound a bit like a Turkish trick, drawing inspiration from the model's relationships in terms of metadata, and setting it up to experiment with, is free :) As always, Oracle has kept the documentation quality high; if the topic caught your interest, have a look at the links in the resources section.

KEY FEATURES
• More than 800 tables and 11,000 attributes
• More than 1,300 industry-specific measurements and key performance indicators (KPIs) such as network reliability statistics, provisioning metrics, and customer churn propensity
• Alignment with communications industry standards including the TM Forum Information Framework (SID), formerly known as the Shared Information Model
• Prebuilt OLAP cubes for extremely fast dimensional analysis of business information
• Embedded data mining models for sophisticated trending and predictive analysis
• A portfolio of more than 70 dashboards and reports

KEY BENEFITS
• Faster time to actionable information
• Integrates multiple lines of business including mobile, wireline, and cable
• Addresses key areas such as customer management, marketing segmentation, product development, and network health
• Fully aligned with the TM Forum Information Framework (SID) standard

RESOURCES
OTN OCDM sayfası
Oracle Communications Data Model Reference 11g Release 2 (11.2)
OCDM Installation Guide 11g Release 2 (11.2)
Oracle Communications Data Model – Overview
Oracle Communications Data Model (OCDM) Tutorial
Oracle Retail Data Model (ORDM)


August 23, 2010

Getting started with the Oracle Data Mining option

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 5:18 pm

Thanks for the interest you showed in my previous Turkish post; I hope TROUG's road is clear and this group makes an important contribution to all of us. Since I have already started, I wanted to keep writing in Turkish for a while; I hope you enjoy it :)

With this post I wanted to give friends who are curious about data mining an idea of where to find the resources they can use to get started. As you may know, we will have a presentation at OOW this year: for a while we have been running In-Database Data Mining trials at Turkcell, a long-time SAS user, and Deniz will be sharing the results; I wanted to advertise it in advance to those who are going.

TurkCell's Pre-paid Churn Prediction Model with Oracle Data Mining option on Exadata V2, OOW10, 21 Sep. 12:30-13:30, Moscone South

First, the question of why data mining at all may come to mind; taking this opportunity, I would like to share Burak's posts, written in fine Turkish, for possible answers. And if you are seriously interested in the topic, I definitely recommend going over the Oracle® Data Mining Concepts (11.2) guide.

One may also ask why data mining with Oracle, especially considering that this option sits on the list of extra-cost options on top of Enterprise Edition. Oracle's argument is that if the data you will work on already lives inside the database, starting and completing the mining work inside the database is a significant advantage. In this market the most important rival of Oracle and everyone else is SAS, and considering Oracle's position in the worldwide BI database market, you can immediately sense what a sharp competitive edge this approach gives Oracle. It is also worth noting at this point that SAS license costs are incomparably more expensive than the license costs of the Oracle Data Mining option.

If you are already an Oracle database user, 12 ready-made machine learning algorithms, plus 50+ statistical functions you can use at no extra charge in every edition including XE, are waiting to be discovered. In simple terms, you can think of a data mining model in Oracle as a schema object that is created from SQL*Plus with a few simple PL/SQL API calls and then invoked from SQL (I share an example below to give an idea). Naturally, while creating and running the model you can also take full advantage of every capability of the Oracle release at hand (Exadata, Parallel Query, Partitioning, Bitmap Indexes, and so on).


-- Fraud Prediction Demo (the NULL target below turns the SVM classifier
-- into a one-class anomaly detector)
drop table CLAIMS_SET;
exec dbms_data_mining.drop_model('CLAIMSMODEL');
create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
insert into CLAIMS_SET values ('PREP_AUTO','ON');
commit;
begin
   dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION',
                                 'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
end;
/

-- Top 5 most suspicious fraud policy holder claims
select * from (
   select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
          rank() over (order by prob_fraud desc) rnk from (
      select POLICYNUMBER, prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
      from CLAIMS
      where PASTNUMBEROFCLAIMS in ('2 to 4', 'more than 4')))
where rnk <= 5 order by percent_fraud desc;

The Oracle Data Mining option is made ready for you when you install your Enterprise Edition database; for quick setup information you can follow the Oracle By Example (OBE) Data Mining tutorials, which were recently moved onto Apex. Besides the tutorials, the installation steps can be followed in detail in the Oracle® Data Mining Administrator's Guide (11.2). And if you say "I cannot deal with these installations right now, but I am curious and would like to run a few rounds of experiments," a cloud along the lines of apex.oracle.com has also been thought of for this option: you can reach an Amazon Cloud image with Oracle 11gR2, the Data Mining option, the classic Oracle Data Miner user interface, and sample datasets (unfortunately it is not free like apex.oracle.com, though).

Oracle Data Miner (ODM), meanwhile, is a free GUI; you can download this application, which does not claim to be very polished in its current releases, from OTN and use it free of charge to ease your mining work. ODM's next-generation front end will be integrated into the new release of SQL Developer, which I expect will most likely become downloadable from OTN at this OOW; it is a release focused on competing with SAS. For screenshots have a look at the Oracle Data Mining Blog; to neutralize the GUI capabilities that SAS users cannot give up, you will see near-identical resemblances. On the same blog you can also find DBMS_DATA_MINING.IMPORT_MODEL examples that, with the new release, allow SAS models to be migrated to Oracle automatically; in short, the offensive is pretty solid :)
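To give an idea of the import side, here is a minimal sketch of a DBMS_DATA_MINING.IMPORT_MODEL call; the dump file name and directory object are hypothetical, and I assume a model dump produced beforehand with EXPORT_MODEL:

-- Hedged sketch: import a previously exported mining model dump
begin
   dbms_data_mining.import_model(
      filename     => 'claims_models.dmp',   -- hypothetical export dump file
      directory    => 'DM_DUMP_DIR',         -- hypothetical directory object
      model_filter => 'name = ''CLAIMSMODEL''' );
end;
/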

I think this much is enough for an introduction; for completeness, it is also worth mentioning the OTN Data Mining forum, where you can quickly get answers to your questions with a free membership whenever you run into a problem, and the Oracle® Data Mining Application Developer's Guide (11.2). If this post draws interest, I am thinking of extending it with examples based on that document.

Reference: Oracle Data Mining 11g Release 2 Overview and Demo presentation – Charlie Berger, Sr. Director Product Management at Oracle, Data Mining Technologies

Note: If your reaction is "I liked the topic, but what is there in open source rather than Oracle, something we could use with Java for a school project?", then I recommend taking a look at Weka.

June 19, 2010

Our OOW10 session proposal – Case Study: Prepaid churn model with Oracle Data Mining

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 9:48 am

If you like the content, just sign in with your free OTN account and click the vote button; your vote may make it possible for us to present, thanks :)

https://mix.oracle.com/oow10/proposals/10773

Subject: Case Study: Prepaid churn model with Oracle Data Mining

Abstract: This session is about a churn prediction study at Turkcell, the leading GSM operator in Turkey. The session will start with an overview of the churn prediction problem and the existing data mining system at Turkcell, so that the participants have some background on the churn problem and the current solution to it.

The session will also cover the advantages of in-database data mining and Oracle Data Mining. It will continue with the prepaid churn model that was created using Oracle Data Mining. Then the results and the performance we achieved with Oracle Data Mining will be discussed.

About the presenter: Necdet Deniz Halıcıoğlu; I work for Turkcell Technology in Turkey. Since I began to work for Turkcell, Oracle Database and Oracle products have become a part of my life. I have worked on ODS and data mining projects, and I have a particular interest in the churn prediction problem, with several case studies on it behind me. Data mining is getting more attractive to me every day, and I like it more every day.

May 22, 2010

Oracle Partitioning and Parallel Execution for Performance seminar by Tanel Poder

Filed under: Oracle 11g New Features,Oracle Business Intelligence,Oracle Seminars — H.Tonguç Yılmaz @ 2:44 pm

Tanel was our visitor last week, as I mentioned earlier; he gave a one-day seminar on partitioning and parallel execution. There were lots of discussions that were interesting to me, and I share some of my notes below just to give an idea of how Tanel's experience may affect your daily Oracle database administration or development routine.

Ways of detecting partition pruning other than the Pstart and Pstop columns in DBMS_XPLAN:
– Event 10128, semi-documented in Note 209070.1; create the partition pruning tracing table KPAP_PRUNING under your schema
– Sampling ROW_WAIT_OBJ# in V$SESSION with Snapper or ASH; you get the DBA_OBJECTS.DATA_OBJECT_ID values (a sampling sketch follows this list)
– Taking snapshots of V$SEGMENT_STATISTICS logical IO for each partition segment
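As a rough illustration of the sampling technique, a query along these lines shows which segment a session is touching right now; :sid is the session you are watching, and I am assuming, per the note above, that ROW_WAIT_OBJ# maps to DBA_OBJECTS.DATA_OBJECT_ID:

-- Sketch: sample the segment a running session is currently working on;
-- repeated sampling of a PQ session shows which partitions get visited
select o.owner, o.object_name, o.subobject_name
from v$session s, dba_objects o
where s.sid = :sid
and o.data_object_id = s.row_wait_obj#;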

Partitioning for OLTP DML Concurrency
– Partitioning can be used effectively to reduce concurrency on right-hand index blocks in OLTP (ever-increasing sequence- or timestamp-based columns)
– Hash subpartitioning for meaningless primary key columns; the kgghash() function used for row placement by hash is optimized for power-of-2 bucket counts (see the DDL sketch after this list)
– Partitioning for data load concurrency; the option below takes a TM lock only on the partition and allows multiple load operations on the same table
INSERT /*+ APPEND */ INTO target_table PARTITION(target_partition)
SELECT * FROM source_table;
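For the hash subpartitioning item, a minimal DDL sketch could look like this, with a power-of-2 subpartition count to play nicely with kgghash(); all table and column names are illustrative:

-- Sketch: range partitions with 16 (power of 2) hash subpartitions on the pkey
create table orders_p (
   order_id   number not null,
   order_date date   not null,
   payload    varchar2(100)
)
partition by range (order_date)
subpartition by hash (order_id) subpartitions 16
( partition p2010q1 values less than (date '2010-04-01'),
  partition p2010q2 values less than (date '2010-07-01') );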

Parallel query execution change in 10g
– In 9i the QC unparses the transformed query text and sends a copy to each slave for separate execution; in 10g+ every slave executes the same cursor

Parallel execution uses more resources
– Especially because of the PX communication overhead
– Additionally, a PX operation on a table requires a segment-level checkpoint
– With 11.2, in-memory parallel execution kicks in if PARALLEL_DEGREE_POLICY is set to AUTO (see the sketch after this list)
– So not everything should use parallelism and not every system is a candidate for parallelism, even in DWs (ETL metadata updates and the like)
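For the 11.2 item, the switch is a single parameter; a sketch:

-- 11.2: the default is MANUAL; AUTO enables automatic DOP, statement queuing
-- and in-memory parallel execution
alter system set parallel_degree_policy = auto;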

For parallel DML, always remember to enable parallel DML at the session level
– alter session enable parallel DML;
insert /*+ parallel(4) pq_distribute(eo none) */ into etl_orders eo
select * from tab;
– Query the PDML_ENABLED, PDML_STATUS, PDDL_STATUS, and PQ_STATUS columns of V$SESSION to check whether parallel DML is enabled for a session (see the query after this list)
– If the data read is highly skewed, different slaves produce wildly different row counts; you may want to distribute the rows for loading with the pq_distribute(eo random) hint
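A quick way to verify the session state, as a sketch:

alter session enable parallel dml;

-- check the current session's PX-related flags
select pdml_enabled, pdml_status, pddl_status, pq_status
from v$session
where sid = sys_context('userenv', 'sid');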

Major parallel execution bottlenecks
– Serially executed steps in PQ plans; check for P->S and S->P operations in the middle of the execution plans. The "IN-OUT" column is empty for serial operations in DBMS_XPLAN
– Too much communication between slaves; if possible replace a BROADCAST distribution with HASH or PART
– Skewed row distribution between producers and consumers; check V$PQ_TQSTAT to see whether some slaves produced/consumed significantly more rows than others in the slave set, and switch from HASH to BROADCAST or PARTITION distribution (a check query follows this list)
– Unbalanced hardware configuration, not enough IO bandwidth: the very common reason; query ASH or run Snapper on the PX slaves to measure the time waited for IO
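For the skew check, something along these lines works right after running the parallel statement, queried from the same session (V$PQ_TQSTAT only reflects your own session's last PQ executions):

-- Sketch: compare rows produced/consumed per slave across table queues
select dfo_number, tq_id, server_type, process, num_rows, bytes
from v$pq_tqstat
order by dfo_number, tq_id, server_type, num_rows desc;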

Parallel execution wait events and SQL trace
– Metalink Note 191103.1 for PQ wait events
– In 10g+, before running the tkprof utility use the trcsess utility to aggregate the multiple trace files produced for each slave

Parallel Execution and RAC
– In 10g set the INSTANCE_GROUPS parameter on each instance to assign the instances to groups
ALTER SYSTEM SET instance_groups = 'rac01','all','batch' ;
ALTER SYSTEM SET instance_groups = 'rac02','all','batch' ;
ALTER SYSTEM SET instance_groups = 'rac03','all','oltp' ;
ALTER SYSTEM SET instance_groups = 'rac04','all','oltp' ;
-- before running the PQ statement, set PARALLEL_INSTANCE_GROUP
ALTER SESSION SET parallel_instance_group = 'batch' ;
-- the slaves will run only on the instances associated with the 'batch' group
– In 11g PARALLEL_INSTANCE_GROUP again controls the allowed groups where the QC can schedule inter-instance PQ slaves, but this time slaves can only run on the instances that allow running the same "service" the QC connected through.
-- alternatively, use a service name as the group (the parameter takes a single value)
ALTER SESSION SET parallel_instance_group = 'service1' ;

Parallel Execution Overview
– PX COORDINATOR: the Query Coordinator (QC); distributes the work to the slaves and returns results back to the client. Additionally it performs work if something has to be done serially
– PX SEND QC: sends produced rows to the QC via a table queue
– PX BLOCK ITERATOR: the operation that picks different ranges of PX granules for different slaves to read
– Table Queues: in-memory communication channels between PX slaves; if SGA_TARGET is in use or PARALLEL_AUTOMATIC_TUNING is set to TRUE, a memory buffer inside the large pool
– QC(RAND): the row source is sending its rows back to the QC

Things to remember
– Watch out for accidentally enabled parallelism, especially on OLTP; after parallel index builds and rebuilds remember to ALTER INDEX ... NOPARALLEL (a sketch follows this list)
– The cost we see in DBMS_XPLAN does not include the PX distribution cost; however, the CBO does choose the PX distribution based on cost (a 10053 CBO trace file shows that the costing is done)
– Use the PARALLEL_MIN_PERCENT parameter to decide the minimum percentage of the requested Degree of Parallelism (DOP) a query must be able to allocate in order to run
– PARALLEL_EXECUTION_MESSAGE_SIZE defaults to 16KB as of 11.2
– Use ALTER SESSION FORCE PARALLEL QUERY PARALLEL n; to force session-level parallelism
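A small sketch for the first and last items (index name illustrative):

-- fast parallel rebuild, then reset the dictionary DOP so OLTP queries
-- do not accidentally pick up parallel plans
alter index orders_pk rebuild parallel 8 nologging;
alter index orders_pk noparallel;

-- force a fixed DOP for all queries in this session
alter session force parallel query parallel 4;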


ps: Check out Oracle Sweden User Group(ORCAN) Event presentations of Hüsnü

May 16, 2010

I may be back :)

Filed under: Oracle Business Intelligence,Oracle Events,Oracle Other,Quotes — H.Tonguç Yılmaz @ 1:25 pm

It has been a while again since my last blog, so I wanted to give some quick highlights.

– Turkcell bought a full-rack Sun Exadata V2 box; we are still in the test phase and we plan to migrate our 100+ TB DWH this August. This installation and migration is proving to be a major experience for both us and Oracle. My team has several customer-experience presentation proposals on ODI with Exadata and CEP for OOW and UKOUG, so I hope these proposals get accepted and we can meet to share the experiences in detail.

– We are working closely with the Oracle Data Mining product development team on the In-Database mining options on 11.2 and Exadata. Automated SAS model migration is an important development. We will be in Krakow, Poland for Oracle's DWH Global Leaders event to meet with the product development team.

– OracleTurk is now 11 years old, and with each year the forum activity gets more mature. Last year I switched to read-only mode after a warning from one of my close friends: he told me to give the new people a chance to answer, since for years I had been mad about answering every question on this forum. It was very hard for me, I must admit, but now I see the warning was very helpful, since lots of new faces are now around the forum :)

– Tanel and Tom will be in Istanbul this week; I will attend Tanel's session at my company. If I can, I will try to share my seminar notes like I did in the old days.

– My friend and colleague Hüsnü, the first Turkish Oracle ACE Director and Oracle DBA of the Year 2009, has quit his job at Turkcell and started a new journey. I am very sad about this decision on one hand, since now we are apart, but happy for Hüsnü, since I am pretty sure he will face and win this new challenge.

– Recently I saw that one of my Turkish DBA friends, Kamran, has been awarded the Oracle ACE title. I am very happy to see young, passionate, and hard-working people achieve this kind of success and become another inspiration for the newbies.

– It is my 8th month as the team leader of the BI development team at Turkcell group. Leading my team is very exciting; I am also involved in lots of non-Oracle projects, like developing a Turkish text mining and web crawler product, an AbInitio Linux migration, developing a Java-based ETL product, and so on. I have the privilege to work with very talented young software engineers, so leading my team has been a great joy and experience so far. Whereas being managed, as a manager myself, is something I am still challenged by; I have not gotten used to the feeling yet. Anyway, as Lance Armstrong once said, "Pain is temporary, but quitting lasts forever," so I continue to fight for what I believe in.

ps: I chose this title for this post as a nod to my last comeback; I hope this blog makes a similar comeback :)

February 18, 2010

Aggregated available blogs and forums on Oracle’s Integration Technologies

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 2:06 pm

Today is my 33rd birthday and I just wanted to make a quick update that I am still alive :)

Here is a link I received from Oracle today which aggregates available blogs and forums on Oracle’s Integration Technologies; Oracle Data Integrator, Oracle Warehouse Builder and Oracle Goldengate: http://www.netvibes.com/01badger#General

I hope you like it as much as I do. As for blogging and forums, I don't think I will be able to blog like in the old days anymore; the hours are really not enough to keep up with the management issues that come with my new responsibility. But things can always change, as we all know :)

Until next time I hope to see you around at my friendfeed.

September 7, 2009

PPT: Data Warehousing with Oracle Database 11g Release 2

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 8:10 am

There are several exciting new features in 11.2 on BI topics. But it is sad to have seen the hybrid columnar compression new feature in beta and to now learn that it is only available with Exadata: ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
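For reference, the 11.2 syntax that raises this error off Exadata storage is the new compression clause; a sketch with illustrative table names:

-- works only in tablespaces on Exadata storage, otherwise ORA-64307
create table sales_hcc compress for query high
as select * from sales;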

Still, there are very interesting new features, especially related to PQ, ASM, and Partitioning; here is a summary presentation you may want to check.

ps: Also here you can find some 11.1 new features summary presentations.

August 25, 2009

Materialized views and sys.sumdelta$ UPDATE-DELETE performance

Filed under: Oracle Business Intelligence,Oracle Performance — H.Tonguç Yılmaz @ 5:08 pm

On our ODS environment (HP-UX IA 64-bit & Oracle Database EE 11.1.0.7.0 64-bit) we have several materialized views whose base tables are loaded with direct path insert (nologging segments + APPEND hint), and these materialized views are refreshed with these options:


dbms_mview.refresh( v_schema_name || '.' || v_mv_name,
		method         => 'C',     -- complete refresh
		atomic_refresh => FALSE,   -- allows truncate + direct path insert
		parallelism    => pin_parallelism );

Lately I started to see the sys.sumdelta$ table and its index in our hottest-segments list in AWR reports and v$segstat.


Segments by DB Blocks Changes DB/Inst: ODS/ods Snaps: 9159-9166
-> % of Capture shows % of DB Block Changes for each top segment compared
-> with total DB Block Changes for all segments captured by the Snapshot

Tablespace Subobject Obj. DB Block % of
Owner Name Object Name Name Type Changes Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
SYS SYSTEM I_SUMDELTA$ INDEX 4,386,512 21.95
SYS SYSTEM SUMDELTA$ TABLE 2,921,600 14.62
ODS SCRATCH_BT I_SNAP$_SBL_CONTACT INDEX 2,555,296 12.78
ODS SCRATCH_BT I_SNAP$_SBL_PARTY_PE INDEX 1,994,592 9.98
ODS SCRATCH_BT I_SNAP$_MV_SBL_ACCOU INDEX 1,924,304 9.63
------------------------------------------------------------- 

Segments by Buffer Busy Waits DB/Inst: ODS/ods Snaps: 9159-9166
-> % of Capture shows % of Buffer Busy Waits for each top segment compared
-> with total Buffer Busy Waits for all segments captured by the Snapshot

Buffer
Tablespace Subobject Obj. Busy % of
Owner Name Object Name Name Type Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
SYS SYSTEM SUMDELTA$ TABLE 128,275 49.75
SYS SYSTEM I_SUMDELTA$ INDEX 86,186 33.43
SYS SYSTEM JOB$ TABLE 11,130 4.32
ODS SCRATCH_BT I_SNAP$_SBL_CONTACT INDEX 8,112 3.15
ODS SCRATCH_BT I_SNAP$_MV_SBL_ACCOU INDEX 3,968 1.54
-------------------------------------------------------------

This table was getting larger and larger after each base table load for each materialized view, and the two queries below were being called recursively each time:


SQL> select sum(bytes)/(1024*1024) MB from dba_segments
2 where owner = 'SYS' and segment_name = upper('sumdelta$') ;

MB
----------
811

SQL> select count(*) from sys.sumdelta$ t ;

COUNT(*)
----------
9345556

SQL> SELECT plan_table_output
2 FROM TABLE(dbms_xplan.display_cursor('9sbsudzr87ztj')) ;

-------------------------------------
update sys.sumdelta$ s set s.timestamp = :1, s.scn = :2 where
s.tableobj# = :3 and s.timestamp >=
to_date('4000-01-01:00:00:00','YYYY-MM-DD:HH24:MI:SS')
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time
--------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 3 (100)|
| 1 | UPDATE | SUMDELTA$ | | | |
|* 2 | INDEX RANGE SCAN| I_SUMDELTA$ | 15758 | 1107K| 2 (0)|00:00:01
--------------------------------------------------------------------------------

SQL> SELECT plan_table_output
2 FROM TABLE(dbms_xplan.display_cursor('9wb7u7x708yvr')) ;

-------------------------------------
DELETE FROM sys.sumdelta$ sd WHERE sd.tableobj# = :1 AND sd.timestamp < :2
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time
--------------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | | | 6189 (100)|
| 1 | DELETE | SUMDELTA$ | | | |
|* 2 | INDEX RANGE SCAN| I_SUMDELTA$ | 1669 | 98471 | 6189 (1)|00:01:27
--------------------------------------------------------------------------------

After a small investigation (the execution plans showed that both queries used dynamic sampling), I found that this SYS table is excluded from the gather_stats_job. Since the expected behavior is that the number of rows in this dictionary table should decrease after refreshes, we decided to open an SR. After some more tests I saw that neither fast refresh nor complete refresh made any difference in sumdelta$'s row count in our environment.

For the SR we provided the usual 10046 aspirin, and we saw that in all the trace files produced by the 10046 event (since the refresh runs in parallel, there are both QC and slave trace files) there was no sumdelta$ reference at all. This led us to believe that this dictionary table not getting purged may be a bug.


ALTER SESSION ENABLE PARALLEL DDL ;
ALTER SESSION ENABLE PARALLEL DML ;

-- create table sr_test nologging parallel 8 as select * from src_siebel.s_contact where 1=2;
-- create materialized view mv_sr_test enable query rewrite AS SELECT EMP_NUM, CREATED FROM sr_test ;

Select /*+ parallel(sd,8) parallel_index(sd,8) */ count(*) from sys.sumdelta$ sd ;

COUNT(*)
----------
2601080

insert /*+ append */ into sr_test
select * from src_siebel.s_contact where rownum <= ... ;  -- the row limit was lost to HTML escaping in the original post

exec dbms_mview.refresh('mv_sr_test', method => 'C', parallelism => 8, atomic_refresh => false);

PL/SQL procedure successfully completed

commit ;

select sysdate, user from dual;
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

Select /*+ parallel(sd,8) parallel_index(sd,8) */ count(*) from sys.sumdelta$ sd ;

COUNT(*)
----------
2601188

We started to look for some immediate workaround options. Dropping and recreating the materialized views did not reduce the number of rows in the dictionary table, so we decided to stop using materialized views and replace them with CREATE TABLE AS SELECT statements. We started with the longest-refreshing ones, and 3-4 hour refresh times dropped to minutes with this workaround. But of course losing query rewrite was the cost of this action; we needed to go over all report queries and change them to use the new tables.

We also tried truncating the dictionary table, first on dummy clone databases and then on our development environment; so far no problems have been experienced, so this truncate can also be scheduled weekly, for example. As a result, until this bug is fixed we feel lucky to have at least two different workaround options; sometimes things get much uglier :)
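For completeness, a hedged sketch of how such a weekly truncate could be scheduled (the job name is made up, and again, we only dared this after testing on clones and development, since truncating a SYS dictionary table is not something to do casually):

begin
   dbms_scheduler.create_job(
      job_name        => 'PURGE_SUMDELTA_WEEKLY',  -- hypothetical job name
      job_type        => 'PLSQL_BLOCK',
      job_action      => 'begin execute immediate ''truncate table sys.sumdelta$''; end;',
      repeat_interval => 'FREQ=WEEKLY;BYDAY=SUN',
      enabled         => TRUE );
end;
/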

July 3, 2009

Interval Partitioning New Feature and Logging Option Problem

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 4:51 am

One of my old friends, who has been an Oracle DBA for at least 15 years, always argued with me that "every new feature of Oracle is an evil!" :) He advised that it is not safe to upgrade a production environment to 10gR2, for example, unless 11gR2 has been released on your platform. Personally, I have always loved new features and tried to test them as soon as possible. I accept, and also experience, that there are always bugs around new features, especially when they are combined with some other feature, but this is why we have Oracle Support.

Recently, on our 11.1 ODS production database, we saw that the automatically created partitions inherit the LOGGING property of neither the table nor the tablespace they belong to; they are always created with LOGGING enabled. If you are trying to benefit from direct path inserts while your database runs in ARCHIVELOG mode and the target segments are LOGGING, you will fail to benefit and will produce lots of redo and undo information, depending on the size of the data you are inserting. So an SR was opened and the bug escalated; a typical process.
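A minimal sketch that reproduces the complaint (table and column names illustrative):

-- NOLOGGING interval-partitioned table
create table sales_i ( sale_date date, amount number ) nologging
partition by range (sale_date) interval (numtoyminterval(1,'MONTH'))
( partition p0 values less than (date '2009-01-01') );

-- this triggers automatic creation of a new interval partition
insert into sales_i values (date '2009-06-15', 100);
commit;

-- the automatically created partition comes back with LOGGING enabled
select partition_name, logging from user_tab_partitions
where table_name = 'SALES_I';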

Until 11.1 we had our own interval partitioning maintenance scripts, and we still stick to them, especially because of the automatically created partitions' naming. This is something I think will evolve in future releases, since anybody who uses the partitioning feature loves to name their partitions and is not happy with anonymous partition names. But there is no evil in this; it is the life cycle of any software, I think: a new feature comes out, some people test and consume it and find bugs or report additional feature requests, and the producer handles these in future releases or patches.

ps: for exchanging anonymously named partitions you may use the EXCHANGE PARTITION FOR clause as a workaround, though still with several limitations; you may check this post for more details.
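A sketch of that clause, reusing the illustrative table above and a hypothetical staging table:

-- address the anonymous interval partition by a value it contains
alter table sales_i exchange partition for (date '2009-06-15')
with table sales_stage including indexes without validation;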

April 3, 2009

Getting Started and Extending your Knowledge with Oracle Warehouse Builder 11g

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 1:08 pm

I already mentioned here and here that I love Apex and OWB, since they are Oracle-database-integrated development environments. This is simply because I am an old-school SQL and PL/SQL minded person, and these two development environments make you feel more powerful, which is, IMHO, a really cool feeling when chatting with the Java/SOA guys :)

I have been using OWB 11g for some time now, and I wanted to prepare a getting-started material to assist my colleagues. Apex has a two-day developer guide for this purpose, and OWB has its fantastic OTN OBEs.

If you are already familiar with BI fundamentals and Oracle database fundamentals, you can go through these 16 OWB 11g OBEs within a day or two, and then I am pretty sure you will get the feeling I mentioned. As you may know, as of 11gR1 both OWB and Apex are part of the standard database installation, so if you have an 11g installation you already have an OWB practice environment; following the first OBE will be enough to initialize OWB inside your 11g installation.

These OBEs include not only typical practices but also information on the Warehouse Builder 11g architecture and components, and BI concepts like star schemas, snowflake schemas, and slowly changing dimensions. Here is also an additional viewlet that may give you an idea about OWB. Remember that whatever actions you design inside your mappings, they all deploy as standard PL/SQL packages; these old friends are the ETL executables for each of your BI needs. And OWB is a cost-free (database-license-included) option and a highly optimized ETL tool that leverages Oracle's SQL capabilities.

And since some readers may be thinking that, because Oracle bought ODI, OWB will be dead soon: no, this is not true, and you may read the future strategy on OWB's corporate blog here. To my understanding, OWB and ODI will be parts of a combined BI solution strategy, like coffee and milk together, so this is not a coffee-or-tea choice :)

