Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

March 28, 2008

Number data type formatting with SQL*Plus – numwidth hint

Filed under: Oracle How To — H.Tonguç Yılmaz @ 7:23 am

After all these years I still love seeing that I learn lots of new things each day. This is what keeps you fit and motivates you to go deeper with Oracle, I guess. In my opinion it is especially the small things, the ones you forget or somehow never got around to reading about, that make the difference; here is today's example :)

-- until this morning I was also using column formatting for large numbers like this
drop table tt purge;
create table tt (col1 number(18), col2 number(18));

insert into tt values (123456789123456789,123456789123456789);

select * from tt;

      COL1       COL2
---------- ----------
1,2346E+17 1,2346E+17

col col1 for 999999999999999999.9999999999
select * from tt;

                          COL1       COL2
------------------------------ ----------
 123456789123456789.0000000000 1,2346E+17

-- but this morning an OTN forum thread reminded me of SQL*Plus's numwidth option :)
drop table t purge;
create table t (col1 number(18));

insert into t values (123456789123456789);

show numwidth
numwidth 10
select * from t;

      COL1
----------
1,2346E+17

set numwidth 18
select * from t;

              COL1
------------------
123456789123456789
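
As a side note, SQL*Plus also has a SET NUMFORMAT option if what you want is a default display format for all number columns rather than a per-column COLUMN command; a minimal sketch, reusing the table above;

set numformat 999999999999999999
select * from t;
-- the full value should now be displayed without a per-column format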

Thanks to Eduardo Legatti for this hint.

March 26, 2008

Graham Wood and Jonathan Lewis will be in İstanbul next month!

Filed under: Oracle Seminars — H.Tonguç Yılmaz @ 3:39 pm

I saw Marco Gralike’s announcement, so I also wanted to make a similar one. But this time Graham Wood will visit my company, and we plan to spend half a day together with him on the 21st of April.

Also, on the 22nd of April, Jonathan Lewis will be in İstanbul, this time for his three-day seminar. You can find the details of this announcement here. Below are my favorite links to summaries of Lewis’s previous seminars;

https://tonguc.wordpress.com/2007/03/01/another-great-undocumented-oracle-hint/

http://orajourn.blogspot.com/2007/11/day-1-with-jonathan-lewis-execution.html

http://orajourn.blogspot.com/2007/11/day-2-with-jonathan-lewis-writing.html

http://orajourn.blogspot.com/2007/11/day-3-understanding-and-assisting-cbo.html

March 25, 2008

How to get cumulative logons and sessions highwater mark for an Oracle database?

Filed under: Oracle How To — H.Tonguç Yılmaz @ 12:30 pm

It is always good to have an RDA, Statspack or AWR report to let a database introduce itself to you :)

Of course you will usually need much more information than this, especially about the applications developed on that database.

This morning one of my colleagues asked how he could get the cumulative logons and the sessions highwater mark for an Oracle database, so the queries below may help;

set linesize 1000
column name format a25
column num_cpus format a10
column db_block_size format a15
column requests_for_data format 999999999999999999

SELECT to_char(open_time,
               'DD-MON-YYYY HH24:MI:SS') last_startup,
       SYSDATE - open_time days_passed_from_last_startup,
       vp1.VALUE num_cpus,
       vp2.VALUE db_block_size
  FROM v$thread, v$system_parameter vp1, v$system_parameter vp2
 WHERE instance = 'bscsdb1'
   AND vp1.NAME = 'cpu_count'
   AND vp2.NAME = 'db_block_size';

LAST_STARTUP         DAYS_PASSED_FROM_LAST_STARTUP NUM_CPUS   DB_BLOCK_SIZE
-------------------- ----------------------------- ---------- ---------------
25-MAR-2008 03:24:20                    ,459479167 64         8192

SELECT MAX(d.NAME) db_name,
       SUM(s.VALUE / (24 * 60 * 60 * (SYSDATE - t.open_time)) + 0.001) tps
  FROM v$database d, v$sysstat s, v$thread t
 WHERE s.NAME IN ('user commits', 'transaction rollbacks');

DB_NAME          TPS
--------- ----------
BSCS      151,125446

SELECT SUM(s.VALUE) requests_for_data
  FROM v$sysstat s
 WHERE s.NAME IN ('db block gets', 'consistent gets');

  REQUESTS_FOR_DATA
-------------------
        13120866284

SELECT SUM(s.VALUE) db_time FROM v$sysstat s WHERE s.NAME IN ('DB time');

   DB_TIME
----------
 639913103

SELECT sessions_highwater FROM v$license;

SESSIONS_HIGHWATER
------------------
              5581

SELECT NAME, VALUE FROM v$sysstat WHERE NAME LIKE '%logon%';

NAME                           VALUE
------------------------- ----------
logons cumulative            1102893
logons current                  5434

March 21, 2008

Oracle Change Data Capture under magnifier – Async CDC with 10gR2

Filed under: Oracle How To,Oracle Information Integration — H.Tonguç Yılmaz @ 1:02 pm

CDC is perceived like aspirin in my organization, so why is that? Simply put, the need here is the sharing and integration of information among a multitude of applications and databases, and CDC makes this data flow. Another variation of this need: instead of taking full extracts and calculating the changes during the data warehouse's ETL process, CDC enables you to extract just the changes from the source databases. I guess this is why CDC is Chapter 16 of the Data Warehousing Guide :)

(Figure: cdc-for-dwh.png – CDC for data warehousing)

Sync methods, like triggers or materialized view logs, for capturing the changes on tables have important performance effects on the source systems. Below is a simple Sync CDC demo with a Materialized View Log.

Code Listing 211a – Sync CDC demo with Materialized View Log
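
For readers who cannot reach the listing, here is a minimal sketch of the same idea; the table and column names are hypothetical, and the MLOG$_ columns queried at the end are the standard change-record columns of a primary key materialized view log;

drop table dept_src purge;
create table dept_src (deptno number primary key, dname varchar2(14));

-- the MV log is the sync capture mechanism: every DML on dept_src is recorded into it
create materialized view log on dept_src
  with primary key, sequence (dname) including new values;

insert into dept_src values (10, 'ACCOUNTING');
update dept_src set dname = 'FINANCE' where deptno = 10;
commit;

-- the captured changes; dmltype$$ and old_new$$ show the operation and the row version
select deptno, dname, dmltype$$, old_new$$, sequence$$ from mlog$_dept_src;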

Oracle Streams (LogMiner + Advanced Queues (AQ)) is the technology behind Async CDC. It automatically;
- captures database changes at a source database,
- stages the changes,
- propagates the changes to one or more destination databases,
- and then applies the changes at the destination database(s).

All of these are handled, including both DML and DDL changes. And Oracle Streams integration can be;
- Within an Oracle database,
- Between two Oracle databases,
- Among multiple Oracle databases,
- Between an Oracle database and a non-Oracle database.

In Oracle, the committed changes to data that take place in a database are recorded in the database redo log files. The Streams capture process extracts these changes from the redo log files (or archived log files) and formats each change into a logical change record (LCR). The LCRs are then stored (staged) in a queue. Next, Streams propagates the LCRs from one queue (the producer queue) to another (the consumer queue) automatically, and you can then apply (or consume) the LCRs from the consumer queue to the destination database with custom PL/SQL applications, depending on your needs.

As of 10g Release 2, Asynchronous Distributed CDC can be used against Oracle 9i Release 2 and higher sources, and there is enhanced CDC PL/SQL control API (DBMS_CDC_SUBSCRIBE, DBMS_CDC_PUBLISH) support for Async CDC setup and configuration. We have two options with 10gR2;

Option 1 – Async Distributed Hotlog CDC Flow
a. Database objects record the changes
b. LGWR logs the changes to the online redo log
c. The capture process enqueues LCRs on the local Streams queue
d. The LCRs are propagated to the target Streams queue
e. The LCRs are dequeued from the target Streams queue and the changes are applied to the target database objects

Option 2 – Async Autolog CDC Flow
a. Database objects record the changes
b. LGWR logs the changes to the online redo log
c. ARCn writes the redo data to a local archived log file
d. The archived log file is copied to the target
e. The capture process enqueues LCRs on the target Streams queue from the transferred archived log file
f. The LCRs are dequeued from the target Streams queue and the changes are applied to the target database objects

Important prerequisites for Async CDC are;
a. The source database must be in ARCHIVELOG mode,
b. the captured source tables must be altered to and kept LOGGING, or FORCE LOGGING must be enabled at the database level to cover NOLOGGING operations; FORCE LOGGING at the tablespace level may be preferred in some cases,
c. supplemental logging must be enabled at the database level or table level;

-- To enable minimal supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
-- other supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (UNIQUE) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (FOREIGN KEY) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
-- to drop supplemental logging
ALTER DATABASE DROP SUPPLEMENTAL LOG DATA;
-- to check the current supplemental logging settings
SELECT supplemental_log_data_min, supplemental_log_data_pk,
supplemental_log_data_ui, supplemental_log_data_fk, supplemental_log_data_all
FROM v$database;

d. Async CDC requires at least version 9.2.0.6 on the source database,
e. the Autolog Async option, which reduces the mining and queuing performance impact on the source, requires the Oracle and OS versions to be the same,
f. some init.ora settings are mandatory, and GLOBAL_NAMES=TRUE is advised by the documentation. But most probably this setting is not acceptable at the source, since afterwards you need to make sure that the name of each database link matches the name of the target database exactly. As a workaround, a database logon trigger can be used;

CREATE OR REPLACE TRIGGER cdc_trig
  AFTER LOGON ON DATABASE
DECLARE
  sqlstr VARCHAR2(200) := 'ALTER SESSION SET GLOBAL_NAMES=TRUE';
BEGIN
  -- limit the setting to the CDC related users only
  IF USER IN ('USR1') THEN
    EXECUTE IMMEDIATE sqlstr;
  END IF;
END cdc_trig;
/

Based on Mark Van de Wiel’s great Async CDC Cookbook, here is the Async Distributed Hotlog CDC option in action, between HP-UX B.11.23 with 10g Enterprise Edition Release 10.2.0.2.0 – 64bit as the source and SunOS 5.10 with 10g Enterprise Edition Release 10.2.0.3.0 – 64bit as the target.

Code Listing 211b – Async Distributed Hotlog CDC option in action

Before my final words, here is a brief summary of the above demonstration, followed by a hedged sketch of the setup calls;

PRE SETUP STEPS -
A – required database parameters are set on both databases,
B – cdc admin users are created with requested grants on both databases,
C – dblinks are created on both databases,
D – force logging and supplemental logging are enabled at the source only.

SETUP STEPS -
A – create_hotlog_change_source step at target,
B – create_change_set step at target,
C – create_change_table step at target,
D – alter_change_set step at target,
E – alter_hotlog_change_source step at target,
F – create_subscription step at target.

POST SETUP -
A – change test at source,
B – capture control at target.
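
To make the SETUP STEPS above concrete, below is a sketch of the corresponding DBMS_CDC_PUBLISH / DBMS_CDC_SUBSCRIBE calls on the target, modeled on the 10gR2 documentation; all the names (EMP_SRC, EMP_SET, CDCADMIN, SOURCE_DB, SCOTT.EMP) are hypothetical placeholders, and the cookbook linked above remains the tested reference;

BEGIN
  -- A: the change source points at the source database
  DBMS_CDC_PUBLISH.CREATE_HOTLOG_CHANGE_SOURCE(
    change_source_name => 'EMP_SRC',
    description        => 'distributed hotlog source',
    source_database    => 'SOURCE_DB');
  -- B: the change set groups change tables under the source
  DBMS_CDC_PUBLISH.CREATE_CHANGE_SET(
    change_set_name    => 'EMP_SET',
    description        => 'demo change set',
    change_source_name => 'EMP_SRC',
    stop_on_ddl        => 'n');
  -- C: the change table receives the captured changes
  DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(
    owner             => 'CDCADMIN',
    change_table_name => 'EMP_CT',
    change_set_name   => 'EMP_SET',
    source_schema     => 'SCOTT',
    source_table      => 'EMP',
    column_type_list  => 'EMPNO NUMBER(4), ENAME VARCHAR2(10), SAL NUMBER(7,2)',
    capture_values    => 'both',
    rs_id => 'y', row_id => 'n', user_id => 'n', timestamp => 'n',
    object_id => 'n', source_colmap => 'n', target_colmap => 'y',
    options_string    => NULL);
  -- D and E: enable capture on the change set and the change source
  DBMS_CDC_PUBLISH.ALTER_CHANGE_SET(change_set_name => 'EMP_SET', enable_capture => 'y');
  DBMS_CDC_PUBLISH.ALTER_HOTLOG_CHANGE_SOURCE(change_source_name => 'EMP_SRC', enable_source => 'y');
END;
/

-- F: the subscriber side
BEGIN
  DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION(
    change_set_name   => 'EMP_SET',
    description       => 'demo subscription',
    subscription_name => 'EMP_SUB');
  DBMS_CDC_SUBSCRIBE.SUBSCRIBE(
    subscription_name => 'EMP_SUB',
    source_schema     => 'SCOTT',
    source_table      => 'EMP',
    column_list       => 'EMPNO, ENAME, SAL',
    subscriber_view   => 'EMP_SUB_VW');
  DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(subscription_name => 'EMP_SUB');
END;
/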

And my final notes are as follows;
a. CDC performance and restrictions are mostly related to the Streams feature,
b. direct path loads and NOLOGGING statements on the operational source database are a threat, although these are usually rare in a traditional OLTP environment,
c. there are documented bugs and patches on Metalink, so testing Async CDC for your own needs is highly recommended;
Bug 4285404 – Streams propagation from 9206
Bug 4649767 – High system commit numbers (SCNs)
Bug 5024710 – Queue tables in System Tablespace
d. do not throw a mountain at a scorpion :) If you are able to capture change information easily through your application (modification date columns, for example), I advise you to go with that first.

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :

Asynchronous Change Data Capture Cookbook by Mr. Mark Van de Wiel – Feb 2006
Asynchronous Autolog Change Data Capture Cookbook by Mr. Mark Van de Wiel – Jan 2007
Mark Rittman’s article
Julian Dyke’s Streams and Supplemental Logging presentations
Streams and CDC demos at psoug.org
Streams related Oracle magazine article

March 20, 2008

My first video blog entry test – Oracle Quality and Performance for Free summary

Filed under: Oracle Other — H.Tonguç Yılmaz @ 6:19 am

After seeing Eddie Awad’s Oracle in 3 Minutes series I planned to play with the trial of Camtasia Player. Camtasia Player is a simple, standalone video player, optimized for playing screen recordings.

I receive lots of comments on why I am not writing in Turkish, my native language. So I decided to use this tool as an opportunity to summarize some of my most-read posts. After a small introduction to the tool, I directly tried it on my Oracle Quality and Performance for Free summary. My first recording experience was really fun, but I know I have to work more on my voice and the tool's options, like pointer style, resolution etc., for possible future videos. I am not sure if our community has a video blog, but after experiencing Camtasia I am sure that, for some of us, writing will be harder :)

Download and extract this zip file to a directory and use the oracle-for-free.html file to play the video. (size ~58 MB; the download password is "lnk_psswd")

Last week I spent a lot of my time on our multi-terabyte 10g billing system's migration from HP Tru64 to an IBM AIX environment. The Cross-Platform Transportable Tablespace and Data Pump migration options were not accepted by management because of their downtime requirements, so the project was based on a redo change data capture tool, GoldenGate; I will be blogging about this method and its risks later.

And I also hope to study Oracle's Warehouse Builder next week and blog about my experiences; with its new version OWB is much stronger now, not only for ETL purposes but also for our daily needs ;)

March 11, 2008

Alberto Dell’Era’s investigations and scripts on Oracle optimizer

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 7:18 am

When it comes to the Oracle optimizer, Jonathan Lewis, Christian Antognini and Wolfgang Breitling have the best resources I have ever read. And now I will add Alberto Dell’Era to the above three – http://www.adellera.it/investigations/index.html

PS: if you still haven't seen this video presentation, take a look ;) Optimising through Understanding by Jonathan Lewis

A little more on External Tables

Filed under: Oracle Business Intelligence,Oracle How To,Oracle Performance — H.Tonguç Yılmaz @ 5:27 am

I have written on external tables before;
External table’s performance for a row and undocumented Oracle functions
Unload Data with External Tables and Data Pump

Last week several questions from our ETL group led me to test external tables a little more, on Solaris 10 and 10g Release 2. Some of the interesting questions were;

a- can we read compressed text data over external tables,

b- can we read data over symbolically linked (ln -s) files with external tables,

c- what is the performance difference of accessing an external table compared to a heap-organized, partitioned table, and what are the access paths and join methods?

So let me share with you some of my findings parallel to the above questions;

a- Since the extracted files were huge and needed so much extra disk space to uncompress, this one was needed. Unix pipes are like magic; for example, you can export to a Unix pipe and read the import data through the pipe. Here is an example; http://www.orafaq.com/faqunix.htm#EXPORT

Within an external table creation DDL it is possible to use a Unix pipe instead of a regular file, and you can start an uncompress process writing to the pipe, with DBMS_SCHEDULER for example, before reading the external table. But external tables only work directly with text files as input, and this is documented; http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/tables.htm#sthref2448
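
A minimal sketch of the pipe approach, assuming a gzipped extract under /data and privileges to create a directory object; all paths and names below are hypothetical, and note that a pipe can be consumed only once, so the gunzip must be restarted before every read;

-- at the OS level, create the pipe and start feeding it:
-- $ mknod /tmp/ext_pipe p
-- $ gunzip -c /data/extract.txt.gz > /tmp/ext_pipe &

create or replace directory ext_dir as '/tmp';

create table ext_t (col1 number, col2 varchar2(30))
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('ext_pipe')
);

select count(*) from ext_t;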

b- The ETL tool we used created the extracted text files under different folders, with each partition's data under the same file name. So in order to read this information over an external table, even though the files were all uncompressed, we needed to create some symbolic links. But as the documentation linked above warns, even though the files were functional over the symbolic links at the operating system level, Oracle created the external tables without any error, yet when we tried to access the data the KUP-04001: error opening file error was inevitable.

Anyway, god bless sed and awk for getting this linking script output so easily!

Code Listing 208a – AWK and SED magic to create ln -s script from an Ab initio map file

Also here are resources for awk and sed on OTN;
AWK: The Linux Administrators’ Wisdom Kit By Emmett Dulaney
Using the sed Editor By Emmett Dulaney

c- The access paths for joining two external tables, or an external table with a heap-organized table, were HASH JOIN BUFFERED for equality conditions and SORT-MERGE JOIN for non-equality conditions; nothing unexpected here. I used SQL*Plus's timing and autotrace features, and also Kyte's runstats package, during my comparisons. With the autotrace statistics, recursive calls and consistent gets were the major differences, whereas in the runstats outputs the pga memory max, session pga memory and DB time statistics and the process queue reference latch were higher with external tables, and the table scan rows gotten and physical read bytes statistics were much higher with heap-organized tables. Still, the elapsed times of the heap-organized tables were ~5 to 10 times lower for the same operations compared to the external tables.

Code Listing 208b – Simple performance tests with External Tables

In summary, I can easily advise that external tables are not the best choice for data you will access several times, but they are there for our Create Table as Select (CTAS) statements, and compared to SQL*Loader the Select part of the CTAS is much more flexible for our transformation needs. Also, IMHO, tools perform well only when they are used for what they were advised and designed for, so feeling unsatisfied when an external table does not read compressed files directly is really funny to me.

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :
Managing External Tables

March 8, 2008

Hundreds of reasons why you might want (or have) to use R-Man!

Filed under: Oracle 10g New Features,Oracle How To — H.Tonguç Yılmaz @ 1:26 pm

When I was a kid I loved to watch He-Man, The Most Powerful Man in the Universe. Yes, R-Man is the most powerful database backup and recovery tool in the universe today. But there are still some DBAs who do not want to use RMAN. They are the same crowd who also do not use Database or Grid Control, Automatic Storage Management (ASM) or Automatic PGA Management, and resist migrating to 10g Release 2. Sometimes this resistance drives me nuts, because the resistance needs to create lots of myths to dirty everything it is afraid of.

Two commonly used alternatives to RMAN are file system backups and the storage vendors' snapshot capabilities, which usually cost a fortune. For me they cannot be real alternatives to RMAN, but in this post I won't be questioning these methods versus RMAN on points like how they can do block level recovery, or duplicate or validate a backup with a simple command. But I will provide two examples, since the resistance believes these two are myths of RMAN :)

1. The resistance cries out: "RMAN backing up only used blocks is a myth!" and R-Man answers them ;)

RMAN does not back up empty blocks which are above the high watermark and not formatted; this feature is called NULL COMPRESSION. As of 10g Release 2 RMAN also does not back up empty blocks below the high watermark, and this new feature is called UNUSED BLOCK COMPRESSION.

Code Listing 207a – RMAN UNUSED BLOCK COMPRESSION DEMO on Linux EE 10.2.0.3
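
If you cannot open the listing, the shape of such a demo is roughly as follows; the tablespace and table names are hypothetical, and the thing to watch is the BLOCKS column of V$BACKUP_DATAFILE (blocks actually written to the backup) shrinking against DATAFILE_BLOCKS after the delete;

-- in SQL*Plus: create and fill a test table, then empty it without moving the HWM
create table rman_t tablespace users as select * from all_objects;
-- RMAN> backup as backupset tablespace users;
delete from rman_t;
commit;
-- RMAN> backup as backupset tablespace users;

-- compare the two backups: blocks written versus blocks in the datafile
select completion_time, datafile_blocks, blocks
  from v$backup_datafile
 order by completion_time;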

The resistance used 10g XE for a similar demonstration, and since this feature does not function on XE, that was enough for them to build their own "scientific" myth. I love XE, but IMHO XE is not the place to do this kind of testing.

Code Listing 207b – RMAN UNUSED BLOCK COMPRESSION DEMO on Windows XE 10.2.0.1

Just a complementary note on Automatic Segment Space Management (ASSM) and the high watermark concept; ASSM was introduced with Oracle 9i Release 2, and it is the default segment space management when you create a new tablespace as of 10g Release 2. ASSM introduced two new high watermarks;
- High High Watermark (HHWM); all blocks above the HHWM are unformatted,
- Low High Watermark (LHWM); all blocks below the LHWM are formatted.
As a result we can now have unformatted blocks in the middle of a segment.

2. The resistance cries out: "During an RMAN backup redo generation is NOT less than during file system based backups!" and R-Man answers them ;)

If a tablespace is in backup mode, Oracle stops updating its file headers but of course continues to write data to the database files. In backup mode Oracle writes the complete changed block to the redo log files with the first DML on that block, whereas normally only the changes to a block are logged to the redo logs. This is done to enable the reconstruction of a block even if it was backed up only half-written, so the result is increased redo log activity and archiving during online backups.

Code Listing 207c – Redo Generation Cost of Backup Mode Demo
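
For those who cannot reach the listing, here is a minimal sketch of such a measurement, with a hypothetical table t in the USERS tablespace; the idea is to compare the session's redo size statistic for the same update with and without backup mode;

-- redo generated by the current session so far
select s.value redo_size
  from v$mystat s, v$statname n
 where n.statistic# = s.statistic# and n.name = 'redo size';

update t set col1 = col1;
commit;
-- note the redo size delta, then repeat the same update in backup mode
alter tablespace users begin backup;
update t set col1 = col1;
commit;
alter tablespace users end backup;
-- the second delta should be significantly larger, since the first change
-- to each block in backup mode logs the complete block image into the redo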

RMAN does not put tablespaces in backup mode, so there is no redo generation overhead like the one shown in the demo above. RMAN simply re-reads a database block until it gets a consistent image of it.

For much more on R-Man's strengths as of 10g Release 2, like the incremental merge of an ASM-based FRA (disk) backup, faster incremental backups with Block Change Tracking, and Backup Compression, please check these two resources;
Faster Backup, Faster Recovery By Arup Nanda

Oracle Database 10g Release 2 New Features in Backup and Recovery

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :
http://www.oracle.com/technology/pub/articles/10gdba/week9_10gdba.html
http://download.oracle.com/docs/cd/B19306_01/backup.102/b14191/osbackup004.htm#sthref1489
RMAN does (not?) backup empty blocks | what is NULL COMPRESSION ? | what is BINARY COMPRESSION ? | what is UNUSED BLOCK COMPRESSION ? « lutz hartmann as sysdba
http://www.oraclealchemist.com/oracle/hot-backups-redo-and-fractured-blocks/

March 6, 2008

10g’s ORA_HASH function to determine if two Oracle tables’ data are equal

Filed under: Oracle How To — H.Tonguç Yılmaz @ 12:18 pm

Before 10g, instead of the ORA_HASH SQL function we used the DBMS_UTILITY.GET_HASH_VALUE PL/SQL function for this need, of course with its performance cost. If you want to compare two tables' metadata or data, there are several open source options you may already be aware of;
– a SQL based row by row comparison
http://www.dbspecialists.com/scripts/compare_data.sql
– a SQL based metadata comparison
http://www.dbspecialists.com/scripts/compare_schemas.sql
– a summary of other options
http://dgielis.blogspot.com/2006/01/compare-2-oracle-schemas.html

But what if your tables are in different databases (with the network cost of database links) and they are hundreds of GBs each in size? If you do not have any large object type columns, and ORA_HASH's maximum bucket size of 4294967295 is enough for you to avoid hash collisions, then this can be one of the best (both accurate and fast) solutions (I guess; any comments?).

Code Listing 206a – 10g’s ORA_HASH function to determine if two Oracle tables’ data are equal
Code Listing 206b – A simple hash collision example with dbms_utility
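
As a hedged sketch of the idea (the listings above are the tested versions), an order-independent hash sum can be compared across a database link; emp and remote_db are hypothetical names, the '|' separator guards against ambiguous concatenations, and NULL columns would need an agreed representation;

select sum(ora_hash(empno || '|' || ename || '|' || to_char(sal), 4294967295)) hash_sum
  from emp;

select sum(ora_hash(empno || '|' || ename || '|' || to_char(sal), 4294967295)) hash_sum
  from emp@remote_db;

-- equal sums strongly suggest equal data (collisions remain possible),
-- while different sums prove that the tables differ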

[ NEW UPDATES MADE on April 13 2008 ]

During a row-by-row migration (strategies like unload on the source > ftp the text files to the target > load the text files in parallel on the target, or, as in our case, an insert into ... select from over a dblink plus applying the changes captured by a CDC tool on production during this logical data transfer), the need for validation becomes very important.

From my point of view, this validation must involve three major parts;

1- the metadata validation

In this part you must ensure that all objects, their statuses (constraints, triggers etc.) and their attributes (comments on tables etc.) are transferred exactly as they were in the source. Here you may also get assistance from the schema comparison plugins of favorite Oracle database development IDEs like PL/SQL Developer and Toad.
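
As a quick hedged example of such a check (source_db is a hypothetical dblink back to the source), a MINUS in both directions lists any objects that differ;

select object_type, object_name from user_objects
minus
select object_type, object_name from user_objects@source_db;

select object_type, object_name from user_objects@source_db
minus
select object_type, object_name from user_objects;

-- and no object should be left invalid on the target
select object_name, object_type from user_objects where status = 'INVALID';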

2- the data validation

In this part you must ensure that all rows are transferred as they were in the source. ORA_HASH is a fast and accurate tool here, as mentioned above. During this kind of validation, the first thing to be sure about is, of course, the NLS settings of your source and target;

from SQL*Plus compare the outputs of:
SHOW PARAMETER NLS_LANGUAGE
SHOW PARAMETER NLS_TERRITORY
SHOW PARAMETER NLS_LENGTH_SEMANTICS
SELECT name, value$ FROM SYS.PROPS$ WHERE name = 'NLS_CHARACTERSET';

from Unix shell compare the outputs of:
env | grep NLS

Below is a sample application that creates the validation scripts dynamically from the given parameters.

A data validation framework for 10g with ORA_HASH

If you want to use ORA_HASH with releases prior to 10g, you can create a temporary 10g instance near the 9i database host and use dblinks.

A data validation framework for 9i with ORA_HASH

3- the performance validation

After the transfer is validated successfully, since this is a logical (row-by-row) transfer you must also ensure that no performance degradation has crept in. New statistics for both the objects and the system must be gathered. Here ASSM tablespace management needs closer attention: since it works to reduce concurrent insert contention, the clustering factor of your indexes may change unexpectedly, which may especially hurt the costs of your index range scans, and as a result your execution plans may change after the migration. So you may prefer to use an APPEND hint, but not together with a PARALLEL hint, for your inserts during the migration.
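
As a quick hedged check after gathering the new statistics (source_db again being a hypothetical dblink), the clustering factors on both sides can be compared;

select index_name, clustering_factor, num_rows from user_indexes
minus
select index_name, clustering_factor, num_rows from user_indexes@source_db;

-- any index listed here has a changed clustering factor (or row count), so its
-- index range scan costs and the dependent execution plans deserve a closer look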

For more information on this topic please see Jonathan Lewis's book Cost-Based Oracle Fundamentals, Chapter 5 – The Clustering Factor (we are lucky, since the related chapter is available for download here :)

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:26043356526642
http://www.freelists.org/archives/oracle-l/03-2006/msg00812.html
Ertürk Diriksoy‘s comment on my email at group Kahin_Tartisma
http://psoug.org/reference/ora_hash.html

March 3, 2008

Oracle 10g New Features for Administrators by Ahmed Baraka

Filed under: Oracle 10g New Features — H.Tonguç Yılmaz @ 9:55 am

People who are preparing for the Oracle 10g certification exams will especially like this quick note, I guess. One of my colleagues shared this work by Mr. Ahmed Baraka on 10g New Features for Administrators last weekend, and I liked it very much, so I also wanted to share it here. For more of Baraka's work you may visit http://ahmedbaraka.com/computer/index.htm

And below are quick reminders for 10g NFs -
http://download.oracle.com/docs/cd/B19306_01/server.102/b14214/toc.htm
http://www.oracle.com/technology/pub/articles/10gdba/index.html
http://www.oracle-base.com/misc/OCPCertification.php
https://tonguc.wordpress.com/category/oracle-10g-new-features/
