Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

August 30, 2009

Discussing 11.1 Oracle Database Concepts in Turkish at oraclecommunity.net

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 4:26 pm

If you want to read and/or contribute you can join the OracleTURK group at oraclecommunity.net; the discussions are here:

http://www.oraclecommunity.net/group/OracleTURK/forum

Note: Ning is a Facebook-like place, so IMHO this is a good opportunity for Turkish-speaking Oracle database people to meet and become friends.


June 1, 2009

Event-Driven Applications: Introduction to Oracle EDA Suite

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 5:40 am

Last week I was reading IBM’s Real-Time Data Analysis announcement, which triggered me to ask the local Oracle office whether they have a similar offering. I am fairly aware of the features on the database side, thanks to the Concepts Guide and tahiti search. But especially in recent years, with Oracle’s acquisitions, the features Oracle now has outside the database are an important question mark for me.

The answer came quickly: Oracle Event Driven Architecture and, within it, the Complex Event Processing feature (formerly BEA WebLogic Event Server).

Even though I am a database guy, I have really liked the product so far. Maybe it is because the event processing language (EPL) is very similar to SQL, the tooling is Eclipse based (at least for now), the JRockit JVM can promise deterministic garbage collection, it has hot deployment capabilities, and it ships with a monitor->analyze->act business activity monitoring application (business-user-oriented dashboards and alerts without writing a single line of code). We planned to do a PoC, so maybe I will share some of my experiences with this product later on. To make you more interested, here are some concepts of this product (a small query sketch follows the list):
• Extend the relational model to support “continuous” query
• Construct “windows” over event streams
– Bounded by time or count
– Partitioned based on values
– Processed incrementally or in batches
– Defines a “working set” of events to then apply queries to
• Use relational operators to characterize the event streams
– Filtering, aggregation, and correlation of events
• Add pattern matching
– Supports track-and-trace scenarios (e.g. detecting missing events)
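
To give a feel for what a continuous query looks like, below is a minimal sketch of a windowed aggregation over a made-up StockTickStream. The stream name and the exact window syntax are my assumptions (the dialect differs between the original EPL and the later CQL), so treat it only as an illustration of the window concept above.

	-- moving average per symbol over the last 10 seconds of events,
	-- recomputed continuously as new events arrive (CQL-style range window)
	SELECT symbol, AVG(price) AS avg_price
	FROM StockTickStream [RANGE 10 SECONDS]
	GROUP BY symbol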

Pattern Matching Example (defined as regular expressions over in-memory continuous queries): Stock Trading “W” Pattern


	SELECT FIRST(x.time), LAST(z.time)
	FROM ticker MATCH_RECOGNIZE (
	        ONE ROW PER MATCH
	        PARTITION BY name
	        PATTERN (X+ Y+ W+ Z+)
	        DEFINE X AS (price < PREV(price)),
	               Y AS (price > PREV(price)),
	               W AS (price < PREV(price)),
	               Z AS (price > PREV(price)) )

So here is an Introduction to Oracle EDA Suite presentation you may want to check for more details.

Conclusions: being aware of the features you have already purchased is very important, both to avoid re-inventing a dumber wheel and to stay ahead of the competition on time to market. So I hope Oracle develops a Concepts Guide for its Fusion Middleware applications soon, for people like me :)

October 4, 2007

Oracle Concepts and Architecture – Part 5

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 6:00 am

Part 1 subjects –

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

Part 2 subject –

8- Oracle Database Architecture: Overview

Part 3 subjects –

9- How Atomicity is implemented on Oracle
10- NULL values and Oracle

Part 4 subject –

11- Overview of Transaction Management Internals

12- Concurrency versus Locking Concepts, Understanding Lock Contention with Examples

“Performance is about avoiding unnecessary work on limited system resources, whereas scalability is about avoiding blocking others doing the same operations (serialization). These need different kinds of attention; especially the second is very important for heavily loaded OLTP systems. Sometimes you may sacrifice one to gain in the other.” Memory Management and Latching Improvements in Oracle9i and 10g Presentation by Tanel Põder

Concurrency and locking are like Fenerbahçe (playing the part of concurrency here, of course :) and Galatasaray (playing the part of locking :); theirs is an eternal rivalry. It is critical to understand Oracle’s answers for building highly concurrent OLTP applications; you paid a lot for them, right :) I mentioned the importance of testing for concurrency before; with this post I want to focus on some conditions which reduce concurrency.

In Oracle you can develop highly concurrent database applications because:
1- Readers do not wait for other readers,
2- Writers do not wait for readers, and readers do not wait for writers,
3- Writers wait for other writers only if they attempt to change the same rows at the same time,
4- Oracle automatically locks data at the lowest level of restrictiveness.

The example below demonstrates how updates and inserts on the same row create blocking locks – Code Listing 147a – Locks on same row update and insert example
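
If you cannot reach the listing, here is a minimal two-session sketch of the same idea; the table and column are from the HR sample schema and the session split is indicated with comments.

-- Session 1
update hr.employees set salary = salary + 100 where employee_id = 100 ;
-- no commit yet, the row level (TX) lock is kept

-- Session 2
update hr.employees set salary = salary + 200 where employee_id = 100 ;
-- session 2 hangs here on "enq: TX - row lock contention"
-- until session 1 commits or rolls back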

There are some situations to be aware of –

A. The example below demonstrates that an update on row 101 waits for an update on row 100 of the hr.employees table – Code Listing 147b – Lock acquisition and blocking lock example

Because lock acquisition works under a FIFO policy, in the example the SX lock cannot be granted until the S lock has been released.

B. The example below demonstrates that locks are converted under certain circumstances – Code Listing 147c – Lock conversion and blocking lock example

In the example the first session holds an SX lock on both the parent and the child tables, while the third session this time holds an SS lock on the parent table. After the update statement in the first session, the lock cannot be converted and the first session hangs: the third session has not released its lock, and obtaining an SSX lock requires a conversion to take place.

C. The example below demonstrates how a bitmap index causes blocking locks on different rows – Code Listing 147d – Bitmap index blocking locks on different rows example

Bitmap indexes are best suited to OLAP. For OLTP, because of this locking behavior, they are an important subject for concurrency testing. Also check the related reference Bitmap Indexes by Jonathan Lewis
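
A minimal sketch of the bitmap index behaviour (the table and index names are mine, not the original listing):

drop table bitmap_lock_demo purge ;
create table bitmap_lock_demo ( id number, status varchar2(1) ) ;
insert into bitmap_lock_demo select rownum, 'A' from dual connect by level <= 10 ;
commit ;
create bitmap index bitmap_lock_demo_ix on bitmap_lock_demo ( status ) ;

-- Session 1
update bitmap_lock_demo set status = 'B' where id = 1 ;

-- Session 2 updates a different row but the same bitmap key value
update bitmap_lock_demo set status = 'B' where id = 2 ;
-- session 2 blocks, because both rows are covered by the same bitmap
-- index entry; with a b*tree index it would not block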

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used:
Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 4 Transaction Management
Chapter 13 Data Concurrency and Consistency

October 1, 2007

Oracle Concepts and Architecture – Part 4

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 5:57 am

Part 1 subjects –

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

Part 2 subject –

8- Oracle Database Architecture: Overview

Part 3 subjects –

9- How Atomicity is implemented on Oracle
10- NULL values and Oracle

11- Overview of Transaction Management Internals

Transactions are one of the major features that set a database apart from a file system. Their main purpose is to take the database from one consistent state to the next, guaranteeing that when you commit your work either all of your changes, including the various rules and checks that implement data integrity, have been saved, or none of them have. Individual statements are atomic as well.

I advise you to also view Julian Dyke’s great related presentation in slide show mode while reading the simplified descriptions below, for much deeper information on transaction internals in Oracle.

Overview of Transaction Management Internals

Undo segments are central to the database; they are used both for transaction rollback and recovery and for building consistent read copies of blocks.

a. A transaction begins when a slot is allocated in the transaction table at the head of an undo segment. The transaction identifier (txid = undo_segment_number.transaction_slot_id.wrap#) is a pointer to this location (the small query below shows these components from v$transaction).
b. Before a transaction inserts, updates or deletes a table row, an Interested Transaction List (ITL) entry is allocated in the block containing the row. The ITL entry contains the transaction identifier and is used to mark the row as locked until the transaction is either committed or rolled back.
c. When the change is applied to the block, undo information is also generated and stored in the undo segment. The transaction table slot contains a pointer to this undo information.
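
A minimal sketch to see the components of the transaction identifier for your own open transaction (the column names are my reading of v$transaction and v$session on 10g):

-- open a transaction first, for example
update hr.employees set salary = salary where employee_id = 100 ;

-- undo segment number, slot and sequence (wrap) of the current transaction
select xidusn, xidslot, xidsqn, ubafil, ubablk
from   v$transaction
where  addr = ( select taddr
                from   v$session
                where  sid = sys_context( 'userenv', 'sid' ) ) ;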

Read consistency

When a different session wants to read the same block and the first transaction has not yet ended, the session reading the block (most probably still in the cache) first finds the open ITL entry. It checks the transaction status by reading the undo segment header and finds that the transaction is still active. This means the session must create a snapshot of the block and roll back all the changes made by the active transaction. This task is called making a consistent read (CR) copy of the block and is achieved by first cloning the block and then rolling back the latest changes by applying undo records from the undo segment. Even in this very simplified explanation, CR requires I/O to the data block, the undo segment header and the undo records.
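
A quick two-session sketch of consistent read at work; the statistic names below are my reading of v$statname on 10g, so verify them on your own release.

-- Session 1: change a row but do not commit
update hr.employees set salary = 99999 where employee_id = 100 ;

-- Session 2: still sees the pre-change value, rebuilt from undo
select salary from hr.employees where employee_id = 100 ;

-- Session 2: the CR work shows up in the session statistics
select n.name, s.value
from   v$mystat s, v$statname n
where  s.statistic# = n.statistic#
and    n.name in ( 'consistent gets',
                   'data blocks consistent reads - undo records applied' ) ;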

Locking

This time let’s suppose another session wants to change (update or delete) the row that was changed by the first transaction, and the first transaction has not yet ended. Again the session reading the block (most probably still in the cache) finds the open ITL entry and checks the transaction status by reading the undo segment header. Finding that the transaction is still active, it starts to wait on the transaction table slot for the transaction to complete (commit or rollback).
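
While the second session is waiting you can observe the blocker from a third session; a minimal sketch:

-- who is blocked and by whom (blocking_session is available since 10g)
select sid, event, blocking_session, seconds_in_wait
from   v$session
where  blocking_session is not null ;

-- the TX enqueue itself: the blocker holds it (lmode 6),
-- the waiter requests it (request > 0)
select sid, type, id1, id2, lmode, request
from   v$lock
where  type = 'TX' ;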

Commit and Delayed Block Cleanout(DLBC)

Now let’s suppose the first active transaction commits. This is recorded immediately by marking the transaction table slot as inactive; however, the data block itself may not be updated until later, meaning the ITL entry in the block may remain open for some time after the commit. When Oracle later reads this block for some reason, the transaction table is checked, the commit is confirmed and this time the ITL entry is also closed. This behaviour is known as delayed block cleanout. If the undo segment has since been dropped, the SCN recorded in undo$ is used to confirm the commit.

Transaction Recovery: Rollback – Process Crash (PMON) – Instance Crash (SMON)

When a rollback statement is issued, the undo records for the current transaction are scanned in reverse order (latest first) and applied back. When the rollback statement returns successfully, all the block changes have been undone and the ITL entry has been cleared; there is no delay for rollback. This process also generates new redo, which is written to the redo logs.

If the server (shadow) process somehow crashes with an active transaction, PMON detects this and rolls back the transaction. Events 10012 and 10246 can be set and used to monitor this.

If the instance crashes there is no chance to roll back the active transactions. The next time the database is opened, the SMON background process performs crash recovery and returns the blocks to their pre-crash consistent state. Active transactions in the SYSTEM undo segment are rolled back immediately, whereas active transactions in other undo segments are marked as “dead” and SMON scans these segments at a later time to perform the rollback, since the database needs to start up as fast as possible. To list the dead transactions either run

select * from x$ktuxe where ktuxecfl = 'DEAD' ;

or use

alter system dump undo header -- here dead transactions are identified by having cflg = 0x10

Problematic cases

The operations above need access to the undo segment header, the undo records and the related data blocks for ITL information. If any of these are corrupted or lost, the operations will be affected. Event 10013 monitors transaction recovery during startup, and event 10015 dumps the undo segment headers listed in undo$ before and after transaction recovery.

Also, only as a last resort and with the assistance of Oracle Support, the “_offline_rollback_segments” and “_corrupted_rollback_segments” hidden parameters can be used. When the database opens, a list of offline and corrupted undo segment numbers (USNs) is constructed. If you use these two hidden parameters you change the default behaviour described above: the transaction table is not read during startup, so the transactions are not marked as dead or rolled back. The segments listed in the parameters appear as OFFLINE in dba_rollback_segs (undo$) and cannot be acquired for new transactions. This is an unsupported state, so media recovery is preferred instead.

Continue reading with part 5

References Used:
Oracle® Database Concepts 10g Release 2 (10.2)
Chapter 4 Transaction Management
Chapter 13 Data Concurrency and Consistency

September 17, 2007

Column retrieval cost in CBO calculation

Filed under: Oracle Concepts and Architecture,Oracle Performance — H.Tonguç Yılmaz @ 1:59 pm

Same database, same table, same data, same statistics – so where does the difference below come from?
Code Listing 141 – Column retrieval cost in CBO calculation demo

Summary –
for column c1 : elapsed time – 00.07 / cost – 5104 / consistent gets – 18575
for column c2 : elapsed time – 00.10 / cost – 5105 / consistent gets – 26811

Joze Senegacnik, in his Optimizing for Performance 2-day seminar at Turkcell Academy, Istanbul (23-23/08/2007), mentioned that each subsequently parsed column introduces a cost of 20 CPU cycles, regardless of whether it will be extracted or not;

“The answer is the cost of LIO. A CBO trace (event 10053) shows a difference in CPU cost – additional 20 cycles for the second column. In my presentation “Access Path Optimization” presented at UKOUG, Collaborate06 I have a demo case with a table with 250 columns where the difference between selecting from the first and last column is almost 50% increased elapsed time with no PIO, only LIO.

It is important to know that the CBO, when using the new cost model which includes also the CPU time, is aware of the fact of increased CPU consumption for each column retrieved. In reality the kernel has to parse the row data due to the variable length of columns (either VARCHAR2 or NUMBER) and it adds 20 cycles per row retrieved. For instance: retrieving first column costs 20, retrieving tenth column costs 200 – and kernel has to walk through the row data across all columns to finally reach the tenth column – and this is the difference.”
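
A simplified sketch of such a test; the table is mine and much smaller than Joze’s 250-column case, so the absolute numbers will differ, but the pattern (elapsed time and the CPU component growing with column position) should be visible.

drop table col_pos_demo purge ;
create table col_pos_demo as
select rpad( 'x', 20, 'x' ) c1, rpad( 'x', 20, 'x' ) c2, rpad( 'x', 20, 'x' ) c3
from   dual connect by level <= 100000 ;
exec dbms_stats.gather_table_stats( user, 'COL_POS_DEMO' )

set timing on
select count(c1) from col_pos_demo ;   -- first column, cheapest to reach
select count(c3) from col_pos_demo ;   -- last column, the kernel walks over c1 and c2
set timing off

-- the per-column CPU component is visible in a CBO (10053) trace
alter session set events '10053 trace name context forever, level 1' ;
explain plan for select count(c3) from col_pos_demo ;
alter session set events '10053 trace name context off' ;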

Additional readings :
Optimizer debug trace event – 10053 trace file
How to migrate to system statistics with DBMS_STATS

August 17, 2007

Direct Path Load Operations versus Integrity Constraints, Triggers and Instance Recovery

Filed under: Oracle Concepts and Architecture,Oracle How To — H.Tonguç Yılmaz @ 8:02 am

Recently two different forum notes triggered me to write this post –
1. http://forums.oracle.com/forums/message.jspa?messageID=2012276
2. http://www.ceturk.com/forum/forum_posts.asp?TID=11131

Direct-path terminology in Oracle is used when a session reads buffers from disk directly into the PGA (as opposed to the buffer cache in the SGA). During direct-path INSERT operations, the database appends the inserted data after the existing data in the table. Data is written directly into the datafiles, bypassing the buffer cache. Free space within the existing data is not reused, and referential integrity constraints are ignored. You may prefer a direct path load when you have a large amount of data to load quickly and you want to load it in parallel for maximum performance, but there are costs to be aware of.
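
For the INSERT side, the direct path is requested with the APPEND hint (or implicitly with parallel DML); a minimal sketch with made-up table names:

-- conventional insert: goes through the buffer cache and reuses free space
insert into sales_hist select * from sales_stage ;

-- direct-path insert: formatted blocks are written above the high-water mark,
-- bypassing the buffer cache
insert /*+ APPEND */ into sales_hist select * from sales_stage ;
commit ;  -- the table cannot be queried in the same transaction before the commit (ORA-12838)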

The picture and definitions below are the related wait events from the Performance Tuning Guide –
Listing 113 – Scattered Read-Sequential Read-Direct Path Read
* db file sequential read(single block read into one SGA buffer)
* db file scattered read(multiblock read into many discontinuous SGA buffers)
* direct read(single or multiblock read into the PGA, bypassing the SGA)

With the conventional path load method, arrays of rows are inserted with standard SQL INSERT statements, and integrity constraints and insert triggers are automatically applied. But when you load data with the direct path, SQL*Loader disables some integrity constraints and all database triggers. The constraints that remain in force are:
* NOT NULL
* UNIQUE
* PRIMARY KEY (unique-constraints on not-null columns)
and the following constraints are automatically disabled by default:
* CHECK constraints
* Referential constraints (FOREIGN KEY)

To start SQL*Loader in direct path load mode, we set the DIRECT parameter to true on the command line or in the parameter file. The SQL*Loader log file describes the constraints that were disabled, the ones that were re-enabled, and what error, if any, prevented re-enabling or validating each constraint. It also contains the name of the exceptions table specified for each loaded table.
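
A minimal, made-up control file and the corresponding command lines might look like this (file, table and credential names are only placeholders):

-- employees_stage.ctl
LOAD DATA
INFILE 'employees_stage.dat'
APPEND
INTO TABLE hr.employees_stage
FIELDS TERMINATED BY ','
( employee_id, first_name, last_name, hire_date DATE "YYYY-MM-DD" )

-- conventional path load
$ sqlldr hr/hr control=employees_stage.ctl log=conv.log

-- direct path load, the log file will list the disabled constraints
$ sqlldr hr/hr control=employees_stage.ctl log=direct.log direct=true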

In the demo below you can see that a SQL*Loader direct path load
– disables the foreign key constraint,
– ignores the foreign key constraint,
– ignores the trigger
whereas a direct path insert just ignores the foreign key constraint;
Code Listing 113 – Direct Path Load Demo

Because SQL*Loader writes directly to the database files, all rows inserted up to the last data save will automatically be present in the database files if the instance is restarted. Changes do not need to be recorded in the redo log file to make instance recovery possible. But if an instance failure occurs, the indexes being built may be left in an Index Unusable state, and unusable indexes must be rebuilt before you can use the table or partition. If redo log archiving is not enabled (you are operating in NOARCHIVELOG mode), then media recovery is of course not possible. Also, if the database or tablespace is in FORCE LOGGING mode, then a direct path operation always produces redo, regardless of the NOLOGGING settings.

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used:
Oracle® Database Utilities 10g Release 2 (10.2) – Chapter 11 Conventional and Direct Path Loads

February 11, 2007

Automatic Index Statistics Gathering and Minimizing Over Indexing

In order to gather statistics on an index we may use the Oracle supplied package DBMS_STATS:


SQL> Execute DBMS_STATS.GATHER_INDEX_STATS ('HR','LOC_COUNTRY_IX');
    

or gather statistics at index creation time:


SQL> Create index hr.loc_country_ix … compute statistics;
    

or gather statistics when rebuilding an index:


SQL> Alter index hr.loc_country_ix rebuild compute statistics;
    

But starting with 10g, even if you do not ask for it, statistics are collected by default;

Code listing 66a : Automatic Index Statistics Gathering with 10g Example

Indexing strategies are very important for OLTP applications, but over indexing hurts DML operations. To minimize over indexing, Oracle, where possible, uses an already created index to enforce primary or unique keys;

Code listing 66b : Minimizing Over Indexing Example
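
A small sketch of the idea, with my own table names; here the constraint is told to reuse the existing index explicitly with USING INDEX, and with 10g Oracle will also pick up a usable existing index on its own:

drop table over_index_demo purge ;
create table over_index_demo ( id number, val varchar2(10) ) ;

-- an index on ID already exists ...
create index over_index_demo_ix on over_index_demo ( id ) ;

-- ... so the primary key constraint reuses it instead of creating a second one
alter table over_index_demo add constraint over_index_demo_pk
      primary key ( id ) using index over_index_demo_ix ;

-- still only one index on the table
select index_name, uniqueness
from   user_indexes
where  table_name = 'OVER_INDEX_DEMO' ;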

Also, beginning with Oracle9i, you may identify unused indexes. Statistics about the usage of an index can be gathered and displayed in V$OBJECT_USAGE. If the information gathered indicates that an index is never used over some period of time, the index can be dropped. Each time the MONITORING USAGE clause is specified, V$OBJECT_USAGE is reset for the specified index: the previous information is cleared and a new start time is recorded;

Code listing 66c : Minimizing Over Indexing with Index Monitoring and V$OBJECT_USAGE
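
A minimal sketch of the monitoring cycle (note that V$OBJECT_USAGE only shows indexes owned by the connected schema, so run it as the index owner):

-- start monitoring, this also resets any previous usage information
alter index loc_country_ix monitoring usage ;

-- ... let a representative workload run ...

select index_name, monitoring, used, start_monitoring, end_monitoring
from   v$object_usage
where  index_name = 'LOC_COUNTRY_IX' ;

-- stop monitoring when you have your answer
alter index loc_country_ix nomonitoring usage ;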

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used:
Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)-Chapter 2 Designing and Developing for Performance

January 28, 2007

Some things never change, be careful

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 9:24 pm

In one of my previous posts I wanted to draw attention to “change”: technology changes very fast, keep UNlearning. But some concepts do not change :)

In Oracle, if you use Data Definition Language (DDL) in your code, your transaction logic will be broken, because there is an implicit COMMIT immediately before the parse and execute phases of the DDL;


Begin
   Commit; -- this breaks the transaction, whatever the outcome of your DDL will be
   Parse DDL; -- syntax and privileges are checked here
   Begin
      Execute DDL;
      Commit;
   Exception
      When Others Then
        Rollback;
   End;
End;

Code listing 65a : Using DDL in a transaction Example

Another important concept in Oracle is that a full table scan always reads all the blocks under the high-water mark of the related segment.

Code listing 65b : Full Table Scan and High Water Mark Relation Example
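
A small sketch of the high-water mark effect, using my own table copied from all_objects:

drop table hwm_demo purge ;
create table hwm_demo as select * from all_objects ;

set autotrace traceonly statistics
select count(*) from hwm_demo ;   -- note the consistent gets of the full scan

delete from hwm_demo ;
commit ;
select count(*) from hwm_demo ;   -- zero rows, but roughly the same consistent
                                  -- gets: the high-water mark did not move

alter table hwm_demo move ;       -- reorganize the segment
select count(*) from hwm_demo ;   -- now the full scan is cheap again
set autotrace off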

Like DDL, you must also be careful with DELETEs in your applications; especially after huge DELETEs you need to perform CTAS or MOVE operations on the table if you need FTS performance. Or, most probably, it would be better to use partitioning instead of these DELETE operations :)

The use of SQL directly in triggers may result in excessive parsing problems; move the SQL in the triggers into PL/SQL units and call them from the triggers;

Code listing 65c : Move SQL out of the trigger Example
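
A minimal sketch of the idea; the audit table, procedure and trigger names are made up for the example:

-- assumed audit table for the example
create table salary_audit ( employee_id number, new_salary number, change_date date ) ;

-- the SQL lives in a stored procedure, so its cursor is compiled and cached once
create or replace procedure log_salary_change( p_empno number, p_sal number ) as
begin
   insert into salary_audit ( employee_id, new_salary, change_date )
   values ( p_empno, p_sal, sysdate ) ;
end ;
/

-- the trigger body only calls the procedure
create or replace trigger employees_sal_trg
after update of salary on employees
for each row
begin
   log_salary_change( :new.employee_id, :new.salary ) ;
end ;
/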

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used:
Effective Oracle by Design by Thomas Kyte Chapter 5- Statement Processing, Page 264
Effective Oracle by Design by Thomas Kyte Chapter 8- Effective SQL, Page 453
Effective Oracle by Design by Thomas Kyte Chapter 5- Statement Processing, Page 299

January 13, 2007

Oracle Concepts and Architecture – Part 3

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 6:29 am

In part 1 I wrote about the subjects below;

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

In part 2 I wrote about the subject below;

8- Oracle Database Architecture: Overview

In this post I want to discuss atomicity and NULL values in Oracle.

9- How Atomicity is implemented on Oracle

One of the primary tests for DBMS reliability is what’s known as the ACID test. ACID stands for Atomicity, Consistency, Isolation and Durability. ACID-compliant systems are as close as you can get to a guarantee of not losing your data. Essentially, as long as your database files are intact, you are guaranteed that your data is consistent. The word “atomic” means indivisible, and each transaction is “atomic”: atomicity is a short-hand way of saying that when a transaction that changes data occurs, either the whole transaction occurs or none of it occurs.

Atomicity is implemented in Oracle (and most other DBMSs) by commit and rollback statements. Oracle does not return from a COMMIT statement until the redo log changes corresponding to that commit have been physically written to disk, not just to the log buffer. Durability, the final property in the ACID test, means that any changes made by committed transactions are persistent: they are permanent, and if a failure occurs, the information in the database will reflect all transactions committed before the failure.

I love the atomicity examples in the transactions chapter of Thomas Kyte’s book; they really improve understanding;


DROP TABLE t2 PURGE ;
CREATE TABLE t2 ( count INT );
INSERT INTO t2 VALUES ( 0 );

DROP TABLE t1 PURGE ;
CREATE TABLE t1 ( x INT CHECK ( x>0 ) );
CREATE TRIGGER t1_trigger
BEFORE INSERT OR DELETE ON t1
FOR EACH ROW
BEGIN
IF ( INSERTING ) THEN
UPDATE t2 SET count = count + 1;
ELSE
UPDATE t2 SET count = count - 1;
END IF;
DBMS_OUTPUT.PUT_LINE( 'I fired and updated ' || SQL%ROWCOUNT || ' rows' );
END;
/

SET SERVEROUTPUT ON;
INSERT INTO t1 VALUES ( 1 );
I fired and updated 1 rows

1 row created.

INSERT INTO t1 VALUES ( -1 );
I fired and updated 1 rows
INSERT INTO t1 VALUES ( -1 )
*
ERROR at line 1:
ORA-02290: check constraint (HR.SYS_C005726) violated

SELECT * FROM t2;

COUNT
----------
1

Initially the row in t2 was 0; after the first successful insert the trigger incremented it to 1. With the second insert the ORA-02290 error was received, and since the trigger is part of the statement’s atomicity its update is also rolled back along with the second insert statement, so the value in t2 still remains 1. In the second example we will use a procedure to show the same behavior;


CREATE OR REPLACE PROCEDURE p AS
BEGIN
INSERT INTO t1 VALUES ( 1 );
INSERT INTO t1 VALUES (-1 );
END;
/

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
1

BEGIN
p;
END;
/
I fired and updated 1 rows
I fired and updated 1 rows
BEGIN
*
ERROR at line 1:
ORA-02290: check constraint (HR.SYS_C005726) violated
ORA-06512: at “HR.P”, line 5
ORA-06512: at line 2

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
1

This time the first insert is also rolled back, since the procedure call is atomic. OK, now let’s see how exception handling affects the behaviour;


BEGIN
p;
EXCEPTION
WHEN OTHERS THEN NULL; -- black hole
END;
/
I fired and updated 1 rows
I fired and updated 1 rows

PL/SQL procedure successfully completed.

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
2

Since the constraint exception is handled, Oracle does not roll back this time and the value in t2 is 2. Be careful with your exception handling strategy: it gives flexibility to the developer, but used carelessly it may break atomicity and cause corrupted data.

And especially a “when others” exception block with a “null” body must only be used for testing purposes; it is as dangerous as a bomb in your code.

Another great example of statement-level atomicity is this update statement;


DROP TABLE t3 PURGE ;
CREATE TABLE t3 ( x INT UNIQUE );
INSERT INTO t3 VALUES ( 1 );
INSERT INTO t3 VALUES ( 2 );

UPDATE t3 SET x = x + 1;
commit ;

select * from t3;

X
----------
2
3

Because of the unique constraint, Oracle would not be able to do this update if the violation were checked after each row instead of at the end of the statement.

10- NULL values and Oracle

Allowing NULL values into your columns introduces a whole new degree of uncertainty into your database. The primary reason is that comparing NULL to a value results in neither TRUE nor FALSE; in fact we may say it is UNKNOWN. Here is a simple demonstration;


set serveroutput on
create or replace procedure IsNotApple(b varchar2) as
begin
if b !='Apple' then
dbms_output.put_line('Not Apple');
else
dbms_output.put_line('Apple');
end if;
end;
/

begin
isNotApple(Null);
end;
/
Apple

PL/SQL procedure successfully completed

Oracle implements NULL values as ANSI SQL requires them to be implemented. But databases are different; experience in one will only partly carry over to another. You must be ready for some fundamental differences, such as locking, as well as some very minor ones, such as the treatment of NULLs in this example, which can have unexpected impacts on your applications. Being aware of the database, how it works and how its features are implemented is the only way to overcome these issues; again another “read the documentation” conclusion :)


drop table tab_null_test purge ;
create table tab_null_test ( team_no number primary key, team_name varchar2(64) unique, availability varchar2(1));
insert into tab_null_test values (1, 'Bir', NULL);
insert into tab_null_test values (2, 'Iki', 'Y');
insert into tab_null_test values (3, 'Uc', 'N');
commit;
analyze table tab_null_test compute statistics for table for all columns;

set null ^ -- will print "^" instead of NULL
select * from tab_null_test ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^
2 Iki Y
3 Uc N

select count(team_no) from tab_null_test ;

COUNT(TEAM_NO)
--------------
3

select count(availability) from tab_null_test ;

COUNT(AVAILABILITY)
-------------------
2

select * from tab_null_test where availability <> 'Y' ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
3 Uc N

select * from tab_null_test where nvl(availability, 'N') <> 'Y' ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^
3 Uc N

select * from tab_null_test where availability = NULL ;

no rows selected

select * from tab_null_test where availability <> NULL ;

no rows selected

select * from tab_null_test where availability IS NULL ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^

Also, if you need to find NULL values using B*tree indexes in Oracle, you must be careful;


drop table tab_null_test purge;
create table tab_null_test ( x int, y int );
create unique index unique_tab_null_test on tab_null_test(x,y);
insert into tab_null_test values ( 1, 1 );
insert into tab_null_test values ( 1, NULL );
insert into tab_null_test values ( NULL, 1 );
insert into tab_null_test values ( NULL, NULL );
analyze table tab_null_test compute statistics for table for all columns;
analyze index unique_tab_null_test validate structure;
select name, lf_rows from index_stats;

NAME LF_ROWS
------------------------------ ----------
UNIQUE_TAB_NULL_TEST 3

The table has four rows, whereas the index only has three. The first three rows, where at least one of the index key elements was not Null, are in the index. The last row, with (NULL, NULL) is not in the index.


insert into tab_null_test values ( NULL, NULL );

1 row created.

insert into tab_null_test values ( NULL, 1 );
*
ERROR at line 1:
ORA-00001: unique constraint (HR.UNIQUE_TAB_NULL_TEST) violated

insert into tab_null_test values ( 1, NULL );
*
ERROR at line 1:
ORA-00001: unique constraint (HR.UNIQUE_TAB_NULL_TEST) violated

select x, y, count(*)
from tab_null_test group by x,y
having count(*) > 1;

X Y COUNT(*)
---------- ---------- ----------
^ ^ 2

In Oracle, (NULL, NULL) is indeed not considered equal to (NULL, NULL). That is something to consider: each unique constraint should have at least one NOT NULL column in order to be truly unique.


set autotrace traceonly explain
select * from tab_null_test where x is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 3396731321

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TAB_NULL_TEST | 3 | 6 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X" IS NULL)

set autotrace off
drop table tab_null_test purge;
create table tab_null_test ( x int, y int NOT NULL );
create unique index unique_tab_null_test on tab_null_test(x,y);
insert into tab_null_test values ( 1, 1 );
insert into tab_null_test values ( NULL, 1 );
analyze table tab_null_test compute statistics for all columns;
set autotrace traceonly explain
select * from tab_null_test where x is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 2452341312
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| UNIQUE_TAB_NULL_TEST | 1 | 4 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("X" IS NULL)

NULLs can also affect the outcome of analytic functions, especially when you use a descending sort. By default, NULLs sort as greater than any other value;


set autot off
drop table emp purge ;
create table emp as select employee_id empno,
first_name||' '||last_name ename, commission_pct comm from employees
order by employee_id ;
analyze table emp compute statistics ;
select ename, comm from emp order by comm desc;

ENAME COMM
---------------------------------------------- ----------
Steven King ^
Neena Kochhar ^
Lex De Haan ^
Alexander Hunold ^
Bruce Ernst ^
David Austin ^
Valli Pataballa ^
Diana Lorentz ^
Nancy Greenberg ^
Daniel Faviet ^
John Chen ^
...
William Smith ,15
Oliver Tuvault ,15
Mattea Marvins ,1
David Lee ,1
Sundar Ande ,1
Charles Johnson ,1
Sundita Kumar ,1
Amit Banda ,1

107 rows selected.

select ename, comm, dr
from (select ename, comm, dense_rank() over (order by comm desc) dr from emp)
where dr <= 5 order by comm;

ENAME COMM DR
---------------------------------------------- ---------- ----------
David Bernstein ,25 5
Peter Hall ,25 5
Clara Vishney ,25 5
Sarath Sewall ,25 5
Alyssa Hutton ,25 5
Lisa Ozer ,25 5
Peter Tucker ,3 4
Lindsey Smith ,3 4
Ellen Abel ,3 4
Gerald Cambrault ,3 4
Alberto Errazuriz ,3 4
Karen Partners ,3 4
Louise Doran ,3 4
Patrick Sully ,35 3
Janette King ,35 3
Allan McEwen ,35 3
John Russell ,4 2
Steven King ^ 1
Neena Kochhar ^ 1
Lex De Haan ^ 1
Alexander Hunold ^ 1
Bruce Ernst ^ 1
...

While this may technically be “correct” it probably is not what you really wanted. You would either not want NULLS to be considered at all or to have NULLS be interpreted as “small” in this case. So, we can either remove NULLS from consideration, using “where comm is not null”:


select ename, comm, dr
from (select ename, comm, dense_rank() over (order by comm desc) dr
from emp where comm is not null)
where dr <= 5 order by comm desc;

ENAME COMM DR
---------------------------------------------- ---------- ----------
John Russell ,4 1
Janette King ,35 2
Allan McEwen ,35 2
Patrick Sully ,35 2
Karen Partners ,3 3
Alberto Errazuriz ,3 3
Peter Tucker ,3 3
Gerald Cambrault ,3 3
Louise Doran ,3 3
Ellen Abel ,3 3
Lindsey Smith ,3 3
...

or we can use the NULLS LAST extension to the ORDER BY clause:


select ename, comm, dr
from (select ename, comm,dense_rank() over (order by comm desc nulls last) dr
from emp)
where dr <= 5 order by comm desc;

ENAME COMM DR
---------------------------------------------- ---------- ----------
John Russell ,4 1
Janette King ,35 2
Patrick Sully ,35 2
Allan McEwen ,35 2
Louise Doran ,3 3
Ellen Abel ,3 3
Peter Tucker ,3 3
Gerald Cambrault ,3 3
Lindsey Smith ,3 3
Karen Partners ,3 3
Alberto Errazuriz ,3 3
...

It should be noted that NULLS LAST works on a normal ORDER BY as well; its use is not limited to analytic functions.

Aggregate functions like SUM, COUNT and AVG are also affected by NULL values. Depending on your need you may prefer to use the NVL function;


SQL> select avg(salary), avg(nvl(salary, 0)) from hr.employees ;

AVG(SALARY) AVG(NVL(SALARY,0))
----------- ------------------
9304,822429   9304,82242990654

SQL> update hr.employees set salary = NULL where employee_id = 100 ;

1 row updated

SQL> select avg(salary), avg(nvl(salary, 0)) from hr.employees ;

AVG(SALARY) AVG(NVL(SALARY,0))
----------- ------------------
9066,566037   8981,83177570093

Continue reading with part 4

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used:
“Null” thread on AskTom
Expert one-on-one Oracle by Thomas Kyte Database Independence?, Page 57 – Indexes and Nulls, Page 378 – NULLS and Sorting, Page 731
http://www.databasedesign-resource.com/null-values-in-a-database.html
Nulls: Nothing to Worry About By Lex de Haan and Jonathan Gennick
Title is NULL by APC

January 6, 2007

Oracle Concepts and Architecture – Part 2

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 1:31 pm

In part 1 I started to discuss the subjects below;

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

I will continue with an Oracle Database Architecture overview;

8- Oracle Database Architecture: Overview

a) Oracle database and an Oracle instance

The database consists of physical structures such as;
Control files: These files contain data about the database itself. These files are critical to the database. Without them you cannot open the database. Check view v$controlfile for your controlfile settings.
Data files: These files contain the data of the database. Check view v$datafile for your data file settings.
Online redo log files: These files allow for instance recovery of the database. If the database were to crash, the database can be recovered with the information in these files. Check view v$logfile and v$log for your redo log file settings.

There are other files that are not officially part of the database but are important to the successful running of the database:
Parameter file: The parameter file is used to define how the instance is configured on startup. Check view v$parameter for your parameter settings.
Password file: This file enables super users to connect remotely to the database and perform administrative tasks.
Archive log files: These files ensure database recovery and are copies of the online redo log files. Using these files and a backup of the database, you can recover a lost data file. Check view v$archived_log for your archived log settings (the query sketch after this list pulls these views together).
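
The v$ views mentioned above can be pulled together into one quick inventory query; a small sketch:

select 'controlfile' file_type, name from v$controlfile
union all
select 'datafile', name from v$datafile
union all
select 'online redo log', member from v$logfile
union all
select 'archived log', name from v$archived_log ;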

The instance consists of memory structures such as System Global Area(SGA) and Program Global Area(PGA) and background processes that perform tasks within the database as well as the server processes that are initiated for each user session.

The size and structure of an Oracle database and instance impact performance. The physical structure of the database impacts the I/O to hard disks. It is therefore important to both size and place the physical files in such a way that I/O across disks is distributed evenly and waits are minimized. The size of the various memory areas of the instance directly impacts the speed of SQL processing.

The default Oracle database, created by the Oracle Universal Installer(OUI), is preconfigured with initial settings for the memory parameters. The performance of the database depends on the sizing of these memory parameters, so you should fine tune it to meet the requirements of your growing database.

Two memory parameters, PGA_AGGREGATE_TARGET and SGA_TARGET, are provided that allow the database to automatically resize the memory structures within the SGA and PGA. These parameters can be set based on the recommendations of Automatic Database Diagnostics Monitor(ADDM), which is available with the Enterprise Edition of Oracle Database 10g, or you can manually run several advisors and use the combined recommendations of these advisors to set the sizes appropriately.

The basic memory structures associated with an Oracle instance include:
System Global Area(SGA): Shared by all server and background processes
Program Global Area(PGA): Exclusive to each server and background process. There is one PGA for each server process.

The System Global Area(SGA) consists of the following data structures:
Database buffer cache: Caches blocks of data retrieved from the data files
Redo log buffer: Caches redo information (used for instance recovery) until it can be written to the physical redo log files stored on disk
Shared pool: Caches various constructs that can be shared among users
Large pool: Optional area in the SGA that provides large memory allocations for Oracle backup and restore operations, I/O server processes, and session memory for the shared server
Java pool: Used for all session-specific Java code and data within the Java Virtual Machine (JVM)
Streams pool: Used by Oracle Streams

The Program Global Area(PGA) is a memory region which contains data and control information for each server process. A server process is a process that services a client’s requests. Each server process has its own private PGA that is created when the server process is started. Only a server process can access its own PGA. Generally, the PGA contains the following:
Private SQL area: Contains data such as bind information and run-time memory structures. Each session that issues a SQL statement has a private SQL area.
Session memory: Memory allocated to hold session variables and other information related to the session

b) Connecting to an Instance

When a user starts a tool such as SQL*Plus or connects to the database using an application, the application or tool is executed in a user process. When a user actually logs on to the Oracle database, a process is created on the computer running the Oracle database. The listener on the Oracle database actually establishes the connection and directs the request to an available server process. The server process communicates with the Oracle instance on behalf of the user process that runs on the client. The server process executes SQL statements on behalf of the user.

A connection is a communication pathway between a user process and an Oracle database. A database user can connect to an Oracle database in one of three ways:
• The user logs on to the machine running the Oracle instance and starts an application or tool that accesses the database on that system. The communication pathway is established using the interprocess communication mechanisms available on the host operating system.
• The user starts the application or tool on a local computer and connects over a network to the computer running the Oracle instance. In this configuration, called client/server, network software is used to communicate between the user and the Oracle database.
• In a three-tiered connection, the user’s computer communicates over the network to an application or a network server, which is connected through a network to the machine running the Oracle instance. For example, the user runs a browser on a network computer to use an application residing on an NT server that retrieves data from an Oracle database running on a UNIX host.

A session is a specific connection of a user to an Oracle database. The session starts when the user is validated by the Oracle database, and it ends when the user logs out or when there is an abnormal termination. For a given database user, many concurrent sessions are possible if the user logs on from many tools, applications, or terminals at the same time. Except for some specialized database administration tools, starting a database session requires that the Oracle database be available for use.

Good database connection management offers benefits in minimizing the number of connections, thereby increasing scalability.

c) Fast COMMIT and System Change Number

The Oracle Database uses a Fast COMMIT mechanism that guarantees the committed changes can be recovered in case of instance failure. Whenever a transaction commits, the Oracle Database assigns a commit system change number(SCN) to the transaction. The SCN is monotonically incremented and is unique within the database. It is used by the Oracle Database as an internal time stamp to synchronize data and to provide read consistency when data is retrieved from the data files. Using the SCN enables the Oracle Database to perform consistency checks without depending on the date and time of the operating system.
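
You can watch the SCN yourself; a small sketch (note that ORA_ROWSCN is block-level unless the table was created with ROWDEPENDENCIES):

-- the current system change number
select dbms_flashback.get_system_change_number current_scn from dual ;

-- the (approximate) commit SCN of individual rows
select employee_id, ora_rowscn from hr.employees where rownum <= 3 ;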

When a COMMIT is issued, the following steps are performed:
• The server process places a commit record, along with the SCN, in the redo log buffer.
• The background Log Writer process (LGWR) performs a contiguous write of all the redo log buffer entries up to and including the commit record to the redo log files. After this point, the Oracle Database can guarantee that the changes will not be lost even if there is an instance failure.
• The server process provides feedback to the user process about the completion of the transaction.

DBWR eventually writes the actual changes back to disk based on its own internal timing mechanism.

d) Factors to be Managed and Top Oracle Performance Issues on this Architecture

Performance management can be divided into the following four areas. Although the areas are separate, they are also interdependent and require different skill sets;

Schema tuning deals with the physical structure of the data. If an application has inadequate or inappropriate data design, then tuning the physical allocation, providing indexes, or rewriting programs will not overcome the problem.

Application tuning deals with such business requirements as 24/7 availability, OLAP, OLTP, and so on as well as the program modules or applications that implement the functions. Tuning the procedural code for the type of application and tuning the embedded SQL statements are also included in this factor. If an application is well designed, it may still perform badly. A common reason for this is badly written SQL.

Instance tuning deals with the configuration of the Oracle server for memory utilization.

Database tuning deals with managing the physical arrangement of data on the disk.

User expectations: Usually users expect consistent performance on all applications. However, they may accept certain applications (such as OLAP operations) as slower if the project team builds realistic user expectations. An application may include messages to warn operators that they are requesting resource-intensive operations. The best time to do this is before the design and build phases and as part of the transition phase.

Hardware and network tuning deals with performance issues arising from the CPU and from network traffic on all machines supporting the application. The main hardware components are:
CPU: There can be one or more CPUs, and they can vary in processing power from simple CPUs found in hand-held devices to high-powered server CPUs. Sizing of other hardware components is usually a multiple of the CPUs on the system.
Memory: Databases require considerable amounts of memory to cache data and avoid time-consuming disk access.
I/O subsystem: The I/O subsystem can vary between the hard disk on a client PC and high-performance disk arrays. Disk arrays can perform thousands of I/Os each second and provide availability through redundancy in terms of multiple I/O paths and hot pluggable mirrored disks.
Network: The primary concerns with network specifications are bandwidth (volume) and latency (speed).

The top Oracle performance issues reported on Metalink, the Oracle global support site, are;

Bad connection management: The application connects and disconnects for each database interaction. This problem is common with stateless middleware in application servers. Additionally, simultaneous connections from the same client are also a waste of system and network resources.

Poor use of cursors and the shared pool: Not reusing cursors results in repeated parses. If bind variables are not used, then there is hard parsing of all SQL statements. This has an order-of-magnitude impact on performance, and it is totally unscalable. Use cursors with bind variables: open the cursor once and execute it many times. Be suspicious of applications generating dynamic SQL (a small bind variable sketch follows this list).

Bad SQL: Bad SQL is SQL that uses more resources than appropriate for the application requirement. This can be a decision support systems (DSS) query that runs for more than 24 hours or a query from an online application that takes more than a minute. SQL that consumes significant system resources should be investigated for potential improvement. ADDM identifies high-load SQL, and the SQL Tuning Advisor can be used to provide recommendations for improvement.

Use of nonstandard initialization parameters: These might have been implemented based on poor advice or incorrect assumptions. Most systems will give acceptable performance using only the set of basic parameters. In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation.

I/O issues: If you configure your database to use multiple disks by disk space and not I/O bandwidth, then there will be excessive I/O to certain disks and little I/O to others. Frequently and simultaneously accessed objects (a table and its index) should be designed to be stored over different disks.

Long full-table scans: Long full-table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization.

In-disk sorting: In-disk sorts for online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. Disk sorts, by nature, are I/O-intensive and unscalable.

High amounts of recursive SQL: Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. This is unscalable and impacts user response time. Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem.

Schema errors and optimizer problems: In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. Examples of this are missing indexes or incorrect statistics. These errors can lead to suboptimal execution plans and poor interactive user performance. When migrating applications of known performance, you should export the schema statistics to maintain plan stability by using the DBMS_STATS package.
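
As a small illustration of the cursor and bind variable point above, the sketch below contrasts literal SQL with bind variables in PL/SQL (dynamic SQL is used only to make the difference explicit; static SQL in PL/SQL binds automatically):

declare
   l_sal hr.employees.salary%type ;
begin
   -- literals: every iteration is a brand new statement, hard parsed each time
   for i in 100 .. 110 loop
      execute immediate
         'select salary from hr.employees where employee_id = ' || i
         into l_sal ;
   end loop ;

   -- bind variable: one shared cursor, parsed once and then only executed
   for i in 100 .. 110 loop
      execute immediate
         'select salary from hr.employees where employee_id = :empno'
         into l_sal using i ;
   end loop ;
end ;
/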

Continue reading with part 3

References Used:
Oracle® Database Concepts 10g Release 2 (10.2)

