Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

January 28, 2007

Some things never change, be careful

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 9:24 pm

In one of my previous posts I tried to draw attention to "change": technology changes very fast, so keep UNlearning. But some concepts do not change :)

In Oracle, if you use Data Definition Language (DDL) in your code, your transaction logic will be broken, because there is an implicit COMMIT immediately before the parse and execute phases of the DDL;


Begin
   Commit; -- this breaks the transaction; whatever the DDL's outcome, this commit stands
   Parse DDL; -- syntax and privileges are checked
   Begin
      Execute DDL;
      Commit;
   Exception
      When Others Then
        Rollback;
   End;
End;

Code listing 65a : Using DDL in a transaction Example
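Since the listing itself is hosted externally, here is a minimal sketch of the behavior; the table names are hypothetical. The uncommitted INSERT is silently committed by the DDL, so the ROLLBACK that follows has nothing left to undo;

CREATE TABLE accounts (id NUMBER, balance NUMBER);

INSERT INTO accounts VALUES (1, 100); -- not committed yet
CREATE TABLE tmp_t (x NUMBER);        -- DDL: the implicit COMMIT fires first
ROLLBACK;                             -- too late, nothing to roll back
SELECT COUNT(*) FROM accounts;        -- the inserted row survived

The same applies to DDL issued from PL/SQL with EXECUTE IMMEDIATE.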

Another important concept in Oracle is that a full table scan always reads all the blocks below the high-water mark of the related segment.

Code listing 65b : Full Table Scan and High Water Mark Relation Example
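The listing is hosted externally; a rough sketch of the effect, assuming a table populated from ALL_OBJECTS, might look like this. The cost of the full table scan tracks the high-water mark, not the row count;

CREATE TABLE big_t AS SELECT * FROM all_objects;

SET AUTOTRACE TRACEONLY STATISTICS
SELECT COUNT(*) FROM big_t; -- note the consistent gets

DELETE FROM big_t;          -- removes the rows, not the high-water mark
COMMIT;
SELECT COUNT(*) FROM big_t; -- roughly the same consistent gets as before

ALTER TABLE big_t MOVE;     -- rebuilding the segment resets the high-water mark
SELECT COUNT(*) FROM big_t; -- consistent gets drop dramatically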

Like DDL, you must also be careful with DELETEs in your applications. Especially after huge DELETEs, you need to perform a CTAS or a MOVE operation on the table if you need full table scan (FTS) performance. Or, better still, use partitioning instead of these DELETE operations :)

The use of SQL inside triggers may result in excessive parsing; move the SQL in the triggers into PL/SQL procedures and call them from the triggers;

Code listing 65c : Move SQL out of the trigger Example
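As a sketch only, with made-up table and procedure names, moving the trigger's SQL into a stand-alone procedure looks like this; the SQL then lives in one reusable PL/SQL unit instead of being compiled into the trigger body;

CREATE OR REPLACE PROCEDURE log_change(p_id IN NUMBER) AS
BEGIN
   -- the SQL lives here, in one shared program unit
   INSERT INTO audit_log(id, change_date) VALUES (p_id, SYSDATE);
END;
/

CREATE OR REPLACE TRIGGER t_aud_trg
AFTER UPDATE ON t
FOR EACH ROW
BEGIN
   log_change(:new.id); -- the trigger body only calls the procedure
END;
/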

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used :
Effective Oracle by Design by Thomas Kyte Chapter 5- Statement Processing, Page 264
Effective Oracle by Design by Thomas Kyte Chapter 8- Effective SQL, Page 453
Effective Oracle by Design by Thomas Kyte Chapter 5- Statement Processing, Page 299


January 23, 2007

Oracle Business Intelligence and Extraction, Transformation, Loading(ETL) Part 2

Filed under: Oracle Business Intelligence — H.Tonguç Yılmaz @ 9:58 am

In part 1 I mentioned the Parallel execution option of Oracle.

2- Introduction to Partitioning

Partitioning is defined as breaking a large problem down into smaller, manageable pieces: the concept of Divide and Conquer. So with Oracle, partitioning refers to "dividing tables and indexes into manageable pieces."

The partitioning feature is used;
To increase availability; this derives from the fact that partitions are independent entities,
To ease administration burdens; without partitions these operations are generally massive INSERT, UPDATE and DELETE statements (lots of redo & rollback I/O), such as scheduled, repetitive archiving of data,
To enhance DML and query performance; with partitioning, during parallel DML Oracle uses many processes to perform your INSERT, UPDATE or DELETE instead of a single process.

Also storing partitions in separate tablespaces enables you to;

– Reduce the possibility of data corruption in multiple partitions
– Back up and recover each partition independently
– Control the mapping of partitions to disk drives (important for balancing I/O load)

As of Oracle 9i Release 2, Range, Hash, List or Composite (Range-Hash or Range-List) can be chosen as the partitioning method. The following demonstrations show the benefits of partitioning;

Code listing 64a : Convert A Partition Into A Stand-alone Table Example and Split Partition Example

Code listing 64b : Partition Elimination Example

Code listing 64c : OLTP and LOCAL versus GLOBAL index partitioning Example
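For readers without access to the listings, here is a minimal range-partitioning sketch with a hypothetical sales table;

CREATE TABLE sales (
   sale_date DATE,
   amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
   PARTITION p2006 VALUES LESS THAN (TO_DATE('01-01-2007','DD-MM-YYYY')),
   PARTITION p2007 VALUES LESS THAN (TO_DATE('01-01-2008','DD-MM-YYYY'))
);

-- a predicate on the partition key lets the optimizer eliminate p2006
SELECT SUM(amount) FROM sales
 WHERE sale_date >= TO_DATE('01-01-2007','DD-MM-YYYY');

-- old data goes away as a partition, instead of a massive DELETE
ALTER TABLE sales DROP PARTITION p2006;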

As a conclusion, with Oracle we have 5 table partitioning methods; Range, Hash, List, Range-Hash, Range-List, and 3 index partitioning methods; globally non-partitioned, globally partitioned, and locally partitioned. Important considerations before partitioning a table or an index are;
– Is the table the "right" size, and how volatile is its data?
– What are your maintenance considerations?
– How do queries filter on this data, and how should the partition key be chosen?

To benefit from the partitioning option, like the parallelism mentioned in part 1, it is vital that you have a goal you want to achieve with it. Also, "Partitioning <> fast=true".

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.2.0

References Used :
http://www.psoug.org/reference/partitions.html
Oracle® Database Concepts 10g Release 2 (10.2) Chapter 18 Partitioned Tables and Indexes
Oracle® Database Administrator’s Guide 10g Release 2 (10.2) Chapter 17 Managing Partitioned Tables and Indexes
Drop parent partition post by Jonathan Lewis
Effective Oracle by Design by Thomas Kyte Chapter 3-Architectural Decisions, Page 185

January 20, 2007

Optimizer debug trace event – 10053 trace file

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 7:53 pm

In the first part, Understanding The Fundamental Performance Equation, I mentioned the importance of the fundamental performance equation and how to produce SQL Trace files and analyze them with the tkprof utility. A good understanding of SQL processing is essential for writing optimal SQL statements, and these trace files help us understand what happened behind the scenes during execution.

I always wondered how and why the Oracle optimizer decides to choose one execution plan over another. Sometimes Oracle seems to ignore something you think it shouldn't, maybe a hint or a join condition. Usually the best way to work out what is going on is to look at the optimizer debug trace event, the 10053 trace file. And if you are hitting a CBO bug, this trace file is the best way to show Oracle Support something unexpected happening.

I tried to create a small demonstration for a better understanding of this event dump; the following script creates a table with a primary key and a unique key constraint, and the CBO decides to use the unique key constraint over the primary key;

Code listing 63 : Optimizer debug trace event, 10053 trace file
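A common way to enable the event for a session is sketched below. Note that the optimizer only writes a 10053 trace during a hard parse, so the statement must not already be in the shared pool; the table name here is hypothetical;

ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

SELECT /* force a hard parse, e.g. with a unique comment */ *
  FROM t WHERE x = 1;

ALTER SESSION SET EVENTS '10053 trace name context off';

-- the trace file is written to the directory below
SELECT value FROM v$parameter WHERE name = 'user_dump_dest';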

The trace file we produce has lots of information, as expected; all session optimizer parameter values and every table, index and possible access path statistic, etc. Some of the abbreviations used may be hard to understand, so here is a quick explanation of what they mean:

CDN Cardinality, a count of rows
NBLKS Number of blocks
AVG_ROW_LEN The computed average row length
COL# Column numbers in the table the index is on
LVLS Number of levels in the B*Tree
#LB Number of leaf blocks
#DK Number of distinct keys
LB/K Number of leaf blocks per key value on average
DB/K Number of base table data blocks per key value; how many table accesses (logical I/O’s) would be made using an equality predicate on this index. Directly related to the cluster factor below.
CLUF Clustering factor of this index; a measure of how sorted a base table is with respect to this index.
resc Serial cost
resp Parallel cost

The 10053 trace file can be big if you are working with lots of joined tables, since every possible combination will be dumped to the trace file. This trace event is not documented, but with a Google search you can easily find several good resources. Cost-based query optimization is what we must call real computer science, a hard thing to do, and with every new release Oracle improves the code dramatically. So let's enjoy it :)

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used :
Cost Based Oracle Fundamentals by Jonathan Lewis Chapter 14-The 10053 Trace File, Page 403
10053 thread on Asktom
“A Look under the Hood of CBO – the 10053 Event” by Wolfgang Breitling
Note:338137.1 – CASE STUDY: Analyzing 10053 Trace Files

Direct path inserts, nologging option and index cost

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 12:46 pm

This is something that confuses a lot of us; a direct-path insert will only produce minimum redo (not no redo) if the database is in NOARCHIVELOG mode or you are doing the operation on a table marked as NOLOGGING.

Some other important points are;
– "with all insert statements the append hint will produce minimum redo" is not true; this feature actually works only with bulk inserts (INSERT INTO ... SELECT ... FROM),
– only one session at a time can direct-path insert into a table,
– with append you can NOT use the free space below the high-water mark,
– during a direct load, if there are indexes on the table and you set them UNUSABLE and rebuild them after the load, the redo produced is minimized dramatically.

Following is a demonstration of these behaviours with the autotrace feature;

Code listing 62 : Direct inserts with append hint and Nologging option
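The listing is linked externally; a minimal sketch of the comparison, with hypothetical table names, could be;

ALTER TABLE t NOLOGGING;

SET AUTOTRACE TRACEONLY STATISTICS
-- conventional insert: full redo generation
INSERT INTO t SELECT * FROM big_t;
COMMIT;

-- direct-path insert: minimum redo on the NOLOGGING table
INSERT /*+ APPEND */ INTO t SELECT * FROM big_t;
COMMIT; -- required before the same session can even query t again

Compare the "redo size" statistic reported by autotrace for the two inserts.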

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used :
Effective Oracle by Design by Thomas Kyte Chapter 2-Your performance toolkit, Page 116

January 19, 2007

What is the overhead of Referential Integrity on the database

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 10:27 am

A long-time debate, and a myth, came in front of me again today; "Using integrity constraints in the database makes applications slower"

Please answer these two questions if you believe in this myth;
– Is your application the only one that will access this data?
– How can you really believe you can do these checks in the client or middleware better than the database, where your data lives?

Here is a simple test case, again let the numbers speak;

Code listing 61 : Cost of Referential Integrity in the database
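The listing is hosted externally; as a sketch of the kind of test case used, with made-up names, you can time the same bulk insert into a constrained and an unconstrained child table;

CREATE TABLE dept (deptno NUMBER PRIMARY KEY);
INSERT INTO dept VALUES (10);
CREATE TABLE emp_fk   (empno NUMBER, deptno NUMBER REFERENCES dept);
CREATE TABLE emp_nofk (empno NUMBER, deptno NUMBER);

SET TIMING ON
INSERT INTO emp_fk   SELECT rownum, 10 FROM all_objects; -- FK checked per row
INSERT INTO emp_nofk SELECT rownum, 10 FROM all_objects; -- no check at all

Comparing the elapsed and CPU times of the two inserts gives the per-row cost of the constraint.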

The result summary is as follows: (22.00-16.23)/196161 ≈ 0.0000294 CPU seconds per row in my test, worth it for the integrity of your corporate data, believe me..

Also, Mr. Kyte advises: "Are the consultants who are advising getting rid of the database constraints paid by the line of code they will produce?" :)

Remember, applications and technologies (COBOL, client-server, etc.) come and go, but your organization's data lives forever!

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.2.0

References Used :
Effective Oracle by Design by Thomas Kyte Chapter 1-The right approach to building applications, Page 24

Basic Analytics versus Generic SQL

Filed under: Oracle Performance — H.Tonguç Yılmaz @ 7:13 am

Some days ago we found a new comparison study on psoug and sent a mail to OracleTURK to discuss these differences, especially with people who also have MS SQL knowledge.

For me, a database which runs on just one specific platform is already out of the game. Oracle runs on;
– AIX 5L Based Systems (64-Bit)
– hp OpenVMS, hp Tru64 UNIX, hp-ux Itanium, hp-ux PA-RISC (64-Bit)
– IBM z/OS (OS/390), IBM zSeries Based Linux
– Linux Itanium, Linux on POWER, Linux x86, Linux x86-64
– Microsoft Windows, Microsoft Windows (64-Bit) on Intel Itanium, Microsoft Windows (x64)
– Solaris Operating System (SPARC 64-Bit), Solaris Operating System (x86), Solaris Operating System (x86-64)

But since MS SQL is one of the largest in the market, I was very surprised to see these differences;

– No Bitmap Index,
– No Materialized Views,
– No Model Functions,
– No Hierarchical Operators,
– Limited Analytic Functions

I always thought database vendors got very close to each other over time. I know there are fundamental differences like locking mechanisms and NULL behavior, but these differences in the features the vendors supply really surprised me.

So I wanted to experience what my daily life without some basic Oracle analytic functions would be like; here are some results of my test, let the numbers speak;

Code listing 60 : Basic Analytics versus Generic Sql
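The listing itself is external; as a sketch of the kind of comparison it makes, here is a running total written both ways, assuming a SCOTT-like emp table;

-- analytic version: a single pass over the table
SELECT empno, sal,
       SUM(sal) OVER (ORDER BY empno) running_total
  FROM emp;

-- generic version: the correlated subquery re-scans emp for every row
SELECT e.empno, e.sal,
       (SELECT SUM(e2.sal) FROM emp e2 WHERE e2.empno <= e.empno) running_total
  FROM emp e;

The work done by the generic version grows roughly with the square of the row count, which is what the results below show.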

Rows in table  CPU Analytics  CPU Generic  Elapsed Analytics  Elapsed Generic  Consist. Gets Analytics  Consist. Gets Generic
1,000          95             206          00:00.06           00:00.12         4                        16
10,000         142            791          00:00.53           00:05.96         17                       68
100,000        624            64311        00:05.34           10:41.09         171                      122192

Ok, nothing unexpected here. Someone may suggest these functionalities could be written in T-SQL etc., but be careful: the above results are proof of the universal mantra of data-intensive processing; "You should do it in a single SQL statement if at all possible". No function implemented in PL/SQL or T-SQL etc. outside the SQL engine can beat it in terms of resource consumption, and as a result, response timings.

Just after my conclusions, good news from Mr. Thomas Kyte; "..been thinking about writing a book just about analytics (but wait'll you see the SQL Model clause in 10g).." I am sure that book will be great :)

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.2.0

References Used :
Effective Oracle by Design by Thomas Kyte Chapter 1-The right approach to building applications, Page 15

January 16, 2007

There are more than one INDEX Type Part 2

Filed under: Oracle How To,Oracle Performance — H.Tonguç Yılmaz @ 8:02 am

In Part 1, after a brief introduction to Oracle's default indexing feature, B*Tree indexes, I mentioned Virtual / NoSegment indexes.

2- Reverse Key Index and Index Range Scan Problem

Using a reverse key index on a column fed by a sequence spreads sequential values across a wider range of leaf blocks, and the problem of a single hot block is removed because the index entries are stored in reverse byte order.

These indexes are designed to eliminate index hot spots in insert-intensive applications and are excellent for insert performance, but they are limited in that they cannot be used for index range scans. Here is a demonstration of this problem;

Code listing 59-a : Reverse Key Index and Index Range Scan
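A minimal sketch of the limitation, with hypothetical names;

CREATE TABLE t_seq (id NUMBER);
CREATE INDEX t_seq_idx ON t_seq(id) REVERSE;

-- equality probes can still use the reverse key index
SELECT * FROM t_seq WHERE id = 42;

-- a range predicate cannot: consecutive values are no longer stored
-- next to each other, so an index range scan is not possible
SELECT * FROM t_seq WHERE id BETWEEN 40 AND 50;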

3- How to disable an index

Especially before a bulk operation on your tables you might prefer to disable triggers, indexes and constraints for performance;

SELECT table_name, index_name,
'ALTER INDEX ' || index_name || ' UNUSABLE;' disable,
'ALTER INDEX ' || index_name || ' REBUILD NOLOGGING;' enable
FROM user_indexes
WHERE ...

SELECT table_name, trigger_name,
'ALTER TRIGGER ' || trigger_name || ' DISABLE;' disable,
'ALTER TRIGGER ' || trigger_name || ' ENABLE;' enable
FROM user_triggers
WHERE ...

SELECT table_name, constraint_name,
'ALTER TABLE ' || table_name || ' DISABLE CONSTRAINT ' || constraint_name || ';' disable,
'ALTER TABLE ' || table_name || ' ENABLE CONSTRAINT ' || constraint_name || ';' enable
FROM user_constraints
WHERE ...

But after the migration you may experience side effects like dirty data, and you may not be able to re-enable some of your constraints. Here is a demonstration of how to disable an index;

Code listing 59-b : How to Disable an Index
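As a sketch of the disable and rebuild cycle, with a hypothetical index name;

ALTER INDEX t_idx UNUSABLE;
ALTER SESSION SET skip_unusable_indexes = TRUE; -- DML now ignores the unusable index

-- ... bulk load here ...

ALTER INDEX t_idx REBUILD NOLOGGING;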

Continue reading with Part 3

Testing Information : the scripts mentioned are tested on Oracle Database 10g Enterprise Edition Release 10.2.0.2.0

References Used :
“Reverse Key” thread on askTom
Oracle® Database Performance Tuning Guide 10g Release 2 (10.2) Chapter 2 Designing and Developing for Performance
The Oracle ™ Users’ Co-Operative FAQ : “Why would a reverse index be useful when the leading column of the index is generated from a sequence?”

January 15, 2007

Oracle Express Edition(XE) and Application Express Part 1

Filed under: Oracle Express Edition(XE) and Application Express — H.Tonguç Yılmaz @ 6:55 am

Oracle Database 10g Express Edition (Oracle Database XE) is an entry-level, small-footprint database based on the Oracle Database 10g Release 2 code base that is free to develop, deploy, and distribute; fast to download; and simple to administer. Oracle announced XE in 2005 as a free and limited Oracle 10gR2 database. Oracle Database XE can be installed on a host machine of any size with any number of CPUs (one database per machine), but XE will store up to 4GB of user data, use up to 1GB of memory, and use one CPU on the host machine. Also, some features like Partitioning, Bitmap Indexes and Materialized View Query Rewrite are not included, so check the list of included features in the License Documentation before making a decision for XE.

Also, be sure to register for the exclusive Oracle Database 10g Express Edition Discussion Forum hosted by Oracle expert Tom Kyte. When I asked Mr. Tom Kyte about XE's future, he said; "there will be future releases, but only with the next major release of the database. Security patches may cause us to re-release XE, but it'll be the patched version as it exists now. APEX is a pretty flexible environment, I know the guys that work on it, they are very careful not to have too many complex dependencies on the database version." So, since Oracle 11g has been announced, we will be enjoying a new XE version soon.

Oracle Application Express (Oracle APEX, previously named Oracle HTML DB) is a free software development environment based on the Oracle database. It was also previously known as "Project Marvel"; WebDB was introduced in 1997 with Oracle 8i, and Oracle's browser-based development and deployment tools eventually matured into today's Portal and ApEx. It allows a very fast development cycle for creating web-based applications. This viewlet is a demonstration of turning an Excel sheet into a web application in seconds.

Oracle Application Express can be installed on an Oracle 9.2 or higher database. In January 2006 Oracle renamed HTML DB to Oracle Application Express. Two of the best-known applications developed in HTML DB are askTom by Thomas Kyte and Best Practice PL/SQL by Steven Feuerstein.

Apex comes standard with the Oracle XE database; no need for another installation. The URL used to access the XE database homepage typically takes one of the following forms:

Here are the instructions for how you can check this information from SQL*Plus;


SQL> conn hr/hr@xe
Connected.

SQL> select dbms_xdb.gethttpport, dbms_xdb.getftpport from dual;

GETHTTPPORT GETFTPPORT
———– ———-
8080 0

SQL> SET SERVEROUTPUT ON;
SQL> DECLARE
2    l_dad_names DBMS_EPG.varchar2_table;
3 BEGIN
4    DBMS_OUTPUT.put_line('DADs');
5    DBMS_OUTPUT.put_line('==========');
6
7    DBMS_EPG.GET_DAD_LIST (l_dad_names);
8
9    FOR i IN 1 .. l_dad_names.count LOOP
10      DBMS_OUTPUT.put_line(l_dad_names(i));
11   END LOOP;
12 END;
13 /

DADs
==========
APEX

PL/SQL procedure successfully completed.
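If the default port conflicts with another service on your machine, the embedded gateway's HTTP port can be changed with the same DBMS_XDB package; a quick sketch, using port 8090 as an example;

EXEC DBMS_XDB.SETHTTPPORT(8090);
SELECT dbms_xdb.gethttpport FROM dual; -- should now report 8090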

Also, if you want to open your local ApEx applications on an XE instance to remote clients, you should follow the instructions given in the Oracle® Database Express Edition Installation Guide. Apex comes with some great Demonstration Applications which help understanding. There is also an online development center, free to request a workspace and start building internet applications. With Apex you can quickly start with examples and learn; try this Survey Application for example.

The latest Application Express patch set, 2.2.1, to be applied to an Application Express 2.2 instance, can be downloaded from MetaLink (Patch Number 5500033) and is about 2.6MB. Due out in the first half of 2007, the next major release of Oracle Application Express will be version 3.0.

So if you are after a browser-based, wizard-driven rapid application development (RAD) tool on the Oracle database, then Oracle XE and ApEx, at no additional charge, are waiting for you. Enjoy the qualities of a personal database like;

  • Productivity
  • Ease of Use
  • Flexibility

with the qualities of an enterprise database like;

  • Security
  • Integrity
  • Scalability
  • Availability
  • Portability

Continue reading with Part 2

For more information please check these links

Advanced Programming Examples with ApEx

The list of Apex Blogs I suggest you follow

January 14, 2007

Oracle, Linux and Shell Scripting Part 1

Filed under: Oracle, Linux and Shell Scripting — H.Tonguç Yılmaz @ 7:51 am

Ok, here starts another series; this road targets "power", which I may summarize as Oracle on Linux. Linux refers to any Unix-like computer operating system that uses the Linux kernel.

Finnish university student Linus Torvalds originally used Minix on his own computer in 1991 as a hobby while attending the University of Helsinki. Minix is a simplified Unix-like system written by Andrew S. Tanenbaum for teaching operating system design. However, Tanenbaum did not permit others to extend his operating system, leading Torvalds to create a replacement for Minix.

Oracle released the first commercial database on Linux in 1998. Today, Oracle is the only major software vendor to collaborate with and provide first-line support for Red Hat, Novell, and Asianux. And all Oracle products are available on Linux.

Oracle Announced Enterprise-Class Support for Linux at Openworld 2006

Oracle Unbreakable Linux is a support program of Oracle that provides enterprises with industry-leading global support for Linux. Oracle also now offers Enterprise Linux software as a free download. It is all currently explained in more detail on the Unbreakable Linux Network.

Oracle Maintains Top Spot in RDBMS Market, Increases Share in Linux Segment

According to this Forbes article, Gartner’s 2005 RDBMS marketshare study showed Oracle has the biggest share in this market. Oracle enjoys about 50% of the total RDBMS market and 55% on distributed systems. Both IBM and Oracle posted similar year-over-year growth in their database businesses in 2005, with Oracle growing slightly faster than IBM.

Also, Oracle extended its lead in the RDBMS on Linux market. While Oracle seems to have lost share on the Windows platform, it remains the leader on the fastest-growing Linux platform where the market grew by 84% and Oracle consolidated its top spot by increasing its share to 81% from 76%.

Here are my favorite resources on Linux and Oracle

You should read these articles before we proceed; they are really beautiful, please check them

So let's take some time for these resources :)

January 13, 2007

Oracle Concepts and Architecture – Part 3

Filed under: Oracle Concepts and Architecture — H.Tonguç Yılmaz @ 6:29 am

In part 1 I wrote on the subjects below;

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

In part 2 I wrote on the subject below;

8- Oracle Database Architecture: Overview

In this post I want to discuss Atomicity and NULL values with Oracle.

9- How Atomicity is implemented on Oracle

One of the primary tests for DBMS reliability is what's known as the ACID test. ACID stands for Atomicity, Consistency, Isolation and Durability. ACID-compliant systems are as close as you can get to guaranteed not to lose your data: essentially, as long as your database files are intact, you are guaranteed that your data is consistent. The word "atomic" means indivisible, and each transaction is atomic: when a transaction that changes data occurs, either the whole transaction occurs, or none of it occurs.

Atomicity is implemented in Oracle (and most other DBMSs) by the commit and rollback statements. Oracle does not return from a COMMIT statement until the redo log changes corresponding to that commit have been physically written to disk, not just to a buffer. Durability, the final property in the ACID test, means that any changes made by committed transactions are persistent: they are permanent, and if a failure occurs, the information in the database will reflect all transactions committed before the failure.

I love the atomicity examples in the transactions chapter of Mr. Thomas Kyte's book; they really improve understanding;


DROP TABLE t2 PURGE ;
CREATE TABLE t2 ( count INT );
INSERT INTO t2 VALUES ( 0 );

DROP TABLE t1 PURGE ;
CREATE TABLE t1 ( x INT CHECK ( x>0 ) );
CREATE TRIGGER t1_trigger
BEFORE INSERT OR DELETE ON t1
FOR EACH ROW
BEGIN
IF ( INSERTING ) THEN
UPDATE t2 SET count = count + 1;
ELSE
UPDATE t2 SET count = count - 1;
END IF;
DBMS_OUTPUT.PUT_LINE( 'I fired and updated ' || SQL%ROWCOUNT || ' rows' );
END;
/

SET SERVEROUTPUT ON;
INSERT INTO t1 VALUES ( 1 );
I fired and updated 1 rows

1 row created.

INSERT INTO t1 VALUES ( -1 );
I fired and updated 1 rows
INSERT INTO t1 VALUES ( -1 )
*
ERROR at line 1:
ORA-02290: check constraint (HR.SYS_C005726) violated

SELECT * FROM t2;

COUNT
----------
1

Initially the row in t2 was 0; after the first successful insert, the trigger incremented the value in t2 to 1. With the second insert an ORA-02290 error was received; since the trigger is part of the atomic statement, its update is also rolled back along with the second insert, and the value in table t2 remains 1. In this second example we will use a procedure to show the same behavior;


CREATE OR REPLACE PROCEDURE p AS
BEGIN
INSERT INTO t1 VALUES ( 1 );
INSERT INTO t1 VALUES (-1 );
END;
/

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
1

BEGIN
p;
END;
/
I fired and updated 1 rows
I fired and updated 1 rows
BEGIN
*
ERROR at line 1:
ORA-02290: check constraint (HR.SYS_C005726) violated
ORA-06512: at "HR.P", line 5
ORA-06512: at line 2

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
1

This time the first insert is also rolled back, since the procedure call is atomic. Ok, this time let's see how exception handling affects the behaviour;


BEGIN
p;
EXCEPTION
WHEN OTHERS THEN NULL; -- a black hole
END;
/
I fired and updated 1 rows
I fired and updated 1 rows

PL/SQL procedure successfully completed.

SELECT * FROM t1;

X
----------
1

SELECT * FROM t2;

COUNT
----------
2

Since the constraint exception is handled, Oracle does not roll back this time, and the value in table t2 is 2. Be careful with your exception handling strategy; it gives flexibility to the developer, but it may break atomicity if used carelessly and cause corrupted data.

And especially a "when others" exception block with a "null" implementation must only be used for testing purposes; it is as dangerous as a bomb in your code..

Another great example of atomicity is this update statement;


DROP TABLE t3 PURGE ;
CREATE TABLE t3 ( x INT UNIQUE );
INSERT INTO t3 VALUES ( 1 );
INSERT INTO t3 VALUES ( 2 );

UPDATE t3 SET x = x + 1;
commit ;

select * from t3;

X
----------
2
3

Because of the unique constraint, Oracle would not be able to do this update if the violation were checked after each row instead of after the whole statement.

10- NULL values and Oracle

Allowing NULL values into your columns introduces a whole new degree of uncertainty into your database. The primary reason is that comparing NULL to a value results in neither TRUE nor FALSE; in fact, we may say it is UNKNOWN. Here is a simple demonstration;


set serveroutput on
create or replace procedure IsNotApple(b varchar2) as
begin
if b !='Apple' then
dbms_output.put_line('Not Apple');
else
dbms_output.put_line('Apple');
end if;
end;
/

begin
isNotApple(Null);
end;
/
Apple

PL/SQL procedure successfully completed

Oracle implements NULL values as ANSI SQL requires them to be implemented. But databases are different; experience in one will only partly carry over to another. You must be ready for fundamental differences, such as locking, as well as some very minor differences, such as the treatment of NULLs in this example, which can have unexpected impacts on your applications. Being aware of the database, how it works and how its features are implemented is the only way to overcome these issues; again, another "read the documentation" conclusion :)


drop table tab_null_test purge ;
create table tab_null_test ( team_no number primary key, team_name varchar2(64) unique, availability varchar2(1));
insert into tab_null_test values (1, 'Bir', NULL);
insert into tab_null_test values (2, 'Iki', 'Y');
insert into tab_null_test values (3, 'Uc', 'N');
commit;
analyze table tab_null_test compute statistics for table for all columns;

set null ^ -- will print "^" instead of NULL
select * from tab_null_test ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^
2 Iki Y
3 Uc N

select count(team_no) from tab_null_test ;

COUNT(TEAM_NO)
--------------
3

select count(availability) from tab_null_test ;

COUNT(AVAILABILITY)
-------------------
2

select * from tab_null_test where availability <> 'Y' ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
3 Uc N

select * from tab_null_test where nvl(availability, 'N') <> 'Y' ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^
3 Uc N

select * from tab_null_test where availability = NULL ;

no rows selected

select * from tab_null_test where availability <> NULL ;

no rows selected

select * from tab_null_test where availability IS NULL ;

TEAM_NO TEAM_NAME A
---------- ---------------------------------------------------------------- -
1 Bir ^

Also, if you need to find NULL values with B*Tree indexes in Oracle, you must be careful;


drop table tab_null_test purge;
create table tab_null_test ( x int, y int );
create unique index unique_tab_null_test on tab_null_test(x,y);
insert into tab_null_test values ( 1, 1 );
insert into tab_null_test values ( 1, NULL );
insert into tab_null_test values ( NULL, 1 );
insert into tab_null_test values ( NULL, NULL );
analyze table tab_null_test compute statistics for table for all columns;
analyze index unique_tab_null_test validate structure;
select name, lf_rows from index_stats;

NAME LF_ROWS
------------------------------ ----------
UNIQUE_TAB_NULL_TEST 3

The table has four rows, whereas the index only has three. The first three rows, where at least one of the index key elements was not Null, are in the index. The last row, with (NULL, NULL) is not in the index.


insert into tab_null_test values ( NULL, NULL );

1 row created.

insert into tab_null_test values ( NULL, 1 );
*
ERROR at line 1:
ORA-00001: unique constraint (HR.UNIQUE_TAB_NULL_TEST) violated

insert into tab_null_test values ( 1, NULL );
*
ERROR at line 1:
ORA-00001: unique constraint (HR.UNIQUE_TAB_NULL_TEST) violated

select x, y, count(*)
from tab_null_test group by x,y
having count(*) > 1;

X Y COUNT(*)
---------- ---------- ----------
^ ^ 2

In Oracle, (NULL, NULL) is indeed not considered equal to (NULL, NULL) by a unique constraint. That is something to consider: each unique constraint should have at least one NOT NULL column in order to be truly unique.


set autotrace traceonly explain
select * from tab_null_test where x is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 3396731321

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TAB_NULL_TEST | 3 | 6 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X" IS NULL)

set autotrace off
drop table tab_null_test purge;
create table tab_null_test ( x int, y int NOT NULL );
create unique index unique_tab_null_test on tab_null_test(x,y);
insert into tab_null_test values ( 1, 1 );
insert into tab_null_test values ( NULL, 1 );
analyze table tab_null_test compute statistics for all columns;
set autotrace traceonly explain
select * from tab_null_test where x is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 2452341312
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| UNIQUE_TAB_NULL_TEST | 1 | 4 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("X" IS NULL)
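The index range scan above works because Y is NOT NULL, so no key of the index is entirely NULL. When you cannot add a NOT NULL column to the key, a commonly used alternative (a sketch here, not from the original post) is to append a constant expression to the index, which forces otherwise all-NULL keys into the B*Tree:

```sql
-- Hypothetical table: X is nullable and there is no NOT NULL column to pair it with
create table t ( x int );

-- The constant 0 makes every index key non-NULL as a whole,
-- so rows where X IS NULL are still stored in the index
create index t_x_idx on t ( x, 0 );

-- This predicate can now be satisfied with an index range scan on T_X_IDX
select * from t where x is null;
```

The constant costs almost nothing in storage and lets "IS NULL" predicates avoid a full table scan.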

NULLs can also affect the outcome of analytic functions, especially when you use a descending sort. By default, Oracle sorts NULLs as if they were greater than any other value, so with ORDER BY ... DESC they come first;


set autot off
drop table emp purge ;
create table emp as select employee_id empno,
first_name||' '||last_name ename, commission_pct comm from employees
order by employee_id ;
analyze table emp compute statistics ;
select ename, comm from emp order by comm desc;

ENAME COMM
---------------------------------------------- ----------
Steven King ^
Neena Kochhar ^
Lex De Haan ^
Alexander Hunold ^
Bruce Ernst ^
David Austin ^
Valli Pataballa ^
Diana Lorentz ^
Nancy Greenberg ^
Daniel Faviet ^
John Chen ^
...
William Smith ,15
Oliver Tuvault ,15
Mattea Marvins ,1
David Lee ,1
Sundar Ande ,1
Charles Johnson ,1
Sundita Kumar ,1
Amit Banda ,1

107 rows selected.

select ename, comm, dr
from (select ename, comm, dense_rank() over (order by comm desc) dr from emp)
where dr <= 5 order by comm;

ENAME COMM DR
---------------------------------------------- ---------- ----------
David Bernstein ,25 5
Peter Hall ,25 5
Clara Vishney ,25 5
Sarath Sewall ,25 5
Alyssa Hutton ,25 5
Lisa Ozer ,25 5
Peter Tucker ,3 4
Lindsey Smith ,3 4
Ellen Abel ,3 4
Gerald Cambrault ,3 4
Alberto Errazuriz ,3 4
Karen Partners ,3 4
Louise Doran ,3 4
Patrick Sully ,35 3
Janette King ,35 3
Allan McEwen ,35 3
John Russell ,4 2
Steven King ^ 1
Neena Kochhar ^ 1
Lex De Haan ^ 1
Alexander Hunold ^ 1
Bruce Ernst ^ 1
...

While this may technically be “correct”, it probably is not what you really wanted: either NULLs should not be considered at all, or they should be interpreted as “small” in this case. So, we can either remove NULLs from consideration, using “where comm is not null”:


select ename, comm, dr
from (select ename, comm, dense_rank() over (order by comm desc) dr
from emp where comm is not null)
where dr <= 5 order by comm desc;

ENAME COMM DR
---------------------------------------------- ---------- ----------
John Russell ,4 1
Janette King ,35 2
Allan McEwen ,35 2
Patrick Sully ,35 2
Karen Partners ,3 3
Alberto Errazuriz ,3 3
Peter Tucker ,3 3
Gerald Cambrault ,3 3
Louise Doran ,3 3
Ellen Abel ,3 3
Lindsey Smith ,3 3
...

or we can use the NULLS LAST extension to the ORDER BY clause:


select ename, comm, dr
from (select ename, comm,dense_rank() over (order by comm desc nulls last) dr
from emp)
where dr <= 5 order by comm desc;

ENAME COMM DR
---------------------------------------------- ---------- ----------
John Russell ,4 1
Janette King ,35 2
Patrick Sully ,35 2
Allan McEwen ,35 2
Louise Doran ,3 3
Ellen Abel ,3 3
Peter Tucker ,3 3
Gerald Cambrault ,3 3
Lindsey Smith ,3 3
Karen Partners ,3 3
Alberto Errazuriz ,3 3
...

It should be noted that NULLS LAST works on a normal ORDER BY as well; its use is not limited to analytic functions.
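For example, against the same EMP table, without any analytic function:

```sql
-- NULL commissions now sort after every real value,
-- even though the sort itself is descending
select ename, comm
from emp
order by comm desc nulls last;
```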

Aggregate functions like SUM, COUNT and AVG are also affected by NULL values: they simply ignore NULLs (COUNT(*) being the exception, since it counts rows). Depending on what you need, you may prefer to use the NVL function;


SQL> select avg(salary), avg(nvl(salary, 0)) from hr.employees ;

AVG(SALARY) AVG(NVL(SALARY,0))
----------- ------------------
9304,822429   9304,82242990654

SQL> update hr.employees set salary = NULL where employee_id = 100 ;

1 row updated

SQL> select avg(salary), avg(nvl(salary, 0)) from hr.employees ;

AVG(SALARY) AVG(NVL(SALARY,0))
----------- ------------------
9066,566037   8981,83177570093
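COUNT shows the same effect; a quick sketch against the HR sample schema (the gap between the first two counts is exactly the number of employees whose commission_pct is NULL):

```sql
-- COUNT(*) counts rows; COUNT(commission_pct) silently skips NULLs;
-- wrapping the column with NVL makes every row count again
select count(*), count(commission_pct), count(nvl(commission_pct, 0))
from hr.employees;
```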

Continue reading with part 4

Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0

References Used :
“Null” thread on AskTom
Expert One-on-One Oracle by Thomas Kyte — Database Independence?, Page 57; Indexes and Nulls, Page 378; NULLS and Sorting, Page 731
http://www.databasedesign-resource.com/null-values-in-a-database.html
Nulls: Nothing to Worry About By Lex de Haan and Jonathan Gennick
Title is NULL by APC
