Extract(Read) > Transform(Test&Apply) > Load(Learn) Blog

October 9, 2007

Oracle Best Practices Part 5

Filed under: Oracle Best Practices — H.Tonguç Yılmaz @ 12:00 am

Part 1 best practices -
1- Put your Database into ARCHIVELOG Mode
2- Comment on Tables and Columns similar to your PL/SQL or SQL Source Codes
3- Use DBMS_STATS for statistic collection
4- Using Recovery Manager(RMAN) for Backup and Recovery needs

Part 2 best practices -
5- Resumable Statements and Space Allocation
6- Maintenance of Global Partitioned Indexes
7- Default Tablespaces
8- PL/SQL Bulk Operations
9- Locking Issues
10- Oracle System Event Triggers
11- Autonomous Transactions

Part 3 best practice -
12- Learn what is already provided with PL/SQL language

Part 4 best practice -
13- Defining Application Services for Oracle Database 10g

14- Cost Based Optimizer Best Practices and Advised Strategy

A. When and how to gather statistics for the cost based optimizer?

1. When to gather statistics; some candidate situations are as follows:

– After creating a new database,
– After creating or altering segments (tables, indexes, partitions, clusters, etc.),
– After hardware upgrades such as new CPUs or a changed I/O subsystem (gather system statistics),
– After migrating from the Rule Based Optimizer (RBO) to the Cost Based Optimizer (CBO),
– After large amounts of data change (bulk operations, loads, purges, etc.),
– After new high/low values are generated for key columns,
– As of 10g, optimizer statistics are gathered automatically by the GATHER_STATS_JOB job, which targets all objects in the database with missing or stale statistics,
– As of 10g, the data dictionary and fixed tables also need statistics.
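As of 10g the data dictionary and the dynamic performance (X$) fixed tables can be given statistics with dedicated dbms_stats procedures; a minimal sketch (note that gathering fixed object stats usually requires SYSDBA or the ANALYZE ANY DICTIONARY privilege, and is best done under a representative workload):

BEGIN
   -- statistics on the data dictionary (SYS, SYSTEM and component schemas)
   dbms_stats.gather_dictionary_stats;
   -- statistics on the X$ fixed tables
   dbms_stats.gather_fixed_objects_stats;
END;
/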

2. How much to gather; using sample sizes is recommended:

– estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE is optimal for uniform sizes and distributions of data, but problems can occur with skewed data or when a column contains many NULLs,
– Smaller sample sizes can be preferred for large objects to save time and resources,
– Row sampling reads the sampled rows with a full scan of the data, whereas block sampling samples whole blocks and so reduces I/O (at the risk of bias when similar rows are clustered together).
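Block sampling is requested through the block_sample parameter of the gathering procedures; a sketch against the EMPLOYEES table used later in this post (the 10% sample size is only illustrative):

BEGIN
   dbms_stats.gather_table_stats(ownname          => USER,
                                 tabname          => 'EMPLOYEES',
                                 estimate_percent => 10,
                                 block_sample     => TRUE,  -- sample whole blocks to reduce I/O
                                 cascade          => TRUE);
END;
/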

3. How to keep the statistics up to date; the methodology summarized in “A Practical Approach to Optimizer Statistics in 10g” by Andrew Holdsworth, Director of Real World Performance, is as follows:

– The best practices mentioned in A.1 and A.2 will work well for 90% of your SQL. In the first post of this series, under item 3 “Use DBMS_STATS for statistic collection”, I discussed the advantages of DBMS_STATS over ANALYZE for statistics collection. So prefer the dbms_stats package over the analyze command (we still need analyze for backward compatibility and for the validate structure (including index_stats information) and list chained rows options). Here is an example of the advised general stats gathering method;


BEGIN
   dbms_stats.gather_schema_stats(ownname          => USER,
                                  options          => 'GATHER AUTO',
                                  degree           => 4,
                                  estimate_percent => dbms_stats.auto_sample_size,
                                  cascade          => TRUE,
                                  method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
                                  granularity      => 'AUTO');
   COMMIT;
END;
/

BEGIN
   dbms_stats.gather_table_stats(ownname          => USER,
                                 tabname          => 'EMPLOYEES',
                                 degree           => 4,
                                 estimate_percent => dbms_stats.auto_sample_size,
                                 cascade          => TRUE,
                                 method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
                                 granularity      => 'AUTO');
   COMMIT;
END;
/

– But for the next 9% of your SQL you may need to refine the stats collection method based on the data distribution: manual histogram management, setting statistics manually, etc.
– And for the remaining 1% of your SQL the sample size is irrelevant; you have to get the assistance of options like hints, stored outlines, and SQL profiles on 10g. These options should be preferred only as a last resort.
– Do not fall into the parameter change trap:
step 1) A bad plan is discovered on Query A,
step 2) An optimizer parameter is changed to fix Query A,
step 3) This change causes a bad plan on Query B,
step 4) Then another optimizer parameter change to fix Query B brings you back to step 1 :)

Use the 10046 and 10053 events for detailed analysis of CBO decisions; changing the value of an optimizer parameter to fix one query is likely to cause new problems for others.
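Both events are set at session level; a sketch of the usual sequence (run the problem statement between switching the events on and off, then look for the trace files in user_dump_dest):

-- 10053: trace CBO decisions for the next hard parse in this session
alter session set events '10053 trace name context forever, level 1';
-- 10046 level 12: extended SQL trace including bind values and wait events
alter session set events '10046 trace name context forever, level 12';

-- run the problem query here, then switch tracing off
alter session set events '10053 trace name context off';
alter session set events '10046 trace name context off';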

B. Ensuring representative stats to guarantee good performance

The basic analysis for determining non-representative stats is:
Step 1) Check whether the row estimates from the explain plan and v$sql_plan_statistics_all are correct,
a. V$SQL_PLAN_STATISTICS_ALL columns to compare;
CARDINALITY = optimizer row estimate
LAST_OUTPUT_ROWS = actual row count
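A sketch of this comparison, for a statement identified by its sql_id (the :sql_id bind is a placeholder; LAST_OUTPUT_ROWS is only populated when statistics_level = ALL or the gather_plan_statistics hint was used):

select id, operation, options,
       cardinality      as e_rows,   -- optimizer estimate
       last_output_rows as a_rows    -- actual rows from the last execution
  from v$sql_plan_statistics_all
 where sql_id = :sql_id
 order by id;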
b. The E-Rows and A-Rows columns in the execution plan, taken with the steps below;

-- To call this function you need access to several of the dynamic performance views: v$session, v$sql, v$sql_plan and v$sql_plan_statistics_all
-- set serveroutput off if it is enabled, otherwise display_cursor reports the plan of the dbms_output call instead of your statement
-- The hint /*+ gather_plan_statistics */ enables collection of rowsource execution statistics for the statement
-- Reference : http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/
--
set serveroutput off
select /*+ gather_plan_statistics */ ...
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
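A complete run might look like this (the EMPLOYEES table and department_id predicate are only illustrative):

set serveroutput off
select /*+ gather_plan_statistics */ count(*)
  from employees
 where department_id = 50;
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
-- compare the E-Rows and A-Rows columns in the plan output;
-- a large difference indicates non-representative statistics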

Step 2) Check whether the column statistics are correct in all_[tab|part]_col_statistics,
Step 3) Check whether the sample size is large enough to give accurate row estimates,
Step 4) Gather more accurate stats, or use the dbms_stats.set_…._stats procedures to set the calculated accurate stats, and start again from step 1.
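Setting statistics manually can be sketched with dbms_stats.set_table_stats (the table name and the figures below are only illustrative; substitute the accurate values you calculated):

BEGIN
   dbms_stats.set_table_stats(ownname => USER,
                              tabname => 'EMPLOYEES',
                              numrows => 100000,   -- accurate row count you determined
                              numblks => 1000);    -- accurate block count you determined
END;
/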

C. How to capture SQL workload against schema

1. DML monitoring information is collected and can be monitored through [DBA|ALL|USER]_TAB_MODIFICATIONS;

– This information is needed by dbms_stats to identify the objects with “stale” statistics,
– This option is enabled by default as of 10g, but for 9i it must be set manually:

alter table table-name monitoring;

- Tracked information can be queried from [DBA|ALL|USER]_TAB_MODIFICATIONS;

select table_name, partition_name, inserts, updates, deletes from all_tab_modifications;

TABLE_NAME  PARTITION_NAME  INSERTS  UPDATES  DELETES
----------  --------------  -------  -------  -------
USERS       USERS14               0     3328        0
ACCOUNT     ACCOUNT23         23450      738      230
EMP         EMP19              2409      390        0

– 9i and 10g use a 10% change as the threshold for considering stats stale; as of 11g this value is no longer hard-coded and can be set with dbms_stats.set_table_prefs.
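On 11g the staleness threshold becomes a per-table preference; a sketch (the table name and the 5% value are only illustrative):

BEGIN
   -- 11g and later only: consider stats stale after 5% change instead of the 10% default
   dbms_stats.set_table_prefs(ownname => USER,
                              tabname => 'EMPLOYEES',
                              pname   => 'STALE_PERCENT',
                              pvalue  => '5');
END;
/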

2. Column usage history (predicate information) is also collected and can be monitored through sys.col_usage$;

– This information is again needed by dbms_stats to identify candidate columns on which to build histograms when method_opt => ’for … size auto’,
– The tracked information can be queried from sys.col_usage$ and is persistent across shutdowns.
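The usage counters can be decoded by joining col_usage$ to the obj$ and col$ dictionary base tables; a sketch (requires access to the SYS schema):

select o.name table_name, c.name column_name,
       u.equality_preds, u.equijoin_preds, u.range_preds, u.like_preds
  from sys.col_usage$ u, sys.obj$ o, sys.col$ c
 where u.obj#    = o.obj#
   and u.obj#    = c.obj#
   and u.intcol# = c.col#
 order by o.name, c.name;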

Testing Information : the scripts mentioned were tested on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0

References Used :
Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)
Part IV Optimizing SQL Statements
http://structureddata.org/presentations/


4 Comments »

  1. How to Display HIGH_VALUE/LOW_VALUE Columns from USER_TAB_COL_STATISTICS

    http://structureddata.org/2007/10/16/how-to-display-high_valuelow_value-columns-from-user_tab_col_statistics/

    Comment by H.Tonguç Yılmaz — October 24, 2007 @ 8:26 am | Reply

  2. […] Continue reading with Part 5 […]

    Pingback by Oracle Best Practices Part 4 « H.Tonguç YILMAZ Oracle Blog — November 22, 2007 @ 7:06 am | Reply

  3. If you have your own statistics gathering strategy it may be better to switch off this default job of Oracle as of 10g;

    SELECT owner, job_name, enabled
    FROM dba_scheduler_jobs
    WHERE job_name = 'GATHER_STATS_JOB';

    BEGIN
       DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
    END;
    /

    SELECT owner, job_name, enabled
    FROM dba_scheduler_jobs
    WHERE job_name = 'GATHER_STATS_JOB';

    Also be careful, DBMS_STATS METHOD_OPT default behaviour changed in 10g;

    http://richardfoote.wordpress.com/2008/01/04/dbms_stats-method_opt-default-behaviour-changed-in-10g-be-careful/

    Comment by H.Tonguç Yılmaz — January 4, 2008 @ 3:00 pm | Reply

  4. select plan_table_output from v$sql s,
     table(dbms_xplan.display_cursor(s.sql_id, s.child_number, 'ALL')) t
     where upper(s.sql_text) like 'SELECT%FROM%EMP%';

     select plan_table_output
     from table(dbms_xplan.display_cursor(NULL, NULL, 'typical +peeked_binds'));

    Reference : Upgrade from Oracle 9i to 10g: What to expect from the optimizer
    Oracle white paper, Feb 2008

    http://www.oracle.com/technology/products/bi/db/10g/pdf/twp_bidw_optimizer_10gr2_0208.pdf

    Comment by H.Tonguç Yılmaz — February 27, 2008 @ 9:05 pm | Reply

