Database Performance and SQL Tuning Checklist

Database Performance Checklist
• Set the minimal number of initialization parameters. Ideally, most initialization parameters should be left at default. If there is more tuning to perform, this shows up when the system is under load. Set storage options for tables and indexes in appropriate tablespaces.
• Verify that all SQL statements are optimal and understand their resource usage.
• Validate that middleware and programs that connect to the database are efficient in their connection management and do not log on and log off repeatedly.
• Validate that the SQL statements use cursors efficiently. Each SQL statement should be parsed once and then executed multiple times. The most common reason this does not happen is because bind variables are not used properly and WHERE clause predicates are sent as string literals.
• Validate that all schema objects have been correctly migrated from the development environment to the production database. This includes tables, indexes, sequences, triggers, packages, procedures, functions, Java objects, synonyms, grants, and views. Ensure that any modifications made in testing are made to the production system.
• As soon as the system is rolled out, establish a baseline set of statistics from the database and operating system. This first set of statistics validates or corrects any assumptions made in the design and rollout process.

SQL Tuning Checklist
1. Identify Statements to tune using:
• EM Top SQL
• Statspack

2. View execution statistics using:
• SQL*trace
• Make note of
• CPU time
• Elapsed time
• Disk reads
• Disk sorts

3. Tune SQL automatically using SQL Tuning Advisor.
• Optimizer stats analysis
• SQL Profiling
• Index analysis
• SQL restructure

4. Tune SQL manually:
a. Gather information about the underlying objects used in the SQL statements
• Obtaind table, index and column definitions
• Obtain view definitions
• Understand column data distribution
o Uniqueness
o Nulls
o Skew
• Identify if data from more than one table is required by a statement resulting in joins
• Verify the join predicates to avoid cartesian joins
• Verify presence of indexes
• Verify presence of Materialized views
• Verify type of indexes
b. Verify execution plans by using:
• Explain plan
• SQL*Plus Auto trace
c. Verify statistics in:
• User_tab_columns
• User_indexes
• User_tables
• Look for :
• Last analyzed
• Existence of histograms where appropriate

5. Verify tha statistics are current:
• Use Automatic statistics gatheringset to appropriate intervals
• Backup existing statistics before gathering new ones
• Use DBMS_STATS package to gather statistics where statistics are stale or absent
• Use Dynamic sampling on volatile objects if needed

6. Change access paths.
• Use SQL Access Advisor
• Use SQL Tuning Advisor
• Create B*tree indexes on highly selective data.
• Create bitmap indexes on low cardinality columns.
• Bitmap indexes help in queries using OR or aggregates.
• Create bitmap join indexes to facilitate joins.
• Create concatenated indexes to facilitate full index scans.
• Create histograms on skewed data.
• Create materialized views on queries involving joins and aggregates.
• Keep in mind that :
• Full table scans on small tables or queries retrieving a large percentage of rows are OK
• A full index scan may be faster than a full table scan
• An index skip scan may be faster to a full index scan
• An index access by rowid may be faster to an index range scan
• Look for distinct or GROUP By as this may indicate a mising predicate

7. Restructure queries keeping the following in mind:
• Use SQL Tuning advisor.
• Inequality conditions cannot use indexes.
• Distinct causes sorts.
• Group by causes sorts.
• Aggregates can use indexes.
• Applying functions on indexed columns prevents the index from being used.
• Low selectivity queries do not use indexes.
• Use UNION ALL instead of UNION (wherever possible).
• Nesting queries too deeply causes poor performance.
• Use EXISTS instead of IN for subqueries to check for TRUE or FALSE values(wherever possible).
• Use NOT EXISTS instead of NOT IN whenever possible.
• Implicit or explicit conversions may cause an index not to be used.
• OR and IN lists conditions are not performance efficient.
• If possible = or AND conditions are preferable.

8. Use hints to influence the optimizer in choosing:
• Query transformation
• Join orders
• Join methods
• Access paths

9. Verify the new code improves performance
• From a user perspective such response time, timre taken to run a report etc.
• Check that the execution statistics (step 2) reflect the performance gain from the changes you have made in CPU time, elapsed time etc. from a resource uage perspective.

Refences Used :
Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)


Oracle Concepts and Architecture – Part 2

In part 1 I started to discuss below subjects;

1- Flashback Versions Query, Drop Table and Recyclebin Management
2- How shared pool works in Oracle and binding example
3- The Storage Hierarchy Summary in an Oracle Database
4- Concurrency and Isolation Levels Concepts
5- Automated Checkpoint Tuning and Mean time to recover(MTTR) Concepts
6- Concurrency and Consistency Concepts
7- Latch and Lock Concepts

I will continue with an Oracle Database Architecture overview;

8- Oracle Database Architecture: Overview

a) Oracle database and an Oracle instance

The database consists of physical structures such as;
Control files: These files contain data about the database itself. These files are critical to the database. Without them you cannot open the database. Check view v$controlfile for your controlfile settings.
Data files: These files contain the data of the database. Check view v$datafile for your data file settings.
Online redo log files: These files allow for instance recovery of the database. If the database were to crash, the database can be recovered with the information in these files. Check view v$logfile and v$log for your redo log file settings.

There are other files that are not officially part of the database but are important to the successful running of the database:
Parameter file: The parameter file is used to define how the instance is configured on startup. Check view v$parameter for your parameter settings.
Password file: This file enables super users to connect remotely to the database and perform administrative tasks.
Archive log files: These files ensure database recovery and are copies of the online redo log files. Using these files and a backup of the database, you can recover a lost data file. Check view v$archived_log for your redo log file settings.

The instance consists of memory structures such as System Global Area(SGA) and Program Global Area(PGA) and background processes that perform tasks within the database as well as the server processes that are initiated for each user session.

The size and structure of an Oracle database and instance impact performance. The physical structure of the database impacts the I/O to hard disks. It is therefore important to both size and place the physical files in such a way that I/O across disks is distributed evenly and waits are minimized. The size of the various memory areas of the instance directly impacts the speed of SQL processing.

The default Oracle database, created by the Oracle Universal Installer(OUI), is preconfigured with initial settings for the memory parameters. The performance of the database depends on the sizing of these memory parameters, so you should fine tune it to meet the requirements of your growing database.

Two memory parameters, PGA_AGGREGATE_TARGET and SGA_TARGET, are provided that allow the database to automatically resize the memory structures within the SGA and PGA. These parameters can be set based on the recommendations of Automatic Database Diagnostics Monitor(ADDM), which is available with the Enterprise Edition of Oracle Database 10g, or you can manually run several advisors and use the combined recommendations of these advisors to set the sizes appropriately.

The basic memory structures associated with an Oracle instance include:
System Global Area(SGA): Shared by all server and background processes
Program Global Area(PGA): Exclusive to each server and background process. There is one PGA for each server process.

The System Global Area(SGA) consists of the following data structures:
Database buffer cache: Caches blocks of data retrieved from the data files
Redo log buffer: Caches redo information (used for instance recovery) until it can be written to the physical redo log files stored on disk
Shared pool: Caches various constructs that can be shared among users
Large pool: Optional area in the SGA that provides large memory allocations for Oracle backup and restore operations, I/O server processes, and session memory for the shared server
Java pool: Used for all session-specific Java code and data within the Java Virtual Machine (JVM)
Streams pool: Used by Oracle Streams

The Program Global Area(PGA) is a memory region which contains data and control information for each server process. A server process is a process that services a client’s requests. Each server process has its own private PGA that is created when the server process is started. Only a server process can access its own PGA. Generally, the PGA contains the following:
Private SQL area: Contains data such as bind information and run-time memory structures. Each session that issues a SQL statement has a private SQL area.
Session memory: Memory allocated to hold session variables and other information related to the session

b) Connecting to an Instance

When a user starts a tool such as SQL*Plus or connects to the database using an application, the application or tool is executed in a user process. When a user actually logs on to the Oracle database, a process is created on the computer running the Oracle database. The listener on the Oracle database actually establishes the connection and directs the request to an available server process. The server process communicates with the Oracle instance on behalf of the user process that runs on the client. The server process executes SQL statements on behalf of the user.

A connection is a communication pathway between a user process and an Oracle database. A database user can connect to an Oracle database in one of three ways:
• The user logs on to the machine running the Oracle instance and starts an application or tool that accesses the database on that system. The communication pathway is established using the interprocess communication mechanisms available on the host operating system.
• The user starts the application or tool on a local computer and connects over a network to the computer running the Oracle instance. In this configuration, called client/server, network software is used to communicate between the user and the Oracle database.
• In a three-tiered connection, the user’s computer communicates over the network to an application or a network server, which is connected through a network to the machine running the Oracle instance. For example, the user runs a browser on a network computer to use an application residing on an NT server that retrieves data from an Oracle database running on a UNIX host.

A session is a specific connection of a user to an Oracle database. The session starts when the user is validated by the Oracle database, and it ends when the user logs out or when there is an abnormal termination. For a given database user, many concurrent sessions are possible if the user logs on from many tools, applications, or terminals at the same time. Except for some specialized database administration tools, starting a database session requires that the Oracle database be available for use.

Good database connection management offers benefits in minimizing the number of connections, thereby increasing scalability.

c) Fast COMMIT and System Change Number

The Oracle Database uses a Fast COMMIT mechanism that guarantees the committed changes can be recovered in case of instance failure. Whenever a transaction commits, the Oracle Database assigns a commit system change number(SCN) to the transaction. The SCN is monotonically incremented and is unique within the database. It is used by the Oracle Database as an internal time stamp to synchronize data and to provide read consistency when data is retrieved from the data files. Using the SCN enables the Oracle Database to perform consistency checks without depending on the date and time of the operating system.

When a COMMIT is issued, the following steps are performed:
• The server process places a commit record, along with the SCN, in the redo log buffer.
• The background Log Writer process (LGWR) performs a contiguous write of all the redo log buffer entries up to and including the commit record to the redo log files. After this point, the Oracle Database can guarantee that the changes will not be lost even if there is an instance failure.
• The server process provides feedback to the user process about the completion of the transaction.

DBWR eventually writes the actual changes back to disk based on its own internal timing mechanism.

d) Factors to be Managed and Top Oracle Performance Issues on this Architecture

Performance management can be divided into the following four areas. Although the areas are separate, they are also interdependent and require different skill sets;

Schema tuning deals with the physical structure of the data. If an application has inadequate or inappropriate data design, then tuning the physical allocation, providing indexes, or rewriting programs will not overcome the problem.

Application tuning deals with such business requirements as 24/7 availability, OLAP, OLTP, and so on as well as the program modules or applications that implement the functions. Tuning the procedural code for the type of application and tuning the embedded SQL statements are also included in this factor. If an application is well designed, it may still perform badly. A common reason for this is badly written SQL.

Instance tuning deals with the configuration of the Oracle server for memory utilization.

Database tuning deals with managing the physical arrangement of data on the disk.

User expectations: Usually users expect consistent performance on all applications. However, they may accept certain applications (such as OLAP operations) as slower if the project team builds realistic user expectations. An application may include messages to warn operators that they are requesting resource-intensive operations. The best time to do this is before the design and build phases and as part of the transition phase.–

Hardware and network tuning deals with performance issues arising from the CPU and from network traffic on all machines supporting the application. The main hardware components are:
CPU: There can be one or more CPUs, and they can vary in processing power from simple CPUs found in hand-held devices to high-powered server CPUs. Sizing of other hardware components is usually a multiple of the CPUs on the system.
Memory: Databases require considerable amounts of memory to cache data and avoid time-consuming disk access.
I/O subsystem: The I/O subsystem can vary between the hard disk on a client PC and high-performance disk arrays. Disk arrays can perform thousands of I/Os each second and provide availability through redundancy in terms of multiple I/O paths and hot pluggable mirrored disks.
Network: The primary concerns with network specifications are bandwidth (volume) and latency (speed).

Top Oracle performance issues reported on metalink Oracle global support site are;

Bad connection management: The application connects and disconnects for each database interaction. This problem is common with stateless middleware in application servers. Additionally, simultaneous connections from the same client are also a waste of system and network resources.

Poor use of cursors and the shared pool: Not reusing cursors results in repeated parses. If bind variables are not used, then there is hard parsing of all SQL statements. This has an order-of-magnitude impact in performance, and it is totally unscalable. Use cursors with bind variables that open the cursor and execute it many times. Be suspicious of applications generating dynamic SQL.

Bad SQL: Bad SQL is SQL that uses more resources than appropriate for the application requirement. This can be a decision support systems (DSS) query that runs for more than 24 hours or a query from an online application that takes more than a minute. SQL that consumes significant system resources should be investigated for potential improvement. ADDM identifies high-load SQL, and the SQL Tuning Advisor can be used to provide recommendations for improvement.

Use of nonstandard initialization parameters: These might have been implemented based on poor advice or incorrect assumptions. Most systems will give acceptable performance using only the set of basic parameters. In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation.

I/O issues: If you configure your database to use multiple disks by disk space and not I/O bandwidth, then there will be excessive I/O to certain disks and little I/O to others. Frequently and simultaneously accessed objects (a table and its index) should be designed to be stored over different disks.

Long full-table scans: Long full-table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization.

In-disk sorting: In-disk sorts for online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. Disk sorts, by nature, are I/O-intensive and unscalable.

High amounts of recursive SQL: Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. This is unscalable and impacts user response time. Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem.

Schema errors and optimizer problems: In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. Examples of this are missing indexes or incorrect statistics. These errors can lead to suboptimal execution plans and poor interactive user performance. When migrating applications of known performance, you should export the schema statistics to maintain plan stability by using the DBMS_STATS package.

Continue reading with part 3

Refences Used :
Oracle® Database Concepts 10g Release 2 (10.2)