Universal mantra on data intensive processing advices that “You should do it in a single SQL statement if at all possible.”
So learning all there is to learn about SQL is our primary goal for developing successful database applications. This series will be my favorite I know from the start :)
1- Using SQL to Generate SQL
SQL can be used to generate scripts in SQL. The data dictionary, is a collection of tables and views that contain database information which are created and maintained by the Oracle server.
SELECT 'CREATE TABLE ' || table_name || '_STRUCTURE_CLONE ' || 'AS SELECT * FROM ' || table_name ||' WHERE 1=2;' AS "Create Table Script" FROM user_tables;
Another example in SQL*PLUS can be Dumping the Contents of a Table to a File;
SET HEADING OFF ECHO OFF FEEDBACK OFF SET PAGESIZE 0 SELECT 'INSERT INTO departments_test VALUES (' || department_id || ', ''' || department_name || ''', ''' || location_id || ''');' AS "Insert Statements Script" FROM departments / SET PAGESIZE 24 SET HEADING ON ECHO ON FEEDBACK ON
This script’s output can be spooled to a file and this file can be executed after the dynamic content prepared. For example a daily backup procedure for parameter tables.
2- Examples on SQL efficiency
a) To improve SQL efficiency, use equijoins whenever possible. Statements that perform equijoins on untransformed column values are the easiest to tune.
Use untransformed column values. For example, use:
WHERE a.order_no = b.order_no
WHERE TO_NUMBER (SUBSTR(a.order_no, INSTR(b.order_no, ‘.’) – 1)) = TO_NUMBER (SUBSTR(a.order_no, INSTR(b.order_no, ‘.’) – 1))
b) SQL is not a procedural language. Using one piece of SQL to do many different things usually results in a less-than-optimal result for each task. If you want SQL to accomplish different things, then write various statements, rather than writing one statement to do different things depending on the parameters you give it.
SELECT info FROM tables WHERE ... AND somecolumn BETWEEN DECODE(:loval, 'ALL', somecolumn, :loval) AND DECODE(:hival, 'ALL', somecolumn, :hival);
It is always better to write separate SQL statements for different tasks, but if you must use one SQL statement, then you can make a very complex statement slightly less complex by using the UNION ALL operator.
Optimization (determining the execution plan) takes place before the database knows what values will be substituted into the query. An execution plan cannot, therefore, depend on what those values are.
SELECT /* change this half of UNION ALL if other half changes */ info FROM tables WHERE ... AND somecolumn BETWEEN :loval AND :hival AND (:hival != 'ALL' AND :loval != 'ALL') UNION ALL SELECT /* Change this half of UNION ALL if other half changes. */ info FROM tables WHERE ... AND (:hival = 'ALL' OR :loval = 'ALL');
c) Write your WHERE clause in order to use a possible index;
WHERE SUBSTR(ACCOUNT_NAME,1,7) = ‘CAPITAL’;
WHERE AMOUNT != 0;
WHERE TRUNC(TRANS_DATE) = TRUNC(SYSDATE);
WHERE ACCOUNT_NAME || ACCOUNT_TYPE = ‘AMEXA’;
WHERE AMOUNT + 3000 < 5000;
WHERE ACCOUNT_NAME = NVL(:ACC_NAME, ACCOUNT_NAME);
WHERE ACCOUNT_NAME LIKE ‘CAPITAL%’;
WHERE AMOUNT > 0 ;
WHERE TRANS_DATE BETWEEN TRUNC(SYSDATE) AND TRUNC(SYSDATE) + .99999;
WHERE ACCOUNT_NAME =’AMEX’ AND ACCOUNT_TYPE = ‘A’ ;
WHERE AMOUNT < 2000;
WHERE ACCOUNT_NAME LIKE NVL(:ACC_NAME, ‘%’);
d) Data Casting Problems;
If EMP_TYPE is a VARCHAR2 column, the following statement will not use an index.
SELECT . . . FROM EMP WHERE EMP_TYPE = 123
This statement will be processed as:
SELECT . . . FROM EMP WHERE TO_NUMBER(EMP_TYPE) = 123;
e) NOT, != and <> disable index use;
WHERE city not in (‘DALLAS’, ‘HOUSTON’)
WHERE city != ‘DALLAS’
3- Using Explicit Default Values
create table dept_tong as select * from departments ; alter table dept_tong modify manager_id default 5 ; -- DEFAULT with INSERT INSERT INTO dept_tong (department_id, department_name, manager_id) VALUES (300, 'Engineering', DEFAULT); -- DEFAULT with UPDATE UPDATE dept_tong SET manager_id = DEFAULT WHERE department_id = 10;
4- Referencing the Same Subquery Multiple Times with the WITH clause
Using the WITH clause, you can use the same query block in a SELECT statement when it occurs more than once within a complex query. The WITH clause retrieves the results of a query block and stores it in the user’s temporary tablespace.
In complex queries that process the same subquery multiple times, you might be tempted to store the subquery results in a temporary table and perform additional queries against the temporary table. The WITH clause lets you factor out the subquery, give it a name, then reference that name multiple times within the original complex query.
This technique lets the optimizer choose how to deal with the subquery results, whether to create a temporary table or inline it as a view.
For example, the following query joins two tables and computes the aggregate SUM(SAL) more than once;
drop table emp purge ; drop table dept purge ; create table emp nologging as select rownum empno, name ename, line sal, rownum deptno from all_source ; create table dept nologging as select rownum deptno, name dname from all_source ; SELECT dname, SUM(sal) AS dept_total FROM emp, dept WHERE emp.deptno = dept.deptno GROUP BY dname HAVING SUM(sal) > ( SELECT SUM(sal) * 1/3 FROM emp, dept WHERE emp.deptno = dept.deptno ) ORDER BY SUM(sal) DESC;
You may improve the query by doing the subquery once, and referencing it at the appropriate points in the main query. The bold text represents the common parts of the subquery, and the places where the subquery is referenced.
set serveroutput on exec runstats_pkg.rs_start; SELECT dname, SUM(sal) AS dept_total FROM emp, dept WHERE emp.deptno = dept.deptno GROUP BY dname HAVING SUM(sal) > ( SELECT SUM(sal) * 1/3 FROM emp, dept WHERE emp.deptno = dept.deptno ) ORDER BY SUM(sal) DESC; exec runstats_pkg.rs_middle; WITH summary AS ( SELECT dname, SUM(sal) AS dept_total FROM emp, dept WHERE emp.deptno = dept.deptno GROUP BY dname ) SELECT dname, dept_total FROM summary WHERE dept_total > ( SELECT SUM(dept_total) * 1/3 FROM summary ) ORDER BY dept_total DESC; exec runstats_pkg.rs_stop(100000); Run1 ran in 53 hsecs Run2 ran in 40 hsecs run 1 ran in 132,5% of the time Name Run1 Run2 Diff STAT...physical read total byt 0 122,880 122,880 STAT...table scan rows gotten 618,444 313,900 -304,544 STAT...session pga memory 0 327,680 327,680 Run1 latches total versus runs -- difference and pct Run1 Run2 Diff Pct 5,922 3,475 -2,447 170.42% PL/SQL procedure successfully completed.
5- Difference Between UNION and UNION ALL
The problem is that in a UNION, Oracle finds all the qualifying rows and then “deduplicates” them;
select * from dual union select * from dual; D - X select * from dual union ALL select * from dual; D - X X
If you can use UNION ALL, by all means use it over UNION to avoid a costly deduplication step, a step that is probably not even necessary most of the time.
6- The WITH CHECK OPTION Keyword
A subquery is used to identify the table and columns of the DML statement. The WITH CHECK OPTION keyword prohibits you from changing rows that are not in the subquery.
Specify WITH CHECK OPTION to indicate that, if the subquery is used in place of a table in an INSERT, UPDATE, or DELETE statement, no changes that produce rows that are not included in the subquery are permitted to that table.
In the example shown, the WITH CHECK OPTION keyword is used. The subquery identifies rows that are in department 50, but the department ID is not in the SELECT list, and a value is not provided for it in the VALUES list. Inserting this row results in a department ID of null, which is not in the subquery.
INSERT INTO (SELECT employee_id, last_name, email, hire_date, job_id, salary FROM empl3 WHERE department_id = 50 WITH CHECK OPTION) VALUES (99998, 'Smith', 'JSMITH', TO_DATE('07-JUN-99', 'DD-MON-RR'), 'ST_CLERK', 5000); ERROR at line 1: ORA-01402: view WITH CHECK OPTION where-clause violation
7- Updating and Inserting a Row Using a Record
Although you can enumerate each field of a PL/SQL record when inserting or updating rows in a table, the resulting code is not especially readable or maintainable. Instead, you can use PL/SQL records directly in these statements. The most convenient technique is to declare the record using a %ROWTYPE attribute, so that it has exactly the same fields as the SQL table.
SELECT * FROM emp WHERE empno = 110; EMPNO ENAME DEPTNO ———- ———————————————- ———- 110 John Chen 100 SELECT * FROM emp WHERE empno = 1500; no rows selected DECLARE emp_rec emp%ROWTYPE; BEGIN emp_rec.empno := 1500; emp_rec.ename := ‘Steven Hill’; emp_rec.deptno := 10; – — A %ROWTYPE value can fill in all the row fields. – INSERT INTO emp VALUES emp_rec; – — The fields of a %ROWTYPE can completely replace the table columns. – UPDATE emp SET ROW = emp_rec WHERE empno = 110; END; / SELECT * FROM emp WHERE empno = 110; no rows selected SELECT * FROM emp WHERE empno = 1500; EMPNO ENAME DEPTNO ———- ———————————————- ———- 1500 Steven Hill 10 1500 Steven Hill 10
8- MERGE Statement
Provides the ability to conditionally update or insert data into a database table, performs an UPDATE if the row exists, and an INSERT if it is a new row:
- Avoids separate updates,
- Increases performance and ease of use,
- Is useful in data warehousing applications
MERGE INTO empl3 c USING employees e ON (c.employee_id = e.employee_id) WHEN MATCHED THEN UPDATE SET c.first_name = e.first_name, c.last_name = e.last_name, ... c.department_id = e.department_id WHEN NOT MATCHED THEN INSERT VALUES(e.employee_id, e.first_name, e.last_name, e.email, e.phone_number, e.hire_date, e.job_id, e.salary, e.commission_pct, e.manager_id, e.department_id);
Or another example which can be used in an Extract Transform Load(ETL) process of a datawarehouse may be;
alter session enable parallel dml; MERGE /*+ parallel(contract_dim,10) append */ INTO contracts_dim d USING (select * from table(transform_pkg.go(CURSOR(select /*+ parallel(contracts_file,10) full(contracts_file) */ * from contracts_file)))) f ON (d.contract_id = f.contract_id) WHEN MATCHED THEN update set descrip = f.descrip, init_val_loc_curr = f.init_val_loc_curr, init_val_adj_amt = f.init_val_adj_amt WHEN NOT MATCHED THEN insert values ( f.contract_id, f.descrip, f.init_val_loc_curr, f.init_val_adj_amt) ;
9- Multitable Inserts
Multitable inserts allow a single INSERT INTO .. SELECT statement to conditionally, or non-conditionally, insert into multiple tables. This statement reduces table scans and PL/SQL code necessary for performing multiple conditional inserts compared to previous versions.
It’s main use is for the ETL process in data warehouses where it can be parallelized and/or convert non-relational data into a relational format:
set pagesize 0 drop table emp_40 purge; drop table emp_50 purge; drop table emp_left_over purge; create table emp_40 nologging parallel as SELECT EMPLOYEE_ID,FIRST_NAME,LAST_NAME,DEPARTMENT_ID FROM employees where 1 = 2; create table emp_50 nologging parallel as SELECT EMPLOYEE_ID,EMAIL,PHONE_NUMBER,HIRE_DATE,DEPARTMENT_ID FROM employees where 1 = 2; create table emp_left_over nologging parallel as SELECT EMPLOYEE_ID,FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID ,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID FROM employees where 1 = 2; SELECT DEPARTMENT_ID,count(*) FROM employees group by DEPARTMENT_ID order by DEPARTMENT_ID; ALTER SESSION ENABLE PARALLEL DML; INSERT /*+ append parallel(2) */ ALL WHEN (DEPARTMENT_ID=40) THEN INTO emp_40 (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,DEPARTMENT_ID) VALUES (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,DEPARTMENT_ID) WHEN (DEPARTMENT_ID=50) THEN INTO emp_50 (EMPLOYEE_ID,EMAIL,PHONE_NUMBER,HIRE_DATE,DEPARTMENT_ID) VALUES (EMPLOYEE_ID,EMAIL,PHONE_NUMBER,HIRE_DATE,DEPARTMENT_ID) ELSE INTO emp_left_over (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID) VALUES (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,JOB_ID,SALARY,COMMISSION_PCT,MANAGER_ID,DEPARTMENT_ID) SELECT /*+ PARALLEL(b 2) */ * FROM employees b ; commit; SELECT DEPARTMENT_ID,count(*) FROM emp_40 group by DEPARTMENT_ID order by DEPARTMENT_ID; 40 1 1 row selected. SELECT DEPARTMENT_ID,count(*) FROM emp_50 group by DEPARTMENT_ID order by DEPARTMENT_ID; 50 45 1 row selected. SELECT DEPARTMENT_ID,count(*) FROM emp_left_over group by DEPARTMENT_ID order by DEPARTMENT_ID; 10 1 20 2 30 6 60 5 70 1 80 34 90 3 100 6 110 2 1 10 rows selected.
The different types of multitable INSERT statements are:
- Unconditional INSERT
- Conditional ALL INSERT
- Conditional FIRST INSERT
- Pivoting INSERT
Also Exception Logging can be used with multi table insert;
INSERT /*+ append parallel(2) */ ALL WHEN (DEPARTMENT_ID=40) THEN INTO emp_40 (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,DEPARTMENT_ID) VALUES (EMPLOYEE_ID,FIRST_NAME,LAST_NAME,DEPARTMENT_ID) LOG ERRORS REJECT LIMIT UNLIMITED WHEN (DEPARTMENT_ID=50) THEN INTO emp_50 (EMPLOYEE_ID,EMAIL,PHONE_NUMBER,HIRE_DATE,DEPARTMENT_ID) VALUES (EMPLOYEE_ID,EMAIL,PHONE_NUMBER,HIRE_DATE,DEPARTMENT_ID) LOG ERRORS REJECT LIMIT UNLIMITED ELSE ...
Pivoting INSERT example;
INSERT ALL INTO sales_info VALUES (employee_id,week_id,sales_MON) INTO sales_info VALUES (employee_id,week_id,sales_TUE) INTO sales_info VALUES (employee_id,week_id,sales_WED) INTO sales_info VALUES (employee_id,week_id,sales_THUR) INTO sales_info VALUES (employee_id,week_id, sales_FRI) SELECT EMPLOYEE_ID, week_id, sales_MON, sales_TUE,sales_WED, sales_THUR,sales_FRI FROM sales_source_data;
Testing Information : the scripts mentioned are tested on Oracle Database 10g Express Edition Release 10.2.0.1.0
References Used :
Oracle® Database SQL Reference 10g Release 2 (10.2)
On Partitioning and Pipelining Oracle Magazine Article By Tom Kyte