Isolating and Tuning Problem SQL

Spotlight on Oracle’s Top SQL drilldown allows you to identify the SQL that is consuming the most resources on your system. Tuning this SQL can be one of the most effective ways of tuning your database. This section contains guidelines on SQL tuning.

 

Spotlight on Oracle – Indexing (SOO21)

Indexing

Indexes exist primarily to improve the performance of SQL statements. In many cases, establishing good indexes is the best path to optimal performance.

 

Spotlight on Oracle – Use Concatenated Indexes (SOO22)

Use concatenated indexes

Try not to use two indexes when one would do. If searching for SURNAME and FIRSTNAME, do not unnecessarily create separate indexes for each column. Instead, create a concatenated index on both SURNAME and FIRSTNAME. You can use the leading portion of a concatenated index on its own. If you sometimes query on the SURNAME column without supplying the FIRSTNAME, then SURNAME should come first in the index.

 

Spotlight on Oracle – Cost-based Optimizer (SOO26)

Taking advantage of the cost-based optimizer

The component of the Oracle software that determines the execution plan for a SQL statement is known as the optimizer. Oracle supports two approaches to query optimization. They are:

  • The rule-based optimizer determines the execution plan based on a set of rules. The rules rank various access paths. For example, an index-based retrieval has a lower rank than a full table scan. A rule-based optimizer uses indexes wherever possible.
  • The cost-based optimizer determines the execution plan based on an estimate of the computer resources (the cost) required to satisfy various access methods. The cost-based optimizer uses statistics (including the number of rows in a table and the number of distinct values in indexes) to determine the optimum plan.

Early experiences with the cost-based optimizer in Oracle 7.0 and 7.1 were often disappointing and gave the cost-based optimizer a bad reputation. However, the cost-based optimizer has been improving in each release. The rule-based optimizer is virtually unchanged since Oracle 7.0. Advanced SQL access methods (such as star and hash joins) are only available when you use the cost-based optimizer.

The cost-based optimizer is the best choice for almost all new projects. Converting from rule to cost-based optimization is worthwhile for many existing projects. 

Consider the following guidelines for getting the most from the cost-based optimizer:

  • OPTIMIZER_MODE. The default mode of the cost-based optimizer (OPTIMIZER_MODE=CHOOSE) attempts to optimize the throughput (that is, the time taken to retrieve all rows) of SQL statements. It often favors full table scans over-index lookups. When converting to cost-based optimization, many users are disappointed to find that previously well-tuned index lookups change to long running table scans. To avoid this, set OPTIMIZER_MODE=FIRST_ROWS in init.ora or ALTER SESSION SET OPTIMIZER_GOAL=FIRST_ROWS in your code. This instructs the cost-based optimizer to minimize the time taken to retrieve the first row in your result set and encourages the use of indexes.
  • Hints. No matter how sophisticated the cost-based optimizer becomes, there are still occasions when you need to modify its execution plan. SQL hints are usually the best way of doing this. By using hints, you can instruct the optimizer to pursue your preferred access paths (such as a preferred index), use the parallel query option, select a join order, and so on. Hints are entered as comments following the first word in a SQL statement. The plus sign (+) in the comment lets Oracle know that the comment contains a hint. Hints are fully documented in the Oracle Server Tuning Guide and some of the most popular hints are summarized in Optimizer hints. In the following example, the hint instructs the optimizer to use the CUST_I2 index:
    SELECT /*+ INDEX(CUSTOMERS CUST_I2) */ *
        FROM CUSTOMERS
        WHERE NAME=:CUST_NAME
        
  • Analyze your tables. The cost-based optimizer’s execution plans are calculated using table statistics collected by the ANALYZE command. Make sure you analyze your tables regularly, that you analyze all your tables, and that you analyze them at peak volumes (for instance, don’t analyze a table just before it is about to be loaded by a batch job). For small to medium tables, use analyze table table_name compute statistics. For larger tables take a sample (for example, analyze table table_name estimate statistics sample 20 percent).
  • Use histograms. Prior to Oracle 7.3, the cost-based optimizer included the number of distinct values in a column, but not the distribution of data within the column. This meant that it might decline to use an index on a column with only a few values, even if the particular value in question was very rare and would benefit from an index lookup. Histograms, introduced in Oracle 7.3, allow column distribution data to be collected, and allow the cost-based optimizer to make better decisions. You can create histograms with the FOR COLUMNS clause of the analyze command (for instance ANALYZE TABLE TABLE_NAME COMPUTE STATISTICS FOR ALL INDEXED COLUMNS). You cannot take advantage of histograms if you are using bind variables.

 

Spotlight on Oracle – Avoid Accidental Table Scans (SOO27)

Avoid accidental table scans

One of the most fundamental SQL tuning problems is the accidental table scan. Accidental table scans usually occur when the SQL programmer tries to perform a search on an indexed column that can’t be supported by an index. This can occur when:

  • Using != (not equals to). Even if the not equals condition satisfies only a small number of rows, Oracle does not use an index to satisfy such a condition. Often, you can re-code these queries using > or IN conditions, which can be supported by index lookups.
  • Searching for NULLS. Oracle won’t use an index to find null values, since null values are not usually stored in an index (the exception is a concatenated index entry where only some of the values are NULL). If you’re planning to search for values that are logically missing, consider changing the column to NOT NULL with a DEFAULT clause. For example, you could set a default value of UNKNOWN and use the index to find these values. Interestingly, recent versions of Oracle can index to find values that are NOT NULL – if the cost-based optimizer determines that such an approach is cost-effective.
  • Using functions on indexed columns. Any function or operation on an indexed column prevents Oracle from using an index on that column. For instance, Oracle can’t use an index to find SUBSTR(SURNAME,1,4)=’SMIT’. Instead of manipulating the column, try to manipulate the search condition. In the previous example, a better formulation would be SURNAME LIKE ‘SMIT%’.

 

Spotlight on Oracle – Optimizing Necessary Table Scans (SOO28)

Optimize necessary table scans

In many cases, avoiding a full table scan by using the best of all possible indexes is your aim. Often though, a full table scan cannot be avoided. In these situations, consider some of the following techniques to improve table scan performance:

 

Spotlight on Oracle – Using Partitioning (SOO2C)

Using partitioning

If the number of rows you want to retrieve from a table is greater than an index lookup could effectively retrieve, but still only a fraction of the table itself (say between 10 and 40% of total), you could consider partitioning the table.

For instance, suppose that a SALES table contains all sales records for the past 4 years and you frequently need to scan all sales records for the current financial year in order to calculate year-to-date totals. The proportion or rows scanned is far greater than an index lookup would comfortably support, but is still only a fraction of the total table.

If you partition the table by financial year, you can restrict processing to only those records that match the appropriate financial year. This could potentially reduce scan time by 75% or more.

In Oracle 7.3, you can create separate tables for each financial year, and then create a partition view. A partition view is a UNION ALL view with CHECK constraints on each table. These enforce the partitioning. Scans on the view that specify a particular financial year clause only need to scan the appropriate table.

In Oracle 8, true partitioned tables can be created. In an Oracle 8 table, partitioned by financial year, rows for the appropriate financial year would be stored in distinct partitions, and the optimizer would restrict queries against a particular financial year to the appropriate partition.

 

Spotlight on Oracle – Using Array Processing

Use array processing

Array processing refers to Oracle’s ability to insert or select more than one row in a single operation. For SQL, which deals with multiple rows of data, array processing usually results in reductions of 50% or more in execution time (more if you’re working across the network). In some application environments, array processing is implemented automatically and you won’t have to do anything to enable this feature. In other environments, array processing must be totally implemented by the programmer.

Many programmers implement huge arrays. This can be excessive and may even reduce performance by increasing memory requirements for the program. Most of the gains of array processing are gained by increasing the array size from 1 to about 20. Further increases result in diminishing gains. You do not normally see much improvement when increasing the array size over 100.

 

Spotlight on Oracle – SQL Tuning Features

Utilizing SQL tuning features 

Each release of Oracle introduces new and improved SQL performance features. In particular the cost-based optimizer contains improvements in each release—many of which are undocumented. Some of the Oracle features that can help your SQL performance are:

  • Hash joins. This join algorithm improves the performance of joins that previously used the sort-merge algorithm. It is  invoked automatically unless HASH_JOIN_ENABLED=FALSE.
  • Anti-joins. The anti-join algorithm allows efficient execution of queries that use NOT IN sub-queries. These types of queries were typically performance problems in earlier versions of Oracle. You can invoke the anti-join with the MERGE_AJ or HASH_AJ hints (in the sub-query), or by setting ALWAYS_ANTI_JOIN=TRUE.
  • Histograms. Histograms allow the cost-based optimizer to make more informed decisions regarding the distribution of data within a column. They are created using the FOR COLUMNS clause of the ANALYZE command.
  • Partitioning. The partition view and the partitioned table allow subsets of large tables to be processed separately.
  • Parallel DML. DML statements (UPDATE, INSERT, DELETE) can be processed using parallel processing. For DELETE and UPDATE operations, the table involved should be partitioned.
  • Fast full index scan. Fast index scans using multi-block reads and parallel query processing if the index includes all the columns required to satisfy the query.

 

Spotlight on Oracle – Optimizer Hints (SOO2G)

Optimizer hints

Optimizer hints appear as a comment following the first word of the SQL statement (for example, SELECT, INSERT, DELETE, or UPDATE). Hints are differentiated from other comments by the presence of the plus sign (+) following the opening comment delimiter (/*). For instance, the FULL hint in the following example tells the optimizer to perform a full table scan when resolving the query:

SELECT /*+ FULL(E) */ *
FROM EMPLOYEE E
WHERE SALARY > 1000000

The following list shows the hints that can be used: 

HINT DESCRIPTION
ALL_ROWS Use the cost-based optimizer and optimize for the retrieval of all rows.
AND_EQUAL (table_name index_name index_name ….) Retrieve rows from the specified table using each of the specified indexes and merge the results.
APPEND Invokes a direct load insert. Only valid for insert … select from statements.
BITMAP (table_name index_name) Retrieve rows from the specified table using the specified bitmap index.
CACHE (table_name) Encourages rows retrieved by a full table scan to remain in the buffer cache of the SGA.
CHOOSE If statistics have been collected for any table involved in the SQL statement, use cost-based or all-rows optimization, otherwise use rule-based optimization.
CLUSTER (table_name) Uses a cluster scan to retrieve table rows.
DRIVING_SITE (table_name) For a distributed SQL statement, this causes the site at which the specified table resides to be the driving site.
FIRST_ROWS Specifies that the cost-based optimizer should optimize the statement to reduce the cost of retrieving the first row only.
FULL (table_name) Use a full table scan to retrieve rows from the specified table.
HASH (table_name) Use a hash scan to retrieve rows from the specified table. The table must be stored in a hash cluster.
HASH_AJ Perform an anti-join using hash join methodology. This hint must appear after the select statement, not in sub-query.
HASH_SJ Appears within an EXISTS sub-query. Invokes a hash semi-join.
INDEX (table_name [index_name]) Uses the specified index to retrieve rows from the table or, if no index is specified, uses any index.
INDEX_ASC (table_name [index_name]) Specifies an ascending index range scan using the specified index or, if no index is specified, any suitable index.
INDEX_COMBINE (table_name [index_name…]) Instructs the optimizer to combine the specified bitmap indexes. If no bitmap indexes are specified, the optimizer chooses suitable bitmap indexes.
INDEX_DESC (table_name [index_name]) Specifies a descending index range scan using the specified index or, if no index is specified, any suitable index.
INDEX_FFS (table_name [index_name]) Invokes a fast full index scan using the specified index or, if no index is specified, any suitable index. A fast full scan reads the entire index in block order, using multi-block reads and possibly parallel query.
MERGE Instructs the optimizer to perform complex view merging when resolving a query based on a view, or one that includes a sub-query in the WHERE clause.
NO_MERGE Instructs the optimizer not to perform complex view merging when resolving a query based on a view, or one that includes a sub-query in the WHERE clause.
MERGE_AJ Performs an anti-join using sort-merge join method. This hint must appear after the SELECT statement, not in a sub-query.
MERGE_SJ Appears within an EXISTS sub-query. Invokes a sort-merge semi-join.
NO_EXPAND (table_name) Oracle sometimes expands statements with OR conditions into multiple SQL statements combined by a union operation. This hint instructs the optimizer not to do this, even if it calculates that such a transformation would be beneficial.
NO_INDEX (table_name [index_name] ) No index suppresses the use of the named indexes or, if no indexes are specified, all indexes on the named table.
NO_PUSH_PRED Instructs not to push join conditions from the WHERE clause into a view or sub-query.
NOAPPEND Suppresses direct load insert in an INSERT… SELECT FROM… statement.
NOCACHE (table_name) Discourages Oracle from keeping rows retrieved by a full table scan in the buffer cache of the SGA. Overrides the cache setting on the CREATE or ALTER TABLE statement.
NOPARALLEL (table_name) Do not use parallel processing for the SQL statement. Overrides the parallel setting on the CREATE or ALTER TABLE statement.
NOPARALLEL_INDEX (table_name index_name) Suppresses parallelism in fast full index scans or in partitioned index access.
NOREWRITE (Oracle 8i) Prevents the SQL statement from being rewritten to take advantage of materialized views. It overrides the server parameter query_rewrite_enabled.
ORDERED Instructs the optimizer to join the tables in exactly the left to right order specified in the FROM clause.
ORDERED_PREDICATES (Oracle 8i) Causes predicates in the WHERE clause to be evaluated in the order in which they appear in the WHERE clause.
PARALLEL (table_name , degree_of_parallelism) Instructs the optimizer to perform parallel scans on the named table. If no degree of parallelism is specified, the default is used.
PARALLEL_INDEX (table_name [index_name]) Parallelizes a fast full index scan, or an index scan against a partitioned index.
PQ_DISTRIBUTE (table_name outer_distribution inner_distribution) This query determines how a parallel join using table_name is executed. Valid options for outer_distribution and inner_distribution are (not all combinations are valid) hash, broadcast, none, partition.
PUSH_JOIN_PRED/PUSH_PRED Push join conditions from the WHERE clause into a view or sub-query.
PUSH_SUBQ Causes sub-queries to be processed earlier in the execution plan. Normally, sub-queries are processed last, unless the SQL statement is transformed into join.
REWRITE (view_name [view_name…]) (Oracle 8i) Restricts query rewrite to only those materialized views specified in the hint.
ROWID (table_name) Performs a ROWID access.
RULE Uses rule-based optimization.
STAR Considers the STAR join methodology in preference to other methods.
STAR_TRANSFORMATION (Oracle 8.0+) Requests that the star transformation optimization be performed. This transforms a star query into an alternate form that can take advantage of bitmap indexes.
USE_CONCAT Oracle sometimes expands statements with OR conditions into multiple SQL statements combined by union all. This hint instructs the optimizer to do this, even if it calculates that such a transformation would not be beneficial.
USE_HASH (table_name) When joining to this table, use the hash join method.
USE_MERGE (table_name) When joining to this table, use the sort-merge join method.
USE_NL (table_name) When joining to this table, use the nested-loops join method.