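Index quality has a direct impact on overall database performance. Well-designed indexes can improve performance by orders of magnitude, while inefficient or redundant indexes can make a database crawl even on high-end hardware. Indexes therefore deserve repeated testing and careful consideration at design time. For a database that is already in production, we can also query the relevant data dictionary views to assess index quality and use that analysis to guide improvements. The following gives a demonstration, some basic guidelines for creating indexes, and finally the index quality analysis scripts.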
1. Checking index quality
--Get an index quality report for a specified schema or table
gx_adm@CABO3> @idx_quality
Enter value for input_owner: GX_ADM
Enter value for input_tbname: CLIENT_TRADE_TBL -->If the table name is left empty, the report covers every index in the schema
                                 Table      Table                             Index Data Blks Leaf Blks        Clust Index
Table                             Rows     Blocks Index                     Size MB   per Key   per Key       Factor Quality
------------------------- ------------ ---------- ------------------------- ------- --------- --------- ------------ -------------
CLIENT_TRADE_TBL             6,318,035     278488 I_TDCL_ARC_STL_DATE_STOCK      62       312        13      171,017 5-Excellent
                                                  I_TDCL_ARC_STL_DATE_CASH       62       318        13      174,599 5-Excellent
                                                  I_TDCL_ARC_CANCEL_DATE         83       238         8      288,678 5-Excellent
                                                  I_TDCL_ARC_INPUT_DATE         144       249        13      310,974 5-Excellent
                                                  I_TDCL_ARC_TRADE_DATE         144       269        14      337,097 5-Excellent
                                                  PK_CLIENT_TRADE_TBL           200         1         1      798,216 2-Good
                                                  I_TDCL_ARC_GRP_REF_ID         144         1         1      811,468 2-Good
                                                  UNI_TDCL_ARC_REF_ID           136         1         1      765,603 2-Good
                                                  I_TDCL_ARC_CONTRACT_NUM        72         1         1      834,491 2-Good
                                                  I_TDCL_ARC_SETTLED_DATE        61       299         5      380,699 1-Poor
                                                  I_TDCL_ARC_ACC_NUM            184       624         3    3,899,446 1-Poor
                                                  I_TDCL_ARC_PL_STK             176       218         1    4,348,804 1-Poor
                                                  I_TDCL_ARC_INSTRU_ID          120     2,667         8    4,273,038 1-Poor
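--The single-table report above rates four indexes as Poor. Three of them (I_TDCL_ARC_ACC_NUM, I_TDCL_ARC_PL_STK and I_TDCL_ARC_INSTRU_ID) have a clustering factor above 60% of the table's row count; I_TDCL_ARC_SETTLED_DATE is only at about 6%
--The quality rating should be weighed together with how often each index is actually used when deciding whether the index needs to exist
--The clustering factor can only be improved by reorganizing the data in the table or by adjusting the order of the indexed columns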
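The usage check mentioned above can be sketched with Oracle's index monitoring feature. This is a minimal sketch, assuming the session is connected as the index owner (GX_ADM here) and that the monitoring window covers a representative workload. The choice of I_TDCL_ARC_INSTRU_ID is only an illustration. On 12c and later, DBA_OBJECT_USAGE should offer the same information across schemas.

```sql
-- Start recording whether the optimizer uses this index
ALTER INDEX gx_adm.i_tdcl_arc_instru_id MONITORING USAGE;

-- ... let a representative workload run, then check the flag
-- (V$OBJECT_USAGE lists only indexes owned by the current user)
SELECT index_name, table_name, monitoring, used, start_monitoring
  FROM v$object_usage
 WHERE table_name = 'CLIENT_TRADE_TBL';

-- Stop monitoring once enough evidence has been collected
ALTER INDEX gx_adm.i_tdcl_arc_instru_id NOMONITORING USAGE;
```

Note that USED is only a YES/NO flag for the monitoring window, not a count, so it answers "is this index used at all" rather than "how often".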
--Query column information for the indexes on a single table
gx_adm@CABO3> @idx_info
Enter value for owner: GX_ADM
Enter value for table_name: CLIENT_TRADE_TBL
TABLE_NAME                INDEX_NAME                     CL_NAM               CL_POS STATUS   IDX_TYP         DSCD
------------------------- ------------------------------ -------------------- ------ -------- --------------- ----
CLIENT_TRADE_TBL          I_TDCL_ARC_ACC_NUM             ACC_NUM                   1 VALID    NORMAL          ASC
                          I_TDCL_ARC_CANCEL_DATE         CANCEL_DATE               1 VALID    NORMAL          ASC
                          I_TDCL_ARC_CONTRACT_NUM        CONTRACT_NUM              1 VALID    NORMAL          ASC
                          I_TDCL_ARC_GRP_REF_ID          GRP_REF_ID                1 VALID    NORMAL          ASC
                          I_TDCL_ARC_INPUT_DATE          INPUT_DATE                1 VALID    NORMAL          ASC
                          I_TDCL_ARC_INSTRU_ID           INSTRU_ID                 1 VALID    NORMAL          ASC
                          I_TDCL_ARC_PL_STK              STOCK_CD                  1 VALID    NORMAL          ASC
                          I_TDCL_ARC_PL_STK              PL_CD                     2 VALID    NORMAL          ASC
                          I_TDCL_ARC_SETTLED_DATE        SETTLED_DATE              1 VALID    NORMAL          ASC
                          I_TDCL_ARC_STL_DATE_CASH       STL_DATE_CASH             1 VALID    NORMAL          ASC
                          I_TDCL_ARC_STL_DATE_STOCK      STL_DATE_STOCK            1 VALID    NORMAL          ASC
                          I_TDCL_ARC_TRADE_DATE          TRADE_DATE                1 VALID    NORMAL          ASC
                          PK_CLIENT_TRADE_TBL            BUSINESS_DATE             1 VALID    NORMAL          ASC
                          PK_CLIENT_TRADE_TBL            REF_ID                    2 VALID    NORMAL          ASC
                          UNI_TDCL_ARC_REF_ID            REF_ID                    1 VALID    NORMAL          ASC
--The query result above shows that table CLIENT_TRADE_TBL carries 13 indexes, so some of them are probably redundant.
--As a rule of thumb, 6-7 indexes on a single table is a comfortable number. Too many indexes consume excessive resources and slow down DML.
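To find other tables that exceed this rule of thumb, the index count per table can be read from DBA_INDEXES. This is a minimal sketch, assuming the GX_ADM schema from the demonstration. The threshold of 7 simply mirrors the guideline above and is not a hard limit.

```sql
-- List tables in the schema that carry more than 7 indexes
SELECT   table_owner, table_name, COUNT (*) AS index_count
    FROM dba_indexes
   WHERE table_owner = 'GX_ADM'
GROUP BY table_owner, table_name
  HAVING COUNT (*) > 7
ORDER BY index_count DESC;
```

Tables returned here are candidates for a closer look with idx_info.sql, not automatically a problem.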
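2. Basic guidelines for creating indexes
Keep indexes few and well chosen
Collect the different combinations of predicates used by all queries against the table, and create single-column indexes on the columns with the best selectivity (or on the primary key columns, etc.)
For columns that are read frequently but do not have good selectivity on their own, create composite indexes
For a composite index, weigh the following factors to decide the column order; in descending priority they determine the leading column, the second column, and so on
    How frequently the column is used
    Whether the column is usually queried with "=" as the condition
    The selectivity of the column
    The order in which the combined columns are usually sorted
    Which columns are added as supplementary columns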
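The selectivity check in these guidelines can be sketched against the optimizer's column statistics. This is a minimal sketch, assuming statistics on CLIENT_TRADE_TBL are current. The index name i_tdcl_acc_trade_dt and the choice of columns (ACC_NUM, TRADE_DATE) are hypothetical and only illustrate putting a column that is typically filtered with "=" in the leading position.

```sql
-- Distinct values per column as a percentage of table rows;
-- higher values indicate better selectivity
SELECT   c.column_name,
         c.num_distinct,
         t.num_rows,
         ROUND (c.num_distinct / NULLIF (t.num_rows, 0) * 100, 2) AS distinct_pct
    FROM dba_tab_col_statistics c, dba_tables t
   WHERE t.owner = c.owner
     AND t.table_name = c.table_name
     AND c.owner = 'GX_ADM'
     AND c.table_name = 'CLIENT_TRADE_TBL'
ORDER BY distinct_pct DESC NULLS LAST;

-- Hypothetical composite index: the equality-filtered column leads,
-- the range or sort column follows
CREATE INDEX gx_adm.i_tdcl_acc_trade_dt
   ON gx_adm.client_trade_tbl (acc_num, trade_date);
```

Whether such an index is worthwhile still depends on the actual query mix, so it should be tested before being added to a table that already has many indexes.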
3. Index quality analysis scripts
--script name: idx_quality.sql
--Author : Leshami
--Blog: http://blog.csdn.net/leshami
--index quality retrieval
SET LINESIZE 145
SET PAGESIZE 1000
SET VERIFY OFF
CLEAR COMPUTES
CLEAR BREAKS
BREAK ON table_name ON num_rows ON blocks
COLUMN owner FORMAT a14 HEADING 'Index owner'
COLUMN table_name FORMAT a25 HEADING 'Table'
COLUMN index_name FORMAT a25 HEADING 'Index'
COLUMN num_rows FORMAT 999G999G990 HEADING 'Table|Rows'
COLUMN MB FORMAT 9G990 HEADING 'Index|Size MB'
COLUMN blocks HEADING 'Table|Blocks'
COLUMN num_blocks FORMAT 9G990 HEADING 'Data|Blocks'
COLUMN avg_data_blocks_per_key FORMAT 999G990 HEADING 'Data Blks|per Key'
COLUMN avg_leaf_blocks_per_key FORMAT 999G990 HEADING 'Leaf Blks|per Key'
COLUMN clustering_factor FORMAT 999G999G990 HEADING 'Clust|Factor'
COLUMN Index_Quality FORMAT A13 HEADING 'Index|Quality'
--SPOOL index_quality
SELECT   i.table_name,
         t.num_rows,
         t.blocks,
         i.index_name,
         o.bytes / 1048576 mb,
         i.avg_data_blocks_per_key,
         i.avg_leaf_blocks_per_key,
         i.clustering_factor,
         -- rate each index by its clustering factor as a percentage of table rows
         CASE
            WHEN NVL (i.clustering_factor, 0) = 0 THEN '0-No Stats'
            WHEN NVL (t.num_rows, 0) = 0 THEN '0-No Stats'
            -- <= 6 so that a rounded ratio of exactly 6 does not fall through to '1-Poor',
            -- which is what happened to I_TDCL_ARC_SETTLED_DATE in the sample report
            WHEN (ROUND (i.clustering_factor / t.num_rows * 100)) <= 6 THEN '5-Excellent'
            WHEN (ROUND (i.clustering_factor / t.num_rows * 100)) BETWEEN 7 AND 11 THEN '4-Very Good'
            WHEN (ROUND (i.clustering_factor / t.num_rows * 100)) BETWEEN 12 AND 15 THEN '2-Good'
            WHEN (ROUND (i.clustering_factor / t.num_rows * 100)) BETWEEN 16 AND 25 THEN '2-Fair'
            ELSE '1-Poor'
         END
            index_quality
    FROM dba_indexes i, dba_segments o, dba_tables t
   WHERE
         -- i.index_name LIKE UPPER ('%1%') AND
         i.table_owner = t.owner          -- join on the table's owner, not the index's
     AND i.table_name = t.table_name
     AND i.owner = o.owner
     AND i.index_name = o.segment_name
     AND o.segment_type LIKE 'INDEX%'     -- skip a table segment that shares the index's name
     AND t.owner = UPPER ('&input_owner')
     AND t.table_name LIKE UPPER ('%&input_tbname%')
ORDER BY table_name,
         num_rows,
         blocks,
         index_quality DESC;
--SPOOL OFF;
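The report depends on optimizer statistics (NUM_ROWS, CLUSTERING_FACTOR), and an index shows up as '0-No Stats' when they are missing. A minimal sketch for refreshing them before running the script, assuming the schema and table from the demonstration and default sampling settings:

```sql
-- Gather table statistics; cascade => TRUE also gathers statistics
-- on the table's indexes, including the clustering factor
BEGIN
   DBMS_STATS.GATHER_TABLE_STATS (
      ownname => 'GX_ADM',
      tabname => 'CLIENT_TRADE_TBL',
      cascade => TRUE);
END;
/
```

On a large production table this can be expensive, so it is usually better to run it in a quiet window or rely on the regular statistics job.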
===========================================================================================
--script name: idx_info.sql
--get the index column information by specified table
set linesize 180
col cl_nam format a20
col table_name format a25
col cl_pos format 9
col idx_typ format a15
SELECT   b.table_name,
         a.index_name,
         a.column_name cl_nam,
         a.column_position cl_pos,
         b.status,
         b.index_type idx_typ,
         a.descend dscd
    FROM dba_ind_columns a, dba_indexes b
   WHERE a.index_name = b.index_name
     AND a.index_owner = b.owner          -- index names are unique only within a schema
     AND b.owner = UPPER ('&owner')
     AND a.table_name LIKE UPPER ('%&table_name%')
ORDER BY 2, 4;