- Normally in a btree, there is a one-to-one relationship between an index entry and a row: an entry points to a row. Indexes should be selective in general.
- With Bitmap Indexes, a single entry uses a bitmap to point to many rows simultaneously. Bitmap indexes should not be selective.
- Values that have fewer than three distinct values can see dramatic performance improvements by utilizing the bitmapped index technique.
- Bitmap indexes provide set-based processing within the database, allowing to use very fast methods for set operations such as AND, OR, MINUS and COUNT.
When should you use a Bitmap index
They are :
- appropriate for highly repetitive data (data with a few distinct values relative to the total number of rows in the table) that is mostly read-only.
- inappropriate for OLTP system cause of the concurrency-related issue (where data is frequently updated by many concurrent sessions)
Consider : a column that takes on
three possible value Y, N, and null in a table of 1 million rows.
This might be a good candidate for a
bitmap index if, for example, you need to frequently count how many rows have a
value of Y.
That is not to say that a bitmap
index on a column with 1,000 distinct values in that same table would not be
valid. It certainly can be !
Bitmap indexes are most appropriate
on low distinct cardinality data, i.e. data with relatively few discrete
values when compared to the cardinality of the entire set.
Example :
Number
of records in the resultset
|
Distinct
Number
|
Percent
|
Low
Cardinality
|
100
|
2
|
0.02
|
YES
|
2
|
2
|
1.00
|
NO
|
10,000,000
|
100,000
|
0.01
|
YES
|
The last example probably would not
be candidates for a having btree indexes, as each of the values would tend to retrieve an
extremely large percentage of the table.
Representation
nico@ORCL>CREATE
BITMAP INDEX job_idx ON emp(job);
INDEX
created.
BITMAP
Value/Row
|
01
|
02
|
03
|
04
|
05
|
06
|
07
|
08
|
09
|
10
|
11
|
12
|
13
|
14
|
ANALYST
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
1
|
0
|
1
|
0
|
0
|
1
|
0
|
CLERK
|
1
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
1
|
1
|
0
|
1
|
MANAGER
|
0
|
0
|
0
|
1
|
0
|
1
|
1
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
PRESIDENT
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
1
|
0
|
0
|
0
|
0
|
0
|
SALESMAN
|
0
|
1
|
1
|
0
|
1
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
We can see that :
- rows 8, 10 and 13 have the value ANALYST
- rows 4, 6 and 7 have the value MANAGER
- ….
- no rows are null (bitmap index store null entries)
If we wanted to count the rows that
have the value MANAGER, the bitmap index would do this very rapidly.
nico@ORCL>SELECT
COUNT(*) FROM emp;
Execution
Plan
---------------------------------------------------------------------------------
|
Id | Operation | Name | ROWS
| Cost (%CPU)| TIME |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | 1 (0)| 00:00:01 |
| 1 |
SORT AGGREGATE
| | 1 | | |
| 2 |
BITMAP CONVERSION COUNT | |
14 | 1 (0)| 00:00:01 |
| 3 |
BITMAP INDEX FAST FULL SCAN| JOB_IDX | | | |
---------------------------------------------------------------------------------
Bitwise OR
Even if we wanted to find all the
rows such that the JOB was CLERK or MANAGER, we would simply combine their
bitmaps from the index.
Value/Row
|
01
|
02
|
03
|
04
|
05
|
06
|
07
|
08
|
09
|
10
|
11
|
12
|
13
|
14
|
ANALYST
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
1
|
0
|
1
|
0
|
0
|
1
|
0
|
CLERK
|
1
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
1
|
1
|
0
|
1
|
CLERK or MANAGER
|
1
|
0
|
0
|
1
|
0
|
1
|
1
|
0
|
0
|
0
|
1
|
1
|
0
|
1
|
nico@ORCL>SELECT
COUNT(*) FROM emp WHERE job = 'CLERK' OR job = 'MANAGER';
Execution
Plan
-----------------------------------------------------------------------------------------
|
Id | Operation | Name | ROWS
| Bytes | Cost (%CPU)| TIME |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | 6 | 1
(0)| 00:00:01 |
| 1 |
SORT AGGREGATE
| | 1 |
6 | | |
| 2 |
BITMAP CONVERSION COUNT | |
7 | 42 | 1
(0)| 00:00:01 |
|* 3 |
BITMAP INDEX FAST FULL SCAN| JOB_IDX | |
| | |
-----------------------------------------------------------------------------------------
BITMAP CONVERSION TO ROWIDS
nico@ORCL>SELECT
* FROM emp WHERE job = 'CLERK' OR job = 'MANAGER';
Execution
Plan
-----------------------------------------------------------------------------------------
|
Id | Operation | Name | ROWS
| Bytes | Cost (%CPU)| TIME |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
7 | 609 | 2
(0)| 00:00:01 |
| 1 |
INLIST ITERATOR
| | |
| | |
| 2 |
TABLE ACCESS BY INDEX ROWID | EMP
| 7 | 609 |
2 (0)| 00:00:01 |
| 3 |
BITMAP CONVERSION TO ROWIDS|
| | | | |
<-------- Here
|* 4 |
BITMAP INDEX SINGLE VALUE | JOB_IDX | |
| | |
-----------------------------------------------------------------------------------------
We need for this query to get to the
table. Here, Oracle will apply a function to turn the fact that i'th bit is on
in a bitmap, into a rowid that can be used to access the table.
Ad hoc query
Bitmap indexes are useful when you
have query :
- that reference many columns in an ad hoc fashion
- produce aggregation such as COUNT
SELECT
COUNT(*)
FROM T
WHERE
gender = 'M'
AND location IN ( 1, 10, 30 )
AND age_group = '41 and over';
SELECT
*
FROM T
WHERE
( ( gender = 'M' AND location = 20 )
OR
( gender = 'F' AND location = 22 ) )
AND
age_group = '18 and under';
SELECT
COUNT(*) FROM t WHERE location IN (11,20,30);
SELECT
COUNT(*) FROM t WHERE age_group = '41' AND gender = 'F';
With btree, you will need large concatenated btree indexes on :
- GENDER, LOCATION, AGE_GROUP for queries that use all three, or GENDER WITH LOCATION, or GENDER alone
- LOCATION, AGE_GROUP for queries that use LOCATION and AGE_GROUP or LOCATION alone
- …
To reduce the amount of data being
searched, other permutations might be reasonable as well to decrease the size
of the index structure being scanned but this is ignoring the fact that a btree index on such low cardinality data is not a good idea.
Bitmap AND OR and NOT
Above the bitmap index comes into
play. With three small indexes, one on each of the individual columns, you will
be able to satisfy all the previous predicates with the use of functions as :
- AND
- OR
- and NOT
nico@ORCL>SELECT
*
2
FROM t
3
WHERE ( ( gender = 'M' AND location = 20 )
4
OR ( gender = 'F' AND location = 22 ))
5
AND age_group = '18 and under';
Execution
Plan
------------------------------------------------------------------------------------------------
|
Id | Operation | Name | ROWS | Bytes | Cost (%CPU)| TIME |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 501 | 16533 | 79
(0)| 00:00:01 |
| 1 |
TABLE ACCESS BY INDEX ROWID |
T | 501 | 16533 | 79
(0)| 00:00:01 |
| 2 |
BITMAP CONVERSION TO ROWIDS
| | |
| | |
| 3 |
BITMAP AND | | |
| | | <-----Here, BITMAP AND
|* 4 |
BITMAP INDEX SINGLE VALUE |
AGE_GROUP_IDX | | | | |
| 5 |
BITMAP OR | | |
| |
| <-----Here, BITMAP OR
| 6 |
BITMAP AND | | |
| | | <-----Here, BITMAP AND
|* 7 |
BITMAP INDEX SINGLE VALUE| LOCATION_IDX
| | | | |
|* 8 |
BITMAP INDEX SINGLE VALUE| GENDER_IDX
| | | | |
| 9 |
BITMAP AND | | |
| | | <-----Here, BITMAP AND
|*
10 | BITMAP INDEX SINGLE VALUE|
GENDER_IDX | |
| | |
|*
11 | BITMAP INDEX SINGLE VALUE|
LOCATION_IDX | |
| | |
------------------------------------------------------------------------------------------------
Not suitable for write-intensive Environment
The reason is that a single bitmap
index key entry points to many rows.
If a session modifies the indexed
data, then all of the rows that index entry points are effectively locked in
most case. Oracle cannot lock an individual bit in a bitmap index entry; it
locks the entire bitmap index entry. Any other modifications that need to update
the same bitmap index entry will be locked out.
Compressed indexes, like bitmap indexes, represent a trade-off between CPU usage and disk space usage. A compressed structure is faster to read from disk but takes additional CPU cycles to decompress for access - an uncompressed structure imposes a lower CPU load but requires more bandwidth to read in a short time.
One belief concerning bitmap indexes is that they are only suitable for indexing low-cardinality data. This is not necessarily true, and bitmap indexes can be used very successfully for indexing columns with many thousands of different values.
The reason for confining bitmap indexes to data warehouses is that the overhead on maintaining them is enormous. A modification to a bitmap index requires a great deal more work on behalf of the system than a modification to a b-tree index. In addition, the concurrency for modifications on bitmap indexes is dreadful.
Bitmap indexes are not appropriate for tables that have lots of single row DML operations (inserts) and especially concurrent single row DML operations. Deadlock situations are the result of concurrent inserts
Advantage of Bitmap Indexes
The advantages of them are that they have a highly compressed structure, making them fast to read and their structure makes it possible for the system to combine multiple indexes together for fast access to the underlying table.Compressed indexes, like bitmap indexes, represent a trade-off between CPU usage and disk space usage. A compressed structure is faster to read from disk but takes additional CPU cycles to decompress for access - an uncompressed structure imposes a lower CPU load but requires more bandwidth to read in a short time.
One belief concerning bitmap indexes is that they are only suitable for indexing low-cardinality data. This is not necessarily true, and bitmap indexes can be used very successfully for indexing columns with many thousands of different values.
Bitmap indexes are best suited for
DSS regardless of cardinality for these reasons:
- With bitmap indexes, the optimizer can efficiently answer queries that include AND, OR, or XOR. (Oracle supports dynamic B-tree-to-bitmap conversion, but it can be inefficient.)
- With bitmaps, the optimizer can answer queries when searching or counting for nulls. Null values are also indexed in bitmap indexes (unlike B-tree indexes).
- Most important, bitmap indexes in DSS systems support ad hoc queries, whereas B-tree indexes do not. More specifically, if you have a table with 50 columns and users frequently query on 10 of themeither the combination of all 10 columns or sometimes a single columncreating a B-tree index will be very difficult. If you create 10 bitmap indexes on all these columns, all the queries can be answered by these indexes, whether they are queries on all 10 columns, on 4 or 6 columns out of the 10, or on a single column. The AND_EQUAL hint provides this functionality for B-tree indexes, but no more than five indexes can be used by a query. This limit is not imposed with bitmap indexes.
Disadvantage of Bitmap Indexes
There are several disadvantages to using a bitmap index on a unique column--one being the need for sufficient space (and Oracle does not recommend it). However, the size of the bitmap index depends on the cardinality of the column on which it is created as well as the data distribution.The reason for confining bitmap indexes to data warehouses is that the overhead on maintaining them is enormous. A modification to a bitmap index requires a great deal more work on behalf of the system than a modification to a b-tree index. In addition, the concurrency for modifications on bitmap indexes is dreadful.
Bitmap indexes are not appropriate for tables that have lots of single row DML operations (inserts) and especially concurrent single row DML operations. Deadlock situations are the result of concurrent inserts
Note:
1. Bitmap indexes include rows that have NULL values, So, Index scan will happen.
Null values are also indexed in bitmap indexes (unlike B-tree indexes). There is no any difference for null and not null values for them
2. b-tree index does not contain info about null keys.
create table X (i integer, j integer);
create index X_j on X(j);
select rowid from X; -- index cannot be used
select rowid from X where j = 1; -- index can be used
select rowid from X where j is not null; -- index can be used
select rowid from X where j is null; -- index cannot be used
No comments:
Post a Comment