Tuesday, August 29, 2006

Oracle For Data Warehouse: Oracle Bitmap Index




1. Structure of A Bitmap Index

* Any type of index is designed to help find rows that contain certain key values. A regular (B*-tree) index does this by pairing the ROWID for a row with its key value, and then sorts those pairs into a B-tree structure. The ROWID serves as a pointer to a row with that value.
* A bitmap index has a very different structure of which no ROWIDs are stored. Bitmap index entries contain bitmap/bit-vector, instead of list of rowids (B*-tree). Each bitmap header stores start and end ROWIDs. Each position in a bitmap maps to a potential row in the table. The contents of that position in the bitmap for a particular value indicate whether that row have that value in the bitmap columns. 1-bit is stored is the value match the bitmap condition; otherwise, 0-bit is stored.
* A mapping function converts the 1-bit position to an actual row. A compression function compresses the long sequence of '0's in the Bitmap.
* When a bitmap index is created, a series of bitmaps are created. Each bitmap corresponding to a particular value of the columns involved, for example, customer_id. If the table contains 300 distinct values for the customer_id, 300 bitmaps will be built. If later an other customer_id is added, a bitmap for that value will be create.
* If a bitmap index involves more that one column, then there will be a bitmap for every possible combination.

2. Characteristics Of A Bitmap Index

* Bitmap indexing is a query execution optimization technique in Data Warehousing environments
* Bitmap indexes are very space efficient, which allow more entries per leaf block. A bitmap can hold many bits pointing to many rowids and have significantly fewer index block processing and disk I/O.
* Bitmap indexes are fully updateable. When a bitmap index entry is locked, many rows stay locked (in OLTP). In a B*-tree index, a list of rowids are stored for each index entry in a leaf block and, when an index entry is locked, it will not impact many table rows. * With bitmap indexes, the optimizer can efficiently answer queries that include AND, OR, or XOR. (Oracle supports dynamic B-tree-to-bitmap conversion, but it can be inefficient.)
* With bitmaps, the optimizer can answer queries when searching or counting for nulls. Null values are also indexed in bitmap indexes (unlike B-tree indexes).
* Most important, bitmap indexes in DSS systems support ad hoc queries, whereas B-tree indexes do not. More specifically, if you have a table with 50 columns and users frequently query on 10 of them-either the combination of all 10 columns or sometimes a single column-creating a B-tree index will be very difficult. If you create 10 bitmap indexes on all these columns, all the queries can be answered by these indexes, whether they are queries on all 10 columns, on 4 or 6 columns out of the 10, or on a single column. The AND_EQUAL hint provides this functionality for B-tree indexes, but no more than five indexes can be used by a query. This limit is not imposed with bitmap indexes.




3. When to use a Bitmap index

* The column has a low distinct cardinality (not just low cardinality), meaning that there are few possible values for the column. For example, sales regions.
* Columns that are frequently used in complex conditions, such as AND, OR or XOR, in the WHERE clause of queries or aggregate queries that contain SUM, COUNT, or other aggregate functions.
* Tables that has many rows. On a table with 1 million rows, even 10,000 distinct possible values would be acceptable.
* Data warehouse that supports many ad hoc queries and few concurrent DML changes.

4. Advantages of Bitmap Index

* Reduced response time for many ad hoc queries
* Substantial reduction of space usage compared to other indexing techniques. Bitmap index can be created individually and then combined efficiently at run time.
* Dramatic performance gains.

5. Star Transformation

* Use Star Transformation in data warehouse that utilize star schema with the following case:
-- Number of dimension tables is large, making the Cartesian product of the dimension tables too expensive.
-- Fact table is sparse
-- Not all dimensions have constraining predicate

* To enable star transformation, set STAR_TRANSFORMATION_ENABLED initialization parameter to TRUE. This parameter is also dynamic and can be set to TRUE at the session level.

6 Index for Dimension and Fact Table

* For Fact table, create bitmap indexes for all the foreign keys. For example, if you fact table contains foreign keys for customer key, sales_person_key, order_key and product_key, which are foreign keys to the dimension, table coustomer_dim, sales_person_dim, order_dim and product_dim. Create a bitmap index for each of these four foreign keys.
* Create an unique b*-tree index on the primary key for each of the dimension tables.



No comments: