Indexes for data warehouse applications
Many data warehouse databases use a star schema, which consists of a fact table and a number of dimensional tables. Queries that use tables in a star schema or snowflake schema can benefit from the proper index on the fact table.
The fact table is generally large and contains the quantitative or factual information about the subject. A dimensional table describes an attribute in the fact table.
When a dimension needs lower-level information, the dimension is modeled by a hierarchy of tables, called a snowflake schema.
Consider the example of a star schema with one fact table named orders and four dimensional tables named customers, suppliers, products, and clerks. The orders table describes the details of each sale order, which includes the customer ID, supplier ID, product ID, and sales clerk ID. Each dimensional table describes an ID in detail. The orders table is large, and the four dimensional tables are small.
SELECT sum(orders.price)
FROM orders, customers, suppliers,product,clerks
WHERE orders.custid = customers.custid
AND customers.zipcode = 94025
AND orders.suppid = suppliers.suppid
AND suppliers.name = 'Johnson'
AND orders.prodid = product.prodid
AND product.type = 'hard drive'
AND orders.clerkid = clerks.clerkid
AND clerks.dept = 'Direct Sales'
This query uses a typical star join, in which the fact table joins with all dimensional tables on a foreign key. Each dimensional table has a selective table filter.
CREATE INDEX ON orders(custid,suppid,prodid,clerkid)
Without this index, the optimizer might choose to first join the fact table with a single dimensional table and then join the result with the remaining dimensional tables. The optimal plan provides better performance.
For more information about star schemas and snowflake schemas, see the HCL OneDB™ Database Design and Implementation Guide.