The CREATE TABLE statement for the dimension and fact tables
This section includes the CREATE TABLE statements that you use to create the tables of the sales_demo dimensional database.
Referential integrity is, of course, an important
requirement for dimensional databases. However, the following schema
for the sales_demo database does not define the primary and
foreign key relationships that exist between the fact table and its
dimension tables. The schema does not define these primary and foreign
key relationships because data-loading performance improves dramatically
when the database server does not enforce constraint checking. Given
that data warehousing environments often require that tens or hundreds
of gigabytes of data are loaded within a specified time, data-load
performance should be a factor when you decide how to implement a
database in a warehousing environment. Assume that if the sales_demo database
is implemented as a live data mart, some data extraction tool (rather
than the database server) is used to enforce referential integrity
between the fact table and dimension tables.
Tip: After
you create and load a table, you can add primary key and foreign key
constraints to the table with the ALTER TABLE statement to enforce
referential integrity. This method is required only for express load
mode. If the constraints and indexes are necessary and costly to drop
before a load, then deluxe load mode is the best option.
The
following statements create the time, geography, product,
and customer tables. These tables are the dimensions for the sales fact
table. A SERIAL field serves as the primary key for the district_code column
of the geography table.
CREATE TABLE time
(
time_code INT,
order_date DATE,
month_code SMALLINT,
month_name CHAR(10),
quarter_code SMALLINT,
quarter_name CHAR(10),
year INTEGER
);
CREATE TABLE geography
(
district_code SERIAL,
district_name CHAR(15),
state_code CHAR(2),
state_name CHAR(18),
region SMALLINT
);
CREATE TABLE product (
product_code INTEGER,
product_name CHAR(31),
vendor_code CHAR(3),
vendor_name CHAR(15),
product_line_code SMALLINT,
product_line_name CHAR(15)
);
CREATE TABLE customer (
customer_code INTEGER,
customer_name CHAR(31),
company_name CHAR(20)
);
The sales fact table has pointers to each
dimension table. For example, customer_code references the
customer table, district_code references the geography table,
and so forth. The sales table also contains the measures for
the units sold, revenue, cost, and net profit.
CREATE TABLE sales
(
customer_code INTEGER,
district_code SMALLINT,
time_code INTEGER,
product_code INTEGER,
units_sold SMALLINT,
revenue MONEY(8,2),
cost MONEY(8,2),
net_profit MONEY(8,2)
);
Tip: The most useful measures (facts) are
numeric and additive. Because of the great size of databases in data-warehousing
environments, virtually every query against the fact table might require
thousands or millions of records to construct a result set. The only
useful way to compress these records is to aggregate them. In the sales table,
each column for the measures is defined on a numeric data type, so
you can easily build result sets from the units_sold, revenue, cost,
and net_profit columns.
For your convenience, the file called createdw.sql contains all the preceding CREATE TABLE statements.