Load data into the dimensional database
An important step when you implement a dimensional database
is to develop and document a load strategy. This section shows the
LOAD and INSERT statements that you can use to populate the tables
of the sales_demo database.
Tip: In a live data
warehousing environment, you typically do not use the LOAD or INSERT
statements to load large amounts of data to and from HCL®
OneDB® databases.
HCL OneDB database servers provide different features for loading and unloading of data.
For information about loading, see your HCL OneDB Administrator's Guide.
The
following statement loads the time table with data first so
that you can use it to determine the time code for each row that is
loaded into the sales table:
LOAD FROM 'time.unl' INSERT INTO time
The
following statement loads the geography table. After you load
the geography table, you can use the district code data to
load the sales table.
INSERT INTO geography(district_name, state_code, state_name)
SELECT DISTINCT c.city, s.code, s.sname
FROM stores_demo:customer c, stores_demo:state s
WHERE c.state = s.code
The following statements
add the region code to the geography table:
UPDATE geography
SET region = 1
WHERE state_code = 'CA'
UPDATE geography
SET region = 2
WHERE state_code <> 'CA'
The following statement
loads the customer table:
INSERT INTO customer (customer_code, customer_name, company_name)
SELECT c.customer_num, trim(c.fname) ||' '|| c.lname, c.company
FROM stores_demo:customer c
The following statement
loads the product table:
INSERT INTO product (product_code, product_name, vendor_code,
vendor_name,product_line_code, product_line_name)
SELECT a.catalog_num,
trim(m.manu_name)||' '||s.description,
m.manu_code, m.manu_name,
s.stock_num, s.description
FROM stores_demo:catalog a, stores_demo:manufact m,
stores_demo:stock s
WHERE a.stock_num = s.stock_num
AND a.manu_code = s.manu_code
AND s.manu_code = m.manu_code;
The following
statement loads the sales fact table with one row for each
product, per customer, per day, per district. The cost from the cost table
is used to calculate the total cost (cost * quantity).
INSERT INTO sales (customer_code, district_code, time_code,
product_code, units_sold, cost, revenue, net_profit)
SELECT
c.customer_num, g.district_code, t.time_code,
p.product_code, SUM(i.quantity),
SUM(i.quantity * x.cost), SUM(i.total_price),
SUM(i.total_price) - SUM(i.quantity * x.cost)
FROM stores_demo:customer c, geography g, time t,
product p,stores_demo:items i,
stores_demo:orders o, cost x
WHERE c.customer_num = o.customer_num
AND o.order_num = i.order_num
AND p.product_line_code = i.stock_num
AND p.vendor_code = i.manu_code
AND t.order_date = o.order_date
AND p.product_code = x.product_code
AND c.city = g.district_name
GROUP BY 1,2,3,4;