Example: Create a data mart from the command line
Use this step-by-step example as a guide to create, deploy, and load a data mart from the command line using workload analysis.
This example uses the stores_demo database that is created by the command dbaccessdemo.
The workload is from
the following query:
SELECT {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
FROM customer c,orders o
WHERE c.customer_num=o.customer_num
and state='CA' and ship_date is not null
GROUP BY 1,2
ORDER BY 3 desc;
The query selects the names of
the top five customers from the state CA and the total ship weight
of their already shipped orders. The query is an inner join. The orders table
is the fact table. The customer table is the dimension table.
Since the orders table has fewer rows than the customer table,
the {+ FACT(orders)} optimizer hint is required. Otherwise, the customer table
would be considered as the fact table. The following commands correspond to steps in the task Creating data mart definitions manually by using workload analysis.
The SQL statements used in this example are run in the DB-Access utility, and are prompted by ">". Commands that are run from the shell are prompted by "$".
Step 1: Connect to the database
Connect
to the database. This example uses the stores_demo database:
$ dbaccess stores_demo -
Database selected.
Step 2: Update statistics
Update the statistics
on the database:
> update statistics low;
Statistics updated.
Step 3: Remove probing results:
Remove previously existing probing
results:
> set environment use_dwa 'probe cleanup';
Environment set.
Step 4: Start probing
Enable
query probing:
> set environment use_dwa 'probe start';
Environment set.
Step 5: Optional - Enable SQL tracing
In
a separate session, connect as user
informix
to the sysadmin database
and activate SQL tracing:$ dbaccess sysadmin -
> execute function task("set sql tracing on","1500","4","low", "global");
(expression) SQL Tracing ON: ntraces=1500, size=4056, level=Low, mode=Global.
1 row(s) retrieved.
Step 6: Skip running the query workload
When
you issue this statement, the queries are optimized and probed but
a result set is not determined or returned.
Important: If
you want to process the probing data based on the run time of the
queries, then turn on SQL tracing and do not use the AVOID_EXECUTE
option of the SET EXPLAIN statement. If you avoid running the queries,
you do not know the actual time the queries run.
> set explain on avoid_execute;
Explain set.
Step 7: Run the query workload
Using this
example, the SQL statements are:
> SELECT {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
FROM customer c,orders o
WHERE c.customer_num=o.customer_num and state='CA'
and ship_date is notnull
GROUP BY by 1,2
ORDER BY by 3 desc;
fname lname (sum)
No rows found.
The
reason no rows are returned in this example is that the SET EXPLAIN
ON AVOID_EXECUTE statement has been used.Step 8: Stop probing
Disable query probing:
> set environment use_dwa 'probe stop';
Environment set.
Step 9: Optional - View the SQL trace information
You can use the onstat command or query the SMI tables to view the SQL trace information.
To use the onstat command:
$ onstat -g his
HCL Informix Dynamic Server Version 11.70.FC3 -- On-Line -- Up 00:53:06 --
182532 Kbytes
Statement history:
Trace Level Low
Trace Mode Global
Number of traces 1500
Current Stmt ID 2
Trace Buffer size 4056
Duration of buffer 42 Seconds
Trace Flags 0x00001611
Control Block 0x4dd51028
Statement # 2: @ 0x4dd51058
Database: 0x100153
Statement text:
select {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
from customer c,orders o
where c.customer_num=o.customer_num and state='CA' and
ship_date is not null group by 1,2 order by 3 desc
Statement information:
Sess_id User_id Stmt Type Finish Time Run Time TX Stamp PDQ
51 29574 SELECT 10:39:09 0.0000 33f4e 0
Statement Statistics:
Page Buffer Read Buffer Page Buffer Write
Read Read % Cache IDX Read Write Write % Cache
0 0 0.00 0 0 0 0.00
Lock Lock LK Wait Log Num Disk Memory
Requests Waits Time (S) Space Sorts Sorts Sorts
0 0 0.0000 0.000 B 0 0 0
Total Total Avg Max Avg I/O Wait Avg Rows
Executions Time (S) Time (S) Time (S) IO Wait Time (S) Per Sec
1 0.0000 0.0000 0.0000 0.000000 0.000000 678122.8357
Estimated Estimated Actual SQL ISAM Isolation SQL
Cost Rows Rows Error Error Level Memory
10 2 0 0 0 NL 25304
To query the SMI tables to view the SQL trace information:
> SELECT sql_id,sql_runtime, sql_statement
FROM sysmaster:syssqltrace
WHERE ql_stmtname='SELECT'
ORDER BY sql_runtime desc;
sql_id 2
sql_runtime 1.47450285e-06
sql_statement select {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
from customer c,orders o where c.customer_num=o.customer_num
and state='CA' and ship_date is not null group by 1,2 order by 3 desc
Step 10: Create the Schema database
The stores_demo database
is not a logging database. A different database is required to store
the data mart schema tables generated by running the ifx_probe2mart() stored procedure:
$ dbaccess - -
> create database stores_marts with log;
Database created.
Step 11: Run the ifx_probe2mart() stored procedure
Run the ifx_probe2mart() stored procedure to create the definition
for data mart 'orders_customer_mart':
> execute procedure ifx_probe2mart('stores_demo','orders_customer_mart');
Routine executed.
Step 12: Optional - Remove probing data
Remove probing data gathered
in the warehouse database:
$ dbaccess stores_demo -
> set environment use_dwa 'probe stop';
Environment set.
Step 13: Optional - store the data mart definition in a file
Connect
to the schema database stores_marts, where the data mart definition
that was generated by the ifx_probe2mart() stored procedure is stored:
$ dbaccess stores_marts -
> execute function lotofile(ifx_genmartdef('orders_customer_mart'),
'orders_customer_mart.xml!','client');
(expression) orders_customer_mart.xml
1 row(s) retrieved.
To view the contents of the
data mart definition file:
$ cat orders_customer_mart.xml
<?xml version="1.0" encoding="UTF-8" ?>
<dwa:martModel xmlns:dwa="http://www.ibm.com/xmlns/prod/dwa" version="1.0">
<mart name="orders_customer_mart">
<table name="customer" schema="informix" isFactTable="false" >
<column name="customer_num"/>
<column name="fname"/>
<column name="lname"/>
<column name="state"/>
</table>
<table name="orders" schema="informix" isFactTable="true" >
<column name="customer_num"/>
<column name="ship_date"/>
<column name="ship_weight"/>
</table>
<reference
referenceType="LEFTOUTER"
isRuntimeJoin="true"
parentCardinality="1"
dependentCardinality="n"
dependentTableSchema="informix"
dependentTableName="orders"
parentTableSchema="informix"
parentTableName="customer">
<parentColumn name="customer_num"/>
<dependentColumn name="customer_num"/>
</reference>
</mart>
</dwa:martModel>
Step 14: Deploy the data mart
Create the data mart orders_customer_mart on
your accelerator stores_accelerator, by using schema database stores_marts:
$ dbaccess stores_demo -
> EXECUTE FUNCTION ifx_createMart('stores_accelerator', 'orders_customer_mart',
'stores_marts');
(expression) The operation was completed successfully.
1 row(s) retrieved.
Step 15: Load the data mart
Load the data mart orders_customer_mart on
your accelerator stores_accelerator. Tables in database stores_demo do
not become locked:
> EXECUTE FUNCTION ifx_loadMart('stores_accelerator', 'orders_customer_mart',
'NONE');
(expression) The operation was completed successfully.
1 row(s) retrieved.