Example of creating a fragmented virtual table
This example shows how to create a fragmented virtual table that is based on a table that contains time series data and that is fragmented by expression.
Prerequisites
Before you run the statements in this example, create three dbspaces named dbs1, dbs2, and dbs3. Otherwise, substitute your dbspace names in the example.
About this example
When you create the stores_demo database, all the setup tasks for creating and loading a time series table are complete. The ts_data table in the stores_demo database contains time series data. The ts_data_v table is a virtual table that is based on the ts_data table. The ts_data table and the ts_data_v virtual table are not fragmented.
In this example, you alter the ts_data table to fragment it by expression into three dbspaces. You re-create the ts_data_v virtual table as a fragmented virtual table. Then, you run queries in parallel against the ts_data_v virtual table.
Preparing for parallel queries
Run the SQL statements in these steps from an SQL editor, such as DB-Access .
To prepare for running parallel queries on the ts_data_v virtual table:
- If necessary, create the stores_demo database by running
the following command:
dbaccessdemo
- If necessary, set the PDQPRIORITY environment
variable to a value other than OFF to enable parallel database queries.
For example, run the following SQL statement:
SET PDQPRIORITY 100
- Specify an explain output file and enable explain output so that
you can determine whether your queries run in parallel by running
the following SQL statements:
SET EXPLAIN FILE TO 'c:/test/parallelvtq.out'; SET EXPLAIN ON;
You can specify a different directory and file name for the explain output file.
- Drop the existing virtual table on time series data, ts_data_v,
in the stores_demo database by running the following SQL statement:
DROP TABLE ts_data_v;
- Alter the ts_data table to fragment it by expression by
running the following SQL statement:
ALTER FRAGMENT ON TABLE ts_data init FRAGMENT BY EXPRESSION PARTITION part1 (loc_esi_id <= "4727354321355594") in dbs1, PARTITION part2 (loc_esi_id <= "4727354321510846") in dbs2, REMAINDER IN dbs3;
- Create the fragmented virtual table, ts_data_v, by running
the following SQL statement:
EXECUTE PROCEDURE TSCreateVirtualTab( 'ts_data_v', 'ts_data', 'origin(2010-11-10 00:00:00.00000),calendar(cal15min), container(raw_container),threshold(0),regular', 'fragment', 'raw_reads' );
The difference between this definition of the ts_data_v virtual table and the original definition is the value of the fourth parameter, the TSVTMode parameter. In this definition, the TSVTMode parameter is set to fragment to fragment the virtual table. In the original definition of the ts_data_v virtual table, the TSVTMode parameter is set to 0, which indicates the default behavior.
- Set the isolation level to DIRTY READ by running the following
SQL statement:
SET ISOLATION TO DIRTY READ;
The DIRTY READ isolation level ensures that a parallel query on a virtual table succeeds if the data in the base table is being modified at the same time.
Run parallel queries
The following SQL statement selects the number of values, the sum of the values, and the average of the values for a 15-minute period on February 7, 2011 for each customer who lives in the state of Arizona:
SELECT state,
COUNT(value) num_values,
SUM(value) sum,
AVG(value) average
FROM ts_data_v v, customer_ts_data l, customer c
WHERE
v.loc_esi_id=l.loc_esi_id AND l.customer_num=c.customer_num
AND state = "AZ"
AND v.tstamp BETWEEN '2011-02-07 23:30:00.00000'
AND '2011-02-07 23:45:00.00000'
GROUP BY state, value
ORDER BY state, value;
The result of the query displays the information for the three qualifying customers:
state (count) (sum) (avg)
AZ 1 0.011 0.011
AZ 1 0.012 0.012
AZ 2 0.050 0.025
The explain output file, parallelvtq.out,
describes the query plan for this query. The information (Parallel,
fragments: ALL)
for the second and third scans indicates
that the scans ran in parallel and accessed all fragments:
...
Estimated Cost: 14
Estimated # of Rows Returned: 1
Maximum Threads: 5
Temporary Files Required For: Order By Group By
1) informix.c: SEQUENTIAL SCAN
Filters: informix.c.state = 'AZ'
2) informix.l: INDEX PATH
(1) Index Keys: customer_num (Parallel, fragments: ALL)
Lower Index Filter: informix.l.customer_num = informix.c.customer_num
NESTED LOOP JOIN
3) informix.v: VTI SCAN (Parallel, fragments: ALL)
VTI Filters: (informix.lessthanorequal(informix.v.tstamp,datetime
(2011-02-07 23:45:00.00000) year to fraction(5) ) AND
informix.greaterthanorequal(informix.v.tstamp,datetime
(2011-02-07 23:30:00.00000) year to fraction(5) ))
Filters: informix.v.loc_esi_id = informix.l.loc_esi_id
NESTED LOOP JOIN
The following SQL statement selects the number of values, the sum of the values, and the average of the values after 11:30 PM on February 7, 2011 for each customer:
SELECT state,
COUNT(value) num_values,
SUM(value) sum,
AVG(value) average
FROM ts_data_v v, customer_ts_data l, customer c
WHERE
v.loc_esi_id=l.loc_esi_id
AND l.customer_num=c.customer_num
AND greaterthan(v.tstamp,'2011-02-07 23:30:00.00000')
GROUP BY state, value
ORDER BY state, value;
The result of the query displays the information for all the qualifying customers:
state (count) (sum) (avg)
AZ 1 0.011 0.011
AZ 1 0.025 0.025
CA 1 0.020 0.02
CA 1 0.023 0.023
CA 1 0.056 0.056
CA 1 0.065 0.065
CA 1 0.071 0.071
CA 1 0.073 0.073
CA 1 0.088 0.088
CA 1 0.162 0.162
CA 1 0.204 0.204
CA 1 0.226 0.226
CA 1 0.246 0.246
CA 1 0.277 0.277
CA 1 0.323 0.323
CA 1 0.340 0.34
CA 1 0.415 0.415
CA 1 0.469 0.469
CA 1 0.670 0.67
CA 1 1.412 1.412
CO 1 0.118 0.118
DE 1 0.256 0.256
FL 1 3.470 3.47
MA 1 4.388 4.388
NJ 2 0.374 0.187
NY 1 0.239 0.239
OK 1 0.086 0.086
The GROUP BY and ORDER BY clauses order the results by state.
The explain output for this query shows that the query ran in parallel.
...
Estimated Cost: 63
Estimated # of Rows Returned: 1
Maximum Threads: 5
Temporary Files Required For: Order By Group By
1) informix.l: SEQUENTIAL SCAN
2) informix.v: VTI SCAN (Parallel, fragments: ALL)
VTI Filters: informix.greaterthan(informix.v.tstamp,datetime
(2011-02-07 23:30:00.00000) year to fraction(5) )
Filters: informix.v.loc_esi_id = informix.l.loc_esi_id
NESTED LOOP JOIN
3) informix.c: INDEX PATH
(1) Index Keys: customer_num (Parallel, fragments: ALL)
Lower Index Filter: informix.l.customer_num = informix.c.customer_num
NESTED LOOP JOIN