Load data from external tables

You can use external tables to load data directly from flat files or pipes to Informix® Warehouse Accelerator.

When you load data directly from external tables to Informix® Warehouse Accelerator, you are spared the interim step of loading data on to Informix® and then transferring to Informix® Warehouse Accelerator. When you have verified that your workload can run entirely on Informix® Warehouse Accelerator, you can create external tables that point to the actual data and load the data directly to Informix® Warehouse Accelerator. After you have created the data mart by using external tables, the queries on this data mart can run only on Informix® Warehouse Accelerator, not Informix®.

Loading data from external tables provides the following benefits:
  • Transfer data from any source across platforms in an ASCII-delimited file to Informix® Warehouse Accelerator.
  • Perform parallel standard INSERT operations.
  • Use named pipes to support loading data from storage devices and direct network connections.
  • Maintain a record of load statistics during the run.
  • Perform high-speed and data-checking data load.
  • Mix external and permanent tables in the data mart.

Query probing with automatic data mart generation is supported. There is minimal Informix® setup and no persistence of data in Informix®.

When loading data from external tables, the following restrictions apply:
  • Locking is not supported and you must specify locking_mode NONE for all load operations, including ifx_loadMart() and ifx_loadPartMart().
  • If the data contains DATE or MONEY data types, the format of the data values of the fields with the DATE type must be specified during external table creation. For example, ... DBDATE 'Y4MD-' ... must used as table option when creating the external table if the external data (flat file or pipe) contains fields of type DATE in the format 2013-06-30.

Joins between tables can be set up among flat files or pipes. For best performance, specify a one-to-many join, which has better performance than any many-to-many join in Informix® Warehouse Accelerator. However, one-to-many joins require that the parent table has a unique index in place on the respective column or columns referred to in the reference definition. The existence of a unique index is verified by the ifx_createMart() stored procedure. If the column values of the parent table are not unique, the accelerator returns incorrect query results. This information can be set by using INDEX DISABLED keywords.

For example, by using flat files:
CREATE EXTERNAL TABLE "informix".nation SAMEAS nation_std 
USING (DATAFILES("disk:/tmp/nation.tbl"),
               NUMROWS "num_rows");
CREATE UNIQUE INDEX nation_pk on nation (n_nationkey) disabled;
Or by using pipes:
CREATE EXTERNAL TABLE "informix".nation SAMEAS nation_std 
USING (DATAFILES("pipe:/tmp/nation.pipe"),
               NUMROWS "num_rows");
CREATE UNIQUE INDEX nation_pk on nation (n_nationkey) disabled;
Important:
  • The pipe must exist before you run the CREATE EXTERNAL TABLE statement.
  • You must feed data into the pipe of the external table while the data is being loaded.
Where num_rows is the approximate number of rows that are contained in the external table. The query probing later uses the number of rows to determine which is the fact table of the probed queries. If you cannot provide num_rows in the external table definition, as an alternative, you can use the ENVIRONMENT Options clause of the SET OPTIMIZATION statement to define a general optimization environment for all queries in the current session. For example:
set environment use_dwa 'probe start';
set explain on avoid_execute;
SET OPTIMIZATION ENVIRONMENT FACT 'ext_d';
select * from ext_f, ext_d where ext_f.f1=ext_d.f1;
set environment use_dwa 'probe stop';

After the external tables are created, you can run query probing, create data marts, and load data.

After initially loading the data mart, you can use the ifx_loadPartMart() function to load more data from external tables into Informix® Warehouse Accelerator. The ifx_loadPartMart() function loads the complete data set, including recently inserted data.
  • If your external table contains the complete table data set, you must replace the original data set with the more recent version. To replace the data set, run ifx_dropPartMart() and then run ifx_loadPartMart(). For example:
    EXECUTE FUNCTION ifx_dropPartMart('demo_dwa','datamart_name','informix',
     'nation', 'nation');
    EXECUTE FUNCTION ifx_loadPartMart('demo_dwa','datamart_name','informix',
     'nation', 'nation');
    
  • If your external table contains only recently inserted data, run ifx_loadPartMart() to load the new data into the data mart alongside the existing data. For example:
    EXECUTE FUNCTION ifx_loadPartMart('demo_dwa','datamart_name','informix',
     'nation', 'nation');