Grid queries
If you have a table that is the same on multiple servers in a grid, but whose data is not replicated, you can run a grid query to return the consolidated data from the multiple servers.
For example, suppose that you have a chain of retail stores. Each store has a database with the same schema. The database contains tables for inventory, customer data, and sales transactions. You set up a grid because you want to replicate the inventory tables to a central server. You want the tables for sales transactions to be the same on every server, but you do not want to replicate all the sales transactions to the central server. You do, however, want a monthly report that shows the total sales per store. You run a grid query on the central server that aggregates the sales data for the last month for each store and returns results that are grouped by store.
To run a grid query, you include the GRID clause in the SELECT statement. The GRID clause specifies the grid, or subset of the grid, on which to run the query. The GRID clause has requirements and restrictions for the tables and other SQL constructs that you can include in the query.
Before you can run a grid query, you must define the table that you want to query as a grid table. If you use secure connections between your grid servers, you must configure secure connections on the grid server from which you want to run grid queries.
Planning for grid queries
Consider the following options when you plan grid queries.
Before you run a grid query, you can configure the following options for the queries:
- Whether to run the grid query on all the servers in the grid or a subset of grid servers. To define subsets of grid servers, create regions by running the cdr define region command. You can create as many grid regions as you need. Grid regions can overlap or be divided into smaller grid regions. A grid server can be a member of multiple grid regions.
- Whether to make all SELECT statements that are run in the current session run as grid queries by default. Run the SET ENVIRONMENT SELECT_GRID or the SET ENVIRONMENT SELECT_GRID_ALL statements to specify the grid or region name for every query. Leave the GRID clause out of SELECT statements.
- Whether to skip grid servers that are not available when you run the grid query. By default, the grid query runs only if all servers are available. Run the SET ENVIRONMENT GRID_NODE_SKIP ON statement to run the query on the available servers and skip the unavailable servers.
While you run a grid query, besides choosing the tables and the grid or region to include in the query, you can include the following options:
- Whether to return all qualifying rows, including duplicate rows. By default, grid queries return only unique rows. Include the ALL keyword in the GRID clause to return all rows.
- Whether to return information about which server the results are from. Include the ifx_node_id() or ifx_node_name() function to return a column that identifies the grid server from which each row originates. You can use the server ID or name to group the results.
After you run a grid query, you can find out which servers were skipped for a grid query, if the GRID_NODE_SKIP option was set to ON. Run the ifx_gridquery_skipped_node_count() and ifx_gridquery_skipped_nodes() functions to return the grid servers that were unavailable during the grid query.