REST API Examples: TimeSeries Queries
This topic provides a tutorial on querying TimeSeries data through REST.
The REST API provides a rich set of options for querying TimeSeries data in HCL OneDB™. You can retrieve the TimeSeries elements from the database exactly as they are stored, you can query for data between start and end timestamps, you can aggregate TimeSeries data both across different elements and across different TimeSeries objects, and you can run custom arithmetic expressions against your TimeSeries elements.
The REST API server supports both BSON TimeSeries row types and custom TimeSeries row types. A BSON TimeSeries row type includes a timestamp and a BSON field. The BSON field provides a flexible schema to hold any fields you need for your TimeSeries data, both now and in the future. BSON TimeSeries tables can be created automatically through the REST API. You can also create a custom TimeSeries row type through SQL that defines the particular fields and data types that make up your TimeSeries elements. Custom row types do not have the same flexibility as BSON row types, but can be customized to your exact data structure. Whichever route you choose for structuring your TimeSeries data, the REST API will detect your TimeSeries schema and automatically write the SQL for your REST queries accordingly.
The examples in this topic focus exclusively on querying TimeSeries data. For examples on creating, inserting, loading, and updating TimeSeries data, or on working with TimeSeries calendars, see REST API Examples: TimeSeries Tables and Calendars.
This topic will demonstrate various queries against the ts_data table in the stores_demo database. If you would like to run these same queries against your REST API server, refer to the Demonstration databases topic for information on how to create the stores_demo database.
You can query a TimeSeries table through either the
/api/servers/{alias}/databases/{dbname}/tables/{tabname}/query
path or the
/api/servers/{alias}/databases/{dbname}/timeseries/tables/{tabname}/query
path. Both APIs allow you to query your TimeSeries tables and both support the same options. The examples below use the timeseries/tables/{tabname}/query path, but the exact same examples could be run against the tables/{tabname}/query path too.
- Example 1: Query for TimeSeries data
- Example 2: Query for TimeSeries data with a TimeSeries filter
- Example 3: Query for TimeSeries data with limit and skip
- Example 4: Query for TimeSeries data with start and end timestamps
- Example 5: Query for TimeSeries data with start and end timestamps (pipeline version)
- Example 6: Query for aggregated TimeSeries data
- Example 7: Query for running aggregates of TimeSeries data
- Example 8: Query for TimeSeries data, applying an arithmetic expression to each element
- Example 9: Query for TimeSeries data to rollup elements across multiple TimeSeries
- Example 10: Query for TimeSeries data by linking multiple pipeline operators to create complex query conditions
- Example 11: Query for the first or last TimeSeries element
- Example 12: Query for the count of TimeSeries elements
Example 1: Query for TimeSeries data
You query a TimeSeries table the same way you query any other relational table. POST a JSON document to the query url for the table where the contents of the request body defines the parameters of your query. The supported query options are shown in the following table:
Key | Description |
---|---|
fields | A list of column names to include in the query results, for example ["loc_esi_id", "measure_unit", "direction", "raw_reads"]. |
filter | A query filter document that defines either a comparison query filter or a logical query filter. For more information, see REST Relational Table Example: Querying a table with a query filter. Note: This filter field affects the rows returned by the query. The timeseriesFilter field below affects the TimeSeries elements within those rows that are returned by the query. |
timeseriesFilter | A filter that is specific to your TimeSeries column. This can include start and/or end timestamps, a query pipeline, a transform function, and limit and skip properties specific to iterating through your TimeSeries elements. See the table below for more information on the timeseriesFilter. Note: This timeseriesFilter field affects the TimeSeries elements returned by the query. By contrast, the filter field above affects which rows are returned by the query. |
limit | The maximum number of rows to return from a SQL query. Keep in mind that this is the limit on the rows in the table returned by the query, which is different from the number of TimeSeries elements included in each result row. The REST API server's default limit is 100 rows, but this can be configured in the REST configuration file. You can also override it on a per query basis by setting this limit property in your query request body. |
skip | The number of rows to skip. Keep in mind that this is the skip value on the rows in the table returned by the query, which is different from the skip value applied to the TimeSeries elements included in each result row. |
orderBy | A list of documents defining the columns to orderBy. Each document in the list must include key and direction properties to indicate which column name and which direction ("asc" for ascending or "desc" for descending) to order the results by. See REST Relational Table Examples: Query a table using projection and order by fields for more information. |
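As a sketch of how a request body built from these options might be assembled programmatically (the helper function name here is hypothetical; only the JSON keys come from the table above):

```python
# Sketch: assembling a query request body as a Python dict and
# serializing it to JSON. The keys mirror the options table above.
# Posting the payload (for example with an HTTP client library) is
# omitted so this stays self-contained.
import json

def build_query_body(fields, filter_doc=None, limit=None, skip=None, order_by=None):
    """Assemble a REST query request body, omitting unset options."""
    body = {"fields": fields}
    if filter_doc is not None:
        body["filter"] = filter_doc
    if limit is not None:
        body["limit"] = limit
    if skip is not None:
        body["skip"] = skip
    if order_by is not None:
        body["orderBy"] = order_by
    return body

body = build_query_body(
    ["loc_esi_id", "measure_unit", "direction", "raw_reads"],
    filter_doc={"key": "loc_esi_id", "op": "=", "value": "4727354321000111"},
)
payload = json.dumps(body)  # this JSON string is what you would POST
```

Options that are not set are simply left out of the body, so the server applies its defaults (for example, the 100-row limit).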
The following REST example queries for the loc_esi_id, measure_unit, direction, and raw_reads columns of the ts_data table where loc_esi_id equals "4727354321000111". Because there is no timeseriesFilter in this query request body, this query will return the raw_reads TimeSeries elements exactly as they are stored in the database for the matching row.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "measure_unit", "direction" ,"raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "measure_unit": "KWH", "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-10T06:00:00Z" }, "value": { "$numberDecimal": "0.092" } }, { "tstamp": { "$date": "2010-11-10T06:15:00Z" }, "value": { "$numberDecimal": "0.084" } }, { "tstamp": { "$date": "2010-11-10T06:30:00Z" }, "value": { "$numberDecimal": "0.09" } }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 194 }
If you look at the raw_reads TimeSeries object in the response, you will notice that there are a few metadata properties about the TimeSeries object in addition to the TimeSeries data elements themselves. Every time a TimeSeries object is returned by the REST API, it will always include a type ("regular" or "irregular") property and an elements property holding the actual time series data. If it is a regular TimeSeries, meaning there is a fixed calendar interval for the time series data, there will also be origin and pattern properties describing the origin date and calendar interval for that time series data. If it is an irregular TimeSeries object, then no additional properties will be provided beyond the type and the data elements.
Example 2: Query for TimeSeries data with a TimeSeries filter
Use the timeseriesFilter property to filter or manipulate the TimeSeries data elements returned by your query. The following table describes the options you can specify in the timeseriesFilter document.
Property | Description |
---|---|
field | The name of the TimeSeries column that this filter applies to. This property is required if there is more than one TimeSeries column in the table. It is optional if there is only one TimeSeries column in the table. |
start | A start timestamp for the TimeSeries elements to return. If not set, the query will return elements starting at the earliest timestamp in the TimeSeries. |
end | An end timestamp for the TimeSeries elements to return. If not set, the query will return elements up to the latest timestamp in the TimeSeries. |
pipeline | A query pipeline for the TimeSeries table. The query pipeline allows you to aggregate or transform your TimeSeries data in a particular way. Query pipeline stages include aggregate, running_aggregate, clip, apply, and rollup. Examples of each of these are shown later on in this tutorial. Query pipeline stages can be chained together in a list, allowing you to aggregate your TimeSeries data in complex ways. |
transform | An option for transforming an array of TimeSeries elements into a single value or into a single element. If a pipeline is also specified, the transform is applied after the query pipeline. Transform options include count, which allows you to get a count of all elements or of all elements that match a condition, and first and last, which allow you to return only the first or last element of the given TimeSeries query. |
limit | The maximum number of TimeSeries elements included in each TimeSeries object returned by the query. The REST API server's default limit is 100 TimeSeries elements, but this can be configured in the REST configuration file. You can also override it on a per query basis by setting this limit within the timeseriesFilter property in your query request body. |
skip | The number of TimeSeries elements to skip. |
Here is an example using the field, start, and end options of the timeseriesFilter.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "start": { "$date": "2010-12-01T11:00:00Z" }, "end": { "$date": "2010-12-01T12:00:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-12-01T11:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-12-01T11:00:00Z" }, "value": { "$numberDecimal": "0.072" } }, { "tstamp": { "$date": "2010-12-01T11:15:00Z" }, "value": { "$numberDecimal": "0.163" } }, { "tstamp": { "$date": "2010-12-01T11:30:00Z" }, "value": { "$numberDecimal": "0.178" } }, { "tstamp": { "$date": "2010-12-01T11:45:00Z" }, "value": { "$numberDecimal": "0.192" } }, { "tstamp": { "$date": "2010-12-01T12:00:00Z" }, "value": { "$numberDecimal": "0.245" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 19 }
Example 3: Query for TimeSeries data with limit and skip
There are two types of limits and skips that you can set when querying TimeSeries data. The first is a limit and skip on the rows in the table that match the provided filter. This limit and skip works just like REST Relational Table Examples: Query a table using limit and skip. In the response, you will have a hasMore property that tells you if there are more rows remaining in your query that you can retrieve by reissuing the same query with an increased skip value.
The second limit and skip is the one for the TimeSeries elements. When you query a TimeSeries table, each row in the table has at least one column that contains a TimeSeries object. Each of those individual TimeSeries objects could contain hundreds, thousands, millions, or more individual TimeSeries elements. The limit property that you set in the timeseriesFilter sets a maximum on the number of elements you can receive in a single request and the skip property in the timeseriesFilter allows you to page through batches of TimeSeries elements across multiple requests.
To determine if there are more TimeSeries elements to retrieve, check the elementsTruncated field. If you scroll up, you will notice that in Example 2, the elementsTruncated field is false, which means that all TimeSeries elements that matched the query were included in the HTTP response. However, if you scroll even farther up to Example 1, you will notice that the elementsTruncated field is true. That query had no filters or start/end timestamp bounds on the TimeSeries data in the raw_reads column. Since there is a lot of data in that TimeSeries, the default limit of 100 elements was exceeded. As a result, the HTTP response included only the first 100 TimeSeries elements and had its elementsTruncated field set to true. To retrieve the next batch of results, just add the skip field in the timeseriesFilter and reissue the REST query as shown here.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "measure_unit", "direction" ,"raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "skip": 100, "limit": 100 } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "measure_unit": "KWH", "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-11T07:00:00Z" }, "value": { "$numberDecimal": "0.013" } }, { "tstamp": { "$date": "2010-11-11T07:15:00Z" }, "value": { "$numberDecimal": "0.012" } }, { "tstamp": { "$date": "2010-11-11T07:30:00Z" }, "value": { "$numberDecimal": "0.013" } }, { "tstamp": { "$date": "2010-11-11T07:45:00Z" }, "value": { "$numberDecimal": "0.012" } }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 368 }
Notice again in the response that elementsTruncated is still true. You can continue to iterate through batches of elements by increasing the skip each time until the last batch has been received as indicated by elementsTruncated=false.
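The paging loop described above can be sketched as follows. Here, fetch_page is a stand-in for issuing the HTTP POST and extracting the TimeSeries object from the JSON response; the fake in-memory "server" below exists only so the loop can be exercised end to end.

```python
# Sketch: paging through TimeSeries elements by increasing "skip"
# until the server reports elementsTruncated=false.
def collect_all_elements(fetch_page, limit=100):
    """fetch_page(body) is a stand-in for POSTing the request body and
    returning the TimeSeries object from the decoded response."""
    elements, skip = [], 0
    while True:
        ts = fetch_page({"timeseriesFilter": {"skip": skip, "limit": limit}})
        elements.extend(ts["elements"])
        if not ts["elementsTruncated"]:
            return elements
        skip += limit

# A tiny fake server holding 250 elements, to exercise the loop:
DATA = list(range(250))

def fake_fetch(body):
    f = body["timeseriesFilter"]
    batch = DATA[f["skip"]:f["skip"] + f["limit"]]
    return {"elements": batch,
            "elementsTruncated": f["skip"] + f["limit"] < len(DATA)}

all_elements = collect_all_elements(fake_fetch)  # three requests: 100 + 100 + 50
```

With 250 elements and the default limit of 100, the loop issues three requests, stopping when elementsTruncated comes back false.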
Example 4: Query for TimeSeries data with start and end timestamps
Oftentimes in TimeSeries queries, you will want to query within certain timestamps. The easiest way to limit your TimeSeries query to a specific time range is to add start and/or end timestamps to the query. These start and end timestamps are inclusive. You can provide one, or the other, or both timestamps. In the absence of a start timestamp, the query data will start at the earliest timestamp in the TimeSeries object. In the absence of an end timestamp, the query data will end at the latest timestamp in the TimeSeries object.
The following example shows a query of our ts_data table for all TimeSeries entries between 10:00am and 10:30am on November 15, 2010. Because the raw_reads data in the table is based on a 15-minute calendar interval, this query returns three TimeSeries elements between these timestamps, since the timestamps are inclusive.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T10:30:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-15T10:00:00Z" }, "value": { "$numberDecimal": "0.132" } }, { "tstamp": { "$date": "2010-11-15T10:15:00Z" }, "value": { "$numberDecimal": "0.172" } }, { "tstamp": { "$date": "2010-11-15T10:30:00Z" }, "value": { "$numberDecimal": "0.134" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 146 }
You can specify the start and end timestamps in any of the following formats:
- Using the MongoDB Extended JSON v2 format where the date is specified inside of a "$date" field either as { "start": { "$date": "2010-11-15T10:00:00Z" } } or { "start": { "$date": 1289815200000 } }. This is the format that the REST API will use when returning dates in HTTP responses.
- You can skip the "$date" and just specify the number of milliseconds from the Unix epoch: { "start": 1289815200000 }
- You can also specify the date as a string (but still without the "$date") using either one of these two formats: { "start": "2010-11-15 10:00:00.000" } or { "start": "2010-11-15T10:30:00.000Z" }.
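The equivalence between the ISO string form and the epoch-milliseconds form can be checked with a short computation (a sketch using Python's standard datetime module; 1289815200000 is the value used in the examples above):

```python
# Sketch: the epoch-milliseconds timestamp form is just the UTC date
# converted to milliseconds since the Unix epoch.
from datetime import datetime

def to_epoch_millis(iso_utc):
    """Convert an ISO-8601 UTC timestamp string to epoch milliseconds."""
    # fromisoformat does not accept a trailing "Z", so map it to +00:00
    dt = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    return int(dt.timestamp() * 1000)

millis = to_epoch_millis("2010-11-15T10:00:00Z")  # -> 1289815200000
```

Either representation can be sent in the start and end fields; the server treats them identically.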
Example 5: Query for TimeSeries data with start and end timestamps (pipeline version)
When querying TimeSeries data, you can provide a pipeline that will filter your TimeSeries data through one or more query pipeline operators. This and the next five examples will show various options for TimeSeries pipeline queries.
One TimeSeries query pipeline operator is the clip operator. This operator clips the TimeSeries data between a start and/or end timestamp. Therefore we can write the exact same query as Example 4 in the following way using a clip stage in the pipeline.
To specify a clip stage, you must include a start and/or end timestamp. You can also optionally include a flags property as defined for the Clip function in OneDB.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "clip": { "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T10:30:00Z" } } } ] } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-15T10:00:00Z" }, "value": { "$numberDecimal": "0.132" } }, { "tstamp": { "$date": "2010-11-15T10:15:00Z" }, "value": { "$numberDecimal": "0.172" } }, { "tstamp": { "$date": "2010-11-15T10:30:00Z" }, "value": { "$numberDecimal": "0.134" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 18 }
Example 6: Query for aggregated TimeSeries data
Another useful TimeSeries pipeline query operator is aggregate, which allows you to aggregate TimeSeries elements based on a calendar interval. For example, our raw_reads TimeSeries data is stored at 15-minute intervals, but you may want to query in a way that aggregates this data over each hour or over each day.
To specify an aggregate stage in the query pipeline, you must include a list of aggregation operations and a calendar that defines the interval to aggregate over. You can optionally also include start and/or end timestamps and a flags property as defined for the AggregateBy function.
The operations list defines the aggregation operations to perform on the TimeSeries data. It is a list of JSON documents that contain the following properties:
Name | Description |
---|---|
op | Required. The aggregation operation to perform. Must be one of the following values: |
key | Required. The field name to aggregate. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
index | Required only if the op is set to NTH. This specifies the index for the NTH operator. For example, if index is 5, the fifth element in each calendar interval is returned. |
outKey | Optional. The field name to assign to the result of the aggregation operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the aggregation operation. If not specified, the result will be a float. |
The other required field for aggregate operations is the calendar field. This specifies the calendar interval to aggregate over. For example, if you use a calendar with a one hour interval, all of the aggregate operations will be computed over hourly intervals.
The calendar can be specified as the string name of an existing TimeSeries calendar. Alternatively, the calendar field can be specified as a calendar pattern document. A pattern document must have a time unit field (second, minute, hour, day, week, month, or year). Then it can either have an integer frequency or a list of interval documents. When specified as interval documents, each document must have an integer duration and a type (on or off). For example, a 6 hour calendar can be specified either with a frequency as {"frequency": 6, "unit": "hour"} or with a list of interval documents as { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 5, "type": "off" } ], "unit": "hour" }.
The following example computes the hourly average and maximum of the value field in the raw_reads TimeSeries.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "aggregate": { "operations": [ { "op": "AVG", "key": "value" }, { "op": "MAX", "key": "value" } ], "calendar": "ts_1hour" } } ] } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "hour" }, "elements": [ { "timestamp": { "$date": "2010-11-10T06:00:00Z" }, "avg_value": 0.08775, "max_value": 0.092 }, { "timestamp": { "$date": "2010-11-10T07:00:00Z" }, "avg_value": 0.088, "max_value": 0.091 }, { "timestamp": { "$date": "2010-11-10T08:00:00Z" }, "avg_value": 0.08825, "max_value": 0.098 }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 120 }
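To make the aggregation arithmetic concrete, here is the same hourly AVG/MAX computed locally over four 15-minute readings. The first three values are the readings shown in Example 1 for the 06:00 hour; the fourth reading is not visible in Example 1's truncated output, so 0.085 is a hypothetical value chosen to reproduce the hour's avg_value of 0.08775.

```python
# Sketch: what an hourly AVG/MAX aggregate computes over 15-minute
# readings. Four readings fall in each one-hour calendar interval.
readings = [0.092, 0.084, 0.090, 0.085]  # fourth value is hypothetical

avg_value = sum(readings) / len(readings)  # ~0.08775
max_value = max(readings)                  # 0.092
```

Each element in the aggregated response represents one calendar interval, with one output field per operation in the operations list.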
The next example computes the daily maximum, minimum, first, and fifth elements of the value field in the raw_reads TimeSeries. In this example, though, we will specify the calendar using a pattern document, include start and end timestamps for our query, and specify outKeys to control the names of the fields in the query results.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "aggregate": { "operations": [ { "op": "MAX", "key": "value", "outKey": "max" }, { "op": "MIN", "key": "value", "outKey": "min" }, { "op": "FIRST", "key": "value", "outKey": "first" }, { "op": "NTH", "key": "value", "index": 5, "outKey": "fifth" } ], "calendar": { "frequency": 1, "unit": "day"} } } ], "start": { "$date": "2010-11-20T00:00:00Z" }, "end": { "$date": "2010-11-25T23:59:59Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-20T00:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "day" }, "elements": [ { "timestamp": { "$date": "2010-11-20T00:00:00Z" }, "max": 4.504, "min": 0.09, "first": 0.559, "fifth": 0.356 }, { "timestamp": { "$date": "2010-11-21T00:00:00Z" }, "max": 1.525, "min": 0.047, "first": 0.065, "fifth": 0.074 }, { "timestamp": { "$date": "2010-11-22T00:00:00Z" }, "max": 1.866, "min": 0.077, "first": 0.313, "fifth": 1.735 }, { "timestamp": { "$date": "2010-11-23T00:00:00Z" }, "max": 2.546, "min": 0.017, "first": 0.169, "fifth": 0.097 }, { "timestamp": { "$date": "2010-11-24T00:00:00Z" }, "max": 1.956, "min": 0.03, "first": 0.094, "fifth": 0.515 }, { "timestamp": { "$date": "2010-11-25T00:00:00Z" }, "max": 1.376, "min": 0.013, "first": 0.025, "fifth": 0.037 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 219 }
Example 7: Query for running aggregates of TimeSeries data
The next TimeSeries pipeline query operator to demonstrate is running_aggregate. A running_aggregate pipeline stage allows you to compute running sums, averages, medians, correlations, and variance over n number of sequential elements in your TimeSeries.
To specify a running_aggregate stage in the query pipeline, you must include a list of aggregation operations. You can optionally also include start and/or end timestamps.
The operations list defines the running aggregation operations to perform on the TimeSeries data. It is a list of JSON documents that contain the following properties:
Name | Description |
---|---|
op | Required. The aggregation operation to perform. Must be one of the following values: |
key | Required. The field name to aggregate. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
key2 | Applicable only to CORRELATION operations. For running correlation computations, key2 is the field to correlate with the key field. |
n | Required. The number of TimeSeries elements to use to compute the running aggregation. |
outKey | Optional. The field name to assign to the result of the running aggregation operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the running aggregation operation. If not specified, the result will be a float. |
The following example computes a running average and running variance of the value field in the raw_reads TimeSeries stored in the ts_data table.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "running_aggregate": { "operations": [ { "key": "value", "op": "AVG", "n": 12 }, { "key": "value", "op": "VARIANCE", "n": 12 } ] } } ], "start": { "$date": "2010-11-15T10:00:00.000Z" }, "end": { "$date": "2010-11-15T16:00:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T16:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-15T10:00:00Z" }, "avg_value": 0.13199999928474426, "variance_value": null }, { "timestamp": { "$date": "2010-11-15T10:15:00Z" }, "avg_value": 0.15200000256299973, "variance_value": 8.000002622604585E-4 }, { "timestamp": { "$date": "2010-11-15T10:30:00Z" }, "avg_value": 0.14600000282128653, "variance_value": 5.080001218319049E-4 }, { "timestamp": { "$date": "2010-11-15T10:45:00Z" }, "avg_value": 0.14250000193715096, "variance_value": 3.876667726437355E-4 }, ... ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 256 }
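The running-average semantics can be illustrated locally. Each output element averages up to n of the most recent input elements, and at the start of the series the window is still growing. The sketch below uses the first three raw readings from Example 4 (0.132, 0.172, 0.134); with n=12 it reproduces the first three avg_value results in the response above.

```python
# Sketch: a running average over up to n of the most recent elements,
# mirroring what a running_aggregate AVG operation produces.
def running_avg(values, n):
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1):i + 1]  # grows until it holds n
        out.append(sum(window) / len(window))
    return out

# First three raw readings from Example 4; with n=12 the windows are
# still growing, so each output averages all elements seen so far.
averages = running_avg([0.132, 0.172, 0.134], n=12)
```

This also explains the null variance_value on the first element of the response: a variance over a single element is undefined.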
Example 8: Query for TimeSeries data, applying an arithmetic expression to each element
Beyond aggregations and running aggregations on TimeSeries data, the REST API also supports applying a custom arithmetic expression to each element in the TimeSeries. This is based on the OneDB Apply function which applies a user-specified SQL expression or function to the matching TimeSeries elements.
To specify an apply stage in the query pipeline, you must include an expression field that defines a comma-separated list of SQL expressions or functions to apply to the TimeSeries. Refer to the Apply function topic for more information on the syntax of this expression.
In addition to the expression field, you must also specify a cast for the result of your apply expression. Because the result expression can include any number of different output values, you must provide the output row type for your expression when applied to a TimeSeries. For example, set the cast property to timeseries(row(timestamp datetime year to fraction(5), result1 int, result2 float)) if your expression results in two fields, one being an integer and the other a float.
Like all of the other TimeSeries pipeline stages we have examined so far, the apply stage also supports optional start and/or end timestamps.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "apply": { "expression": "$value * $value, 1 / $value", "cast": "timeseries(row(timestamp datetime year to fraction(5), squared float, inverse float))" } } ], "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T11:00:00.000Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-15T10:00:00Z" }, "squared": 0.017424, "inverse": 7.575757575757576 }, { "timestamp": { "$date": "2010-11-15T10:15:00Z" }, "squared": 0.029584, "inverse": 5.813953488372093 }, { "timestamp": { "$date": "2010-11-15T10:30:00Z" }, "squared": 0.017956, "inverse": 7.462686567164179 }, { "timestamp": { "$date": "2010-11-15T10:45:00Z" }, "squared": 0.017424, "inverse": 7.575757575757576 }, { "timestamp": { "$date": "2010-11-15T11:00:00Z" }, "squared": 0.023716, "inverse": 6.4935064935064934 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 54 }
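Evaluating the same expression locally shows what each output element contains. The 10:00 reading of 0.132 produces the first element of the response above:

```python
# Sketch: the apply expression "$value * $value, 1 / $value" computed
# locally for one reading, producing the squared and inverse fields
# named by the cast in the request body.
value = 0.132  # the 10:00 reading from the response above
squared = value * value
inverse = 1 / value
```

The number of expressions in the comma-separated list must match the number of non-timestamp fields declared in the cast row type.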
Example 9: Query for TimeSeries data to rollup elements across multiple TimeSeries
The final TimeSeries pipeline stage is rollup. Unlike all of the other operators covered so far (clip, aggregate, running_aggregate, and apply) which operate on one TimeSeries object (i.e. one row) at a time, the rollup operator aggregates across multiple TimeSeries objects (i.e. multiple rows). Let us say that our ts_data table is holding meter data and has a row in the table for each meter we have deployed out in the field. Running an aggregate operator on this data aggregates values for each meter individually. By contrast the rollup operator allows us to aggregate data across multiple meters, for example, computing an average usage across an entire zipcode.
A rollup stage in the query pipeline supports only two fields and both are required: a list of operations and a list of fields to groupBy.
The list of operations takes a similar form to what you have already seen with aggregate and running_aggregate. Each JSON document in the list of operations can include the following properties:
Name | Description |
---|---|
op | Required. The rollup operation to perform. Must be one of the following values: |
key | Required. The field name to rollup. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
outKey | Optional. The field name to assign to the result of the rollup operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the rollup operation. If not specified, the result will be a float. |
The groupBy field in the rollup stage is also required. It is the list of fields to group by when aggregating across multiple TimeSeries objects. To aggregate over all TimeSeries objects in a table, provide an empty list as the groupBy field.
The following REST TimeSeries query example shows a rollup of all TimeSeries objects in the ts_data table whose direction column is "P". The rollup computes the average, sum, minimum, and maximum values across all of the matching TimeSeries objects in the table. You will also notice in this example that we are specifying an outType for the COUNT operation in order to receive the count as an integer instead of a float.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["direction", "raw_reads"],
  "filter": { "key": "direction", "op": "=", "value": "P" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "pipeline": [
      {
        "rollup": {
          "operations": [
            { "op": "AVG", "key": "value" },
            { "op": "SUM", "key": "value" },
            { "op": "MAX", "key": "value" },
            { "op": "MIN", "key": "value" },
            { "op": "COUNT", "key": "value", "outKey": "count", "outType": "int" }
          ],
          "groupBy": ["direction"]
        }
      }
    ]
  }
}
- Response
-
{ "results": [ { "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-10T06:00:00Z" }, "avg_value": 0.3727857142857143, "sum_value": 10.438, "max_value": 6.276, "min_value": 0.0, "count": 28 }, { "timestamp": { "$date": "2010-11-10T06:15:00Z" }, "avg_value": 0.38957142857142857, "sum_value": 10.908, "max_value": 5.976, "min_value": 0.0, "count": 28 }, { "timestamp": { "$date": "2010-11-10T06:30:00Z" }, "avg_value": 0.32571428571428573, "sum_value": 9.12, "max_value": 5.34, "min_value": 0.0, "count": 28 }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 1583 }
You may have noticed that there are no start and end timestamps supported as part of the rollup stage. If you want to limit a rollup to a particular time range, add a clip stage before the rollup in the query pipeline. That makes this a perfect time to move on to the next example, which demonstrates chaining multiple pipeline stages into a single query.
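For instance, a timeseriesFilter that restricts a rollup to a time window by placing a clip stage first could be assembled as in this Python sketch. The timestamps and groupBy column are illustrative, and the shape of the clip stage (start and end timestamps) is assumed to match the clip examples earlier in this topic:

```python
import json

# Hypothetical pipeline: clip to a December window, then roll up
# the clipped data. Timestamps and columns are sample values.
timeseries_filter = {
    "field": "raw_reads",
    "pipeline": [
        {"clip": {
            "start": {"$date": "2010-12-01T00:00:00Z"},
            "end": {"$date": "2010-12-31T00:00:00Z"},
        }},
        {"rollup": {
            "operations": [{"op": "AVG", "key": "value"}],
            "groupBy": ["direction"],
        }},
    ],
}

# Stages run in list order, so the clip is applied before the rollup.
print(json.dumps(timeseries_filter, indent=2))
```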
Example 10: Query for TimeSeries data by linking multiple pipeline operators to create complex query conditions
The clip, aggregate, running_aggregate, apply, and rollup stages of the TimeSeries query pipeline already provide you a lot of different ways to query your TimeSeries data.
But what makes the TimeSeries query pipeline even more powerful is that you can combine as many of these stages as you like to build even more complex query conditions.
In the above examples, you may have noticed that the pipeline field is a list. Let us now try a TimeSeries query where we put more than one stage in that list.
Suppose we want to compute the maximum daily average usage of all meters whose direction property is "P" for each day in the month of December. We create a query pipeline with two stages. The first is an aggregate stage to compute the average usage per day for each meter during the month of December. The second rollup stage then rolls up all of these daily averages to find the maximum. The top-level filter of { "key": "direction", "op": "=", "value": "P" } ensures we are only doing this for meters where the direction is "P".
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["direction", "raw_reads"],
  "filter": { "key": "direction", "op": "=", "value": "P" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "pipeline": [
      {
        "aggregate": {
          "operations": [
            { "op": "AVG", "key": "value", "outKey": "avg" }
          ],
          "calendar": { "frequency": 1, "unit": "day" }
        }
      },
      {
        "rollup": {
          "operations": [
            { "op": "MAX", "key": "avg", "outKey": "max_daily_avg" }
          ],
          "groupBy": ["direction"]
        }
      }
    ]
  }
}
- Response
-
{ "results": [ { "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-12-01T00:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "day" }, "elements": [ { "timestamp": { "$date": "2010-12-01T00:00:00Z" }, "max_daily_avg": 0.74740625 }, { "timestamp": { "$date": "2010-12-02T00:00:00Z" }, "max_daily_avg": 2.5208541666666666 }, { "timestamp": { "$date": "2010-12-03T00:00:00Z" }, "max_daily_avg": 3.034802083333333 }, ... { "timestamp": { "$date": "2010-12-31T00:00:00Z" }, "max_daily_avg": 1.56 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 59 }
Example 11: Query for the first or last TimeSeries element
Besides the query pipeline, the timeseriesFilter also supports a transform object that can define a transformation for the TimeSeries object returned by the query. Whenever a transform is used, your result will no longer be a TimeSeries object, but some other data type.
The first transform option that we will look at is the ability to return only a single element of the TimeSeries. There are two transform options to accomplish this: first and last.
If you want to retrieve the first TimeSeries element starting at a given timestamp, you can run the following REST request. The request body includes "transform": { "op": "first" } in the timeseriesFilter.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "first" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "first": { "tstamp": { "$date": "2010-12-31T00:00:00Z" }, "value": 0.092 } } ], "hasMore": false, "responseTime": 31 }
Notice in the query result that we did not get a TimeSeries object for raw_reads, but instead just got a single data element from that TimeSeries object.
You can query for the last element in the same way. For the last transform operator, though, the REST API supports an optional allowNulls property. By default, allowNulls is true, so if the last element within the TimeSeries window you are querying is null, you will receive a null in the response. If you want to receive the last non-null element, set allowNulls to false.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "transform": { "op": "last", "allowNulls": false }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "last": { "tstamp": { "$date": "2011-02-08T05:45:00Z" }, "value": 1.412 } } ], "hasMore": false, "responseTime": 3 }
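The allowNulls behavior can be mirrored in a few lines of Python over an in-memory list of (timestamp, value) tuples. This is a sketch of the semantics described above, not of the server's implementation; None stands in for a null element:

```python
# Sketch of the last/allowNulls semantics over sample data.
def last_element(elements, allow_nulls=True):
    if allow_nulls:
        return elements[-1] if elements else None
    # allowNulls=false: scan backward for the last non-null value
    for elem in reversed(elements):
        if elem[1] is not None:
            return elem
    return None

readings = [("06:00", 0.5), ("06:15", 1.2), ("06:30", None)]
print(last_element(readings))                     # ('06:30', None)
print(last_element(readings, allow_nulls=False))  # ('06:15', 1.2)
```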
Example 12: Query for the count of TimeSeries elements
In addition to extracting the first or last element out of a TimeSeries object, the REST API also supports counting the number of elements in a matching TimeSeries object through the transform option.
When using count as the transform operator, you can choose to count all elements, or you can get a count of only the elements that match a specific condition.
To count all elements within a specific TimeSeries object, run a REST query request with "transform": { "op": "count" } as shown here.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-01T00:00:00Z" },
    "end": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "count" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "count": 2881 } ], "hasMore": false, "responseTime": 5 }
To count only the elements that meet a certain condition, add an expression to the transform document. For example, "transform": { "op": "count", "expression": "value > 1" } returns a count of all TimeSeries elements whose value is greater than one. Refer to the documentation on the OneDB CountIf function for more examples of expressions that can be passed to this operation.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-01T00:00:00Z" },
    "end": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "count", "expression": "value > 1" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "count": 201 } ], "hasMore": false, "responseTime": 5 }
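Locally, the difference between a plain count and a count with an expression such as value > 1 amounts to counting with a predicate, as this small Python sketch illustrates (the sample values are made up):

```python
# Sketch of count vs. conditional count over sample element values.
values = [0.5, 1.4, 2.2, 0.9, 1.01]

count_all = len(values)                       # "op": "count"
count_gt_one = sum(1 for v in values if v > 1)  # expression "value > 1"

print(count_all)     # 5
print(count_gt_one)  # 3
```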