REST API Examples: TimeSeries Queries
This topic provides a tutorial on querying TimeSeries data through REST.
The REST API provides a rich set of options for querying TimeSeries data in HCL OneDB™. You can retrieve the TimeSeries elements from the database exactly as they are stored, you can query for data between start and end timestamps, you can aggregate TimeSeries data both across different elements and across different TimeSeries objects, and you can run custom arithmetic expressions against your TimeSeries elements.
The REST API server supports both BSON TimeSeries row types and custom TimeSeries row types. A BSON TimeSeries row type includes a timestamp and a BSON field. The BSON field provides a flexible schema to hold any fields you need for your TimeSeries data, both now and in the future. BSON TimeSeries tables can be created automatically through the REST API. You can also create a custom TimeSeries row type through SQL that defines the particular fields and data types that make up your TimeSeries elements. Custom row types do not have the same flexibility as BSON row types, but can be customized to your exact data structure. Whichever route you choose for structuring your TimeSeries data, the REST API will detect your TimeSeries schema and automatically write the SQL for your REST queries accordingly.
The examples in this topic focus exclusively on querying TimeSeries data. For examples on creating, inserting, loading, and updating TimeSeries data, or on working with TimeSeries calendars, see REST API Examples: TimeSeries Tables and Calendars.
This topic will demonstrate various queries against the ts_data table in the stores_demo database. If you would like to run these same queries against your REST API server, refer to the Demonstration databases topic for information on how to create the stores_demo database.
You can query a TimeSeries table through either the
/api/servers/{alias}/databases/{dbname}/tables/{tabname}/query
path or the
/api/servers/{alias}/databases/{dbname}/timeseries/tables/{tabname}/query
path. Both APIs allow you to query your TimeSeries tables and both support the same options. The examples below use the timeseries/tables/{tabname}/query path, but the exact same examples could be run against the tables/{tabname}/query path too.
- Example 1: Query for TimeSeries data
- Example 2: Query for TimeSeries data with a TimeSeries filter
- Example 3: Query for TimeSeries data with limit and skip
- Example 4: Query for TimeSeries data with start and end timestamps
- Example 5: Query for TimeSeries data with start and end timestamps (pipeline version)
- Example 6: Query for aggregated TimeSeries data
- Example 7: Query for running aggregates of TimeSeries data
- Example 8: Query for TimeSeries data, applying an arithmetic expression to each element
- Example 9: Query for TimeSeries data to rollup elements across multiple TimeSeries
- Example 10: Query for TimeSeries data by linking multiple pipeline operators to create complex query conditions
- Example 11: Query for the first or last TimeSeries element
- Example 12: Query for the count of TimeSeries elements
Example 1: Query for TimeSeries data
You query a TimeSeries table the same way you query any other relational table. POST a JSON document to the query url for the table where the contents of the request body defines the parameters of your query. The supported query options are shown in the following table:
Key | Description |
---|---|
fields | A list of column names to include in the query results, for example ["loc_esi_id", "measure_unit", "direction", "raw_reads"]. |
filter | A query filter document that defines either a comparison query filter or a logical query filter. For more information, see REST Relational Table Example: Querying a table with a query filter. Note: This filter field affects the rows returned by the query. The timeseriesFilter field below affects the TimeSeries elements within those rows that are returned by the query. |
timeseriesFilter | A filter that is specific to your TimeSeries column. This can include start and/or end timestamps, a query pipeline, a transform function, and limit and skip properties specific to iterating through your TimeSeries elements. See the table below for more information on the timeseriesFilter. Note: This timeseriesFilter field affects the TimeSeries elements returned by the query. By contrast, the filter field above affects which rows are returned by the query. |
limit | The maximum number of rows to return from a SQL query. Keep in mind that this is the limit on the rows in the table returned by the query, which is different from the number of TimeSeries elements included in each result row. The REST API server's default limit is 100 rows, but this can be configured in the REST configuration file. You can also override it on a per query basis by setting this limit property in your query request body. |
skip | The number of rows to skip. Keep in mind that this is the skip value on the rows in the table returned by the query, which is different from the skip value applied to the TimeSeries elements included in each result row. |
orderBy | A list of documents defining the columns to orderBy. Each document in the list must include key and direction properties to indicate which column name and which direction ("asc" for ascending or "desc" for descending) to order the results by. See REST Relational Table Examples: Query a table using projection and order by fields for more information. |
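As a sketch of how a request body built from these options might be assembled programmatically (the helper function name here is hypothetical; only the JSON keys come from the table above):

```python
# Sketch: assembling a query request body as a Python dict and
# serializing it to JSON. The keys mirror the options table above.
# Posting the payload (for example with an HTTP client library) is
# omitted so this stays self-contained.
import json

def build_query_body(fields, filter_doc=None, limit=None, skip=None, order_by=None):
    """Assemble a REST query request body, omitting unset options."""
    body = {"fields": fields}
    if filter_doc is not None:
        body["filter"] = filter_doc
    if limit is not None:
        body["limit"] = limit
    if skip is not None:
        body["skip"] = skip
    if order_by is not None:
        body["orderBy"] = order_by
    return body

body = build_query_body(
    ["loc_esi_id", "measure_unit", "direction", "raw_reads"],
    filter_doc={"key": "loc_esi_id", "op": "=", "value": "4727354321000111"},
)
payload = json.dumps(body)  # this JSON string is what you would POST
```

Options that are not set are simply left out of the body, so the server applies its defaults (for example, the 100-row limit).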
The following REST example queries for the loc_esi_id, measure_unit, direction, and raw_reads columns of the ts_data table where loc_esi_id equals "4727354321000111". Because there is no timeseriesFilter in this query request body, this query will return the raw_reads TimeSeries elements exactly as they are stored in the database for the matching row.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "measure_unit", "direction" ,"raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "measure_unit": "KWH", "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-10T06:00:00Z" }, "value": { "$numberDecimal": "0.092" } }, { "tstamp": { "$date": "2010-11-10T06:15:00Z" }, "value": { "$numberDecimal": "0.084" } }, { "tstamp": { "$date": "2010-11-10T06:30:00Z" }, "value": { "$numberDecimal": "0.09" } }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 194 }
If you look at the raw_reads TimeSeries object in the response, you will notice that there are a few metadata properties about the TimeSeries object in addition to the TimeSeries data elements themselves. Every time a TimeSeries object is returned by the REST API, it will always include a type ("regular" or "irregular") property and an elements property holding the actual time series data. If it is a regular TimeSeries, meaning there is a fixed calendar interval for the time series data, there will also be origin and pattern properties describing the origin date and calendar interval for that time series data. If it is an irregular TimeSeries object, then no additional properties will be provided beyond the type and the data elements.
Example 2: Query for TimeSeries data with a TimeSeries filter
Use the timeseriesFilter property to filter or manipulate the TimeSeries data elements returned by your query. The following table describes the options you can specify in the timeseriesFilter document.
Property | Description |
---|---|
field | The name of the TimeSeries column that this filter applies to. This property is required if there is more than one TimeSeries column in the table. It is optional if there is only one TimeSeries column in the table. |
start | A start timestamp for the TimeSeries elements to return. If not set, the query will return elements starting at the earliest timestamp in the TimeSeries. |
end | An end timestamp for the TimeSeries elements to return. If not set, the query will return elements up to the latest timestamp in the TimeSeries. |
pipeline | A query pipeline for the TimeSeries table. The query pipeline allows you to aggregate or transform your TimeSeries data in a particular way. Query pipeline stages include aggregate, running_aggregate, clip, apply, and rollup. Examples of each of these are shown later on in this tutorial. Query pipeline stages can be chained together in a list, allowing you to aggregate your TimeSeries data in complex ways. |
transform | An option for transforming an array of TimeSeries elements into a single value or into a single element. If a pipeline is also specified, the transform is applied after the query pipeline. Transform options include count, which allows you to get a count of all elements or of all elements that match a condition, and first and last, which allow you to return only the first or last element of the given TimeSeries query. |
limit | The maximum number of TimeSeries elements included in each TimeSeries object returned by the query. The REST API server's default limit is 100 TimeSeries elements, but this can be configured in the REST configuration file. You can also override it on a per query basis by setting this limit within the timeseriesFilter property in your query request body. |
skip | The number of TimeSeries elements to skip. |
Here is an example using the field, start, and end options of the timeseriesFilter.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "start": { "$date": "2010-12-01T11:00:00Z" }, "end": { "$date": "2010-12-01T12:00:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-12-01T11:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-12-01T11:00:00Z" }, "value": { "$numberDecimal": "0.072" } }, { "tstamp": { "$date": "2010-12-01T11:15:00Z" }, "value": { "$numberDecimal": "0.163" } }, { "tstamp": { "$date": "2010-12-01T11:30:00Z" }, "value": { "$numberDecimal": "0.178" } }, { "tstamp": { "$date": "2010-12-01T11:45:00Z" }, "value": { "$numberDecimal": "0.192" } }, { "tstamp": { "$date": "2010-12-01T12:00:00Z" }, "value": { "$numberDecimal": "0.245" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 19 }
Example 3: Query for TimeSeries data with limit and skip
There are two types of limits and skips that you can set when querying TimeSeries data. The first is a limit and skip on the rows in the table that match the provided filter. This limit and skip works just like REST Relational Table Examples: Query a table using limit and skip. In the response, you will have a hasMore property that tells you if there are more rows remaining in your query that you can retrieve by reissuing the same query with an increased skip value.
The second limit and skip is the one for the TimeSeries elements. When you query a TimeSeries table, each row in the table has at least one column that contains a TimeSeries object. Each of those individual TimeSeries objects could contain hundreds, thousands, millions, or more individual TimeSeries elements. The limit property that you set in the timeseriesFilter sets a maximum on the number of elements you can receive in a single request and the skip property in the timeseriesFilter allows you to page through batches of TimeSeries elements across multiple requests.
To determine if there are more TimeSeries elements to retrieve, check the elementsTruncated field. If you scroll up, you will notice that in Example 2, the elementsTruncated field is false, which means that all TimeSeries elements that matched the query were included in the HTTP response. However, if you scroll even farther up to Example 1, you will notice that the elementsTruncated field is true. That query had no filters or start/end timestamp bounds on the TimeSeries data in the raw_reads column. Since there is a lot of data in that TimeSeries, the default limit of 100 elements was exceeded. As a result, the HTTP response included only the first 100 TimeSeries elements and had its elementsTruncated field set to true. To retrieve the next batch of results, just add the skip field in the timeseriesFilter and reissue the REST query as shown here.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "measure_unit", "direction" ,"raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "skip": 100, "limit": 100 } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "measure_unit": "KWH", "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-11T07:00:00Z" }, "value": { "$numberDecimal": "0.013" } }, { "tstamp": { "$date": "2010-11-11T07:15:00Z" }, "value": { "$numberDecimal": "0.012" } }, { "tstamp": { "$date": "2010-11-11T07:30:00Z" }, "value": { "$numberDecimal": "0.013" } }, { "tstamp": { "$date": "2010-11-11T07:45:00Z" }, "value": { "$numberDecimal": "0.012" } }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 368 }
Notice again in the response that elementsTruncated is still true. You can continue to iterate through batches of elements by increasing the skip each time until the last batch has been received as indicated by elementsTruncated=false.
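The paging loop described above can be sketched as follows. Here, fetch_page is a stand-in for issuing the HTTP POST and extracting the TimeSeries object from the JSON response; the fake in-memory "server" below exists only so the loop can be exercised end to end.

```python
# Sketch: paging through TimeSeries elements by increasing "skip"
# until the server reports elementsTruncated=false.
def collect_all_elements(fetch_page, limit=100):
    """fetch_page(body) is a stand-in for POSTing the request body and
    returning the TimeSeries object from the decoded response."""
    elements, skip = [], 0
    while True:
        ts = fetch_page({"timeseriesFilter": {"skip": skip, "limit": limit}})
        elements.extend(ts["elements"])
        if not ts["elementsTruncated"]:
            return elements
        skip += limit

# A tiny fake server holding 250 elements, to exercise the loop:
DATA = list(range(250))

def fake_fetch(body):
    f = body["timeseriesFilter"]
    batch = DATA[f["skip"]:f["skip"] + f["limit"]]
    return {"elements": batch,
            "elementsTruncated": f["skip"] + f["limit"] < len(DATA)}

all_elements = collect_all_elements(fake_fetch)  # three requests: 100 + 100 + 50
```

With 250 elements and the default limit of 100, the loop issues three requests, stopping when elementsTruncated comes back false.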
Example 4: Query for TimeSeries data with start and end timestamps
Oftentimes in TimeSeries queries, you will want to query within certain timestamps. The easiest way to limit your TimeSeries query to a specific time range is to add start and/or end timestamps to the query. These start and end timestamps are inclusive. You can provide one, or the other, or both timestamps. In the absence of a start timestamp, the query data will start at the earliest timestamp in the TimeSeries object. In the absence of an end timestamp, the query data will end at the latest timestamp in the TimeSeries object.
The following example shows a query of our ts_data table for all TimeSeries entries between 10:00am and 10:30am on November 15, 2010. Because the raw_reads data in the table is based on a 15-minute calendar interval, this query returns three TimeSeries elements between these timestamps, since the timestamps are inclusive.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T10:30:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-15T10:00:00Z" }, "value": { "$numberDecimal": "0.132" } }, { "tstamp": { "$date": "2010-11-15T10:15:00Z" }, "value": { "$numberDecimal": "0.172" } }, { "tstamp": { "$date": "2010-11-15T10:30:00Z" }, "value": { "$numberDecimal": "0.134" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 146 }
You can specify the start and end timestamps in any of the following formats:
- Using the MongoDB Extended JSON v2 format where the date is specified inside of a "$date" field either as { "start": { "$date": "2010-11-15T10:00:00Z" } } or { "start": { "$date": 1289815200000 } }. This is the format that the REST API will use when returning dates in HTTP responses.
- You can skip the "$date" and just specify the number of milliseconds from the Unix epoch: { "start": 1289815200000 }
- You can also specify the date as a string (but still without the "$date") using either one of these two formats: { "start": "2010-11-15 10:00:00.000" } or { "start": "2010-11-15T10:30:00.000Z" }.
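The equivalence between the ISO string form and the epoch-milliseconds form can be checked with a short computation (a sketch using Python's standard datetime module; 1289815200000 is the value used in the examples above):

```python
# Sketch: the epoch-milliseconds timestamp form is just the UTC date
# converted to milliseconds since the Unix epoch.
from datetime import datetime

def to_epoch_millis(iso_utc):
    """Convert an ISO-8601 UTC timestamp string to epoch milliseconds."""
    # fromisoformat does not accept a trailing "Z", so map it to +00:00
    dt = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    return int(dt.timestamp() * 1000)

millis = to_epoch_millis("2010-11-15T10:00:00Z")  # -> 1289815200000
```

Either representation can be sent in the start and end fields; the server treats them identically.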
Example 5: Query for TimeSeries data with start and end timestamps (pipeline version)
When querying TimeSeries data, you can provide a pipeline that will filter your TimeSeries data through one or more query pipeline operators. This and the next five examples will show various options for TimeSeries pipeline queries.
One TimeSeries query pipeline operator is the clip operator. This operator clips the TimeSeries data between a start and/or end timestamp. Therefore we can write the exact same query as Example 4 in the following way using a clip stage in the pipeline.
To specify a clip stage, you must include a start and/or end timestamp. You can also optionally include a flags property as defined for the Clip function in OneDB.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "clip": { "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T10:30:00Z" } } } ] } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "tstamp": { "$date": "2010-11-15T10:00:00Z" }, "value": { "$numberDecimal": "0.132" } }, { "tstamp": { "$date": "2010-11-15T10:15:00Z" }, "value": { "$numberDecimal": "0.172" } }, { "tstamp": { "$date": "2010-11-15T10:30:00Z" }, "value": { "$numberDecimal": "0.134" } } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 18 }
Example 6: Query for aggregated TimeSeries data
Another useful TimeSeries pipeline query operator is aggregate, which allows you to aggregate TimeSeries elements based on a calendar interval. For example, our raw_reads TimeSeries data is stored at 15-minute intervals, but you may want to query in a way that aggregates this data over each hour or over each day.
To specify an aggregate stage in the query pipeline, you must include a list of aggregation operations and a calendar that defines the interval to aggregate over. You can optionally also include start and/or end timestamps and a flags property as defined for the AggregateBy function.
The operations list defines the aggregation operations to perform on the TimeSeries data. It is a list of JSON documents that contain the following properties:
Name | Description |
---|---|
op | Required. The aggregation operation to perform. Must be one of the following values: |
key | Required. The field name to aggregate. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
index | Required only if the op is set to NTH. This specifies the index for the NTH operator. For example, if index is 5, the fifth element in each calendar interval is returned. |
outKey | Optional. The field name to assign to the result of the aggregation operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the aggregation operation. If not specified, the result will be a float. |
The other required field for aggregate operations is the calendar field. This specifies the calendar interval to aggregate over. For example, if you use a calendar with a one hour interval, all of the aggregate operations will be computed over hourly intervals.
The calendar can be specified as the string name of an existing TimeSeries calendar. Alternatively, the calendar field can be specified as a calendar pattern document. A pattern document must have a time unit field (second, minute, hour, day, week, month, or year). Then it can either have an integer frequency or a list of interval documents. When specified as interval documents, each document must have an integer duration and a type (on or off). For example, a 6 hour calendar can be specified either with a frequency as {"frequency": 6, "unit": "hour"} or with a list of interval documents as { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 5, "type": "off" } ], "unit": "hour" }.
The following example computes the hourly average and maximum of the value field in the raw_reads TimeSeries.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "aggregate": { "operations": [ { "op": "AVG", "key": "value" }, { "op": "MAX", "key": "value" } ], "calendar": "ts_1hour" } } ] } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "hour" }, "elements": [ { "timestamp": { "$date": "2010-11-10T06:00:00Z" }, "avg_value": 0.08775, "max_value": 0.092 }, { "timestamp": { "$date": "2010-11-10T07:00:00Z" }, "avg_value": 0.088, "max_value": 0.091 }, { "timestamp": { "$date": "2010-11-10T08:00:00Z" }, "avg_value": 0.08825, "max_value": 0.098 }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 120 }
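To make the aggregation arithmetic concrete, here is the same hourly AVG/MAX computed locally over four 15-minute readings. The first three values are the readings shown in Example 1 for the 06:00 hour; the fourth reading is not visible in Example 1's truncated output, so 0.085 is a hypothetical value chosen to reproduce the hour's avg_value of 0.08775.

```python
# Sketch: what an hourly AVG/MAX aggregate computes over 15-minute
# readings. Four readings fall in each one-hour calendar interval.
readings = [0.092, 0.084, 0.090, 0.085]  # fourth value is hypothetical

avg_value = sum(readings) / len(readings)  # ~0.08775
max_value = max(readings)                  # 0.092
```

Each element in the aggregated response represents one calendar interval, with one output field per operation in the operations list.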
The next example computes the daily maximum, minimum, first, and fifth elements of the value field in the raw_reads TimeSeries. In this example, though, we will specify the calendar using a pattern document, include start and end timestamps for our query, and specify outKeys to control the names of the fields in the query results.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "aggregate": { "operations": [ { "op": "MAX", "key": "value", "outKey": "max" }, { "op": "MIN", "key": "value", "outKey": "min" }, { "op": "FIRST", "key": "value", "outKey": "first" }, { "op": "NTH", "key": "value", "index": 5, "outKey": "fifth" } ], "calendar": { "frequency": 1, "unit": "day"} } } ], "start": { "$date": "2010-11-20T00:00:00Z" }, "end": { "$date": "2010-11-25T23:59:59Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-20T00:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "day" }, "elements": [ { "timestamp": { "$date": "2010-11-20T00:00:00Z" }, "max": 4.504, "min": 0.09, "first": 0.559, "fifth": 0.356 }, { "timestamp": { "$date": "2010-11-21T00:00:00Z" }, "max": 1.525, "min": 0.047, "first": 0.065, "fifth": 0.074 }, { "timestamp": { "$date": "2010-11-22T00:00:00Z" }, "max": 1.866, "min": 0.077, "first": 0.313, "fifth": 1.735 }, { "timestamp": { "$date": "2010-11-23T00:00:00Z" }, "max": 2.546, "min": 0.017, "first": 0.169, "fifth": 0.097 }, { "timestamp": { "$date": "2010-11-24T00:00:00Z" }, "max": 1.956, "min": 0.03, "first": 0.094, "fifth": 0.515 }, { "timestamp": { "$date": "2010-11-25T00:00:00Z" }, "max": 1.376, "min": 0.013, "first": 0.025, "fifth": 0.037 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 219 }
Example 7: Query for running aggregates of TimeSeries data
The next TimeSeries pipeline query operator to demonstrate is running_aggregate. A running_aggregate pipeline stage allows you to compute running sums, averages, medians, correlations, and variance over n number of sequential elements in your TimeSeries.
To specify a running_aggregate stage in the query pipeline, you must include a list of aggregation operations. You can optionally also include start and/or end timestamps.
The operations list defines the running aggregation operations to perform on the TimeSeries data. It is a list of JSON documents that contain the following properties:
Name | Description |
---|---|
op | Required. The aggregation operation to perform. Must be one of the following values: |
key | Required. The field name to aggregate. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
key2 | Applicable only to CORRELATION operations. For running correlation computations, key2 is the field to correlate with the key field. |
n | Required. The number of TimeSeries elements to use to compute the running aggregation. |
outKey | Optional. The field name to assign to the result of the running aggregation operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the running aggregation operation. If not specified, the result will be a float. |
The following example computes a running average and running variance of the value field in the raw_reads TimeSeries stored in the ts_data table.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "running_aggregate": { "operations": [ { "key": "value", "op": "AVG", "n": 12 }, { "key": "value", "op": "VARIANCE", "n": 12 } ] } } ], "start": { "$date": "2010-11-15T10:00:00.000Z" }, "end": { "$date": "2010-11-15T16:00:00Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T16:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-15T10:00:00Z" }, "avg_value": 0.13199999928474426, "variance_value": null }, { "timestamp": { "$date": "2010-11-15T10:15:00Z" }, "avg_value": 0.15200000256299973, "variance_value": 8.000002622604585E-4 }, { "timestamp": { "$date": "2010-11-15T10:30:00Z" }, "avg_value": 0.14600000282128653, "variance_value": 5.080001218319049E-4 }, { "timestamp": { "$date": "2010-11-15T10:45:00Z" }, "avg_value": 0.14250000193715096, "variance_value": 3.876667726437355E-4 }, ... ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 256 }
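The running-average semantics can be illustrated locally. Each output element averages up to n of the most recent input elements, and at the start of the series the window is still growing. The sketch below uses the first three raw readings from Example 4 (0.132, 0.172, 0.134); with n=12 it reproduces the first three avg_value results in the response above.

```python
# Sketch: a running average over up to n of the most recent elements,
# mirroring what a running_aggregate AVG operation produces.
def running_avg(values, n):
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1):i + 1]  # grows until it holds n
        out.append(sum(window) / len(window))
    return out

# First three raw readings from Example 4; with n=12 the windows are
# still growing, so each output averages all elements seen so far.
averages = running_avg([0.132, 0.172, 0.134], n=12)
```

This also explains the null variance_value on the first element of the response: a variance over a single element is undefined.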
Example 8: Query for TimeSeries data, applying an arithmetic expression to each element
Beyond aggregations and running aggregations on TimeSeries data, the REST API also supports applying a custom arithmetic expression to each element in the TimeSeries. This is based on the OneDB Apply function which applies a user-specified SQL expression or function to the matching TimeSeries elements.
To specify an apply stage in the query pipeline, you must include an expression field that defines a comma-separated list of SQL expressions or functions to apply to the TimeSeries. Refer to the Apply function topic for more information on the syntax of this expression.
In addition to the expression field, you must also specify a cast for the result of your apply expression. Because the result expression can include any number of different output values, you must provide the output row type for your expression when applied to a TimeSeries. For example, set the cast property to timeseries(row(timestamp datetime year to fraction(5), result1 int, result2 float)) if your expression results in two fields, one being an integer and the other a float.
Like all of the other TimeSeries pipeline stages we have examined so far, the apply stage also supports optional start and/or end timestamps.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{ "fields": ["loc_esi_id", "raw_reads"], "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" }, "timeseriesFilter": { "field": "raw_reads", "pipeline": [ { "apply": { "expression": "$value * $value, 1 / $value", "cast": "timeseries(row(timestamp datetime year to fraction(5), squared float, inverse float))" } } ], "start": { "$date": "2010-11-15T10:00:00Z" }, "end": { "$date": "2010-11-15T11:00:00.000Z" } } }
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-15T10:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-15T10:00:00Z" }, "squared": 0.017424, "inverse": 7.575757575757576 }, { "timestamp": { "$date": "2010-11-15T10:15:00Z" }, "squared": 0.029584, "inverse": 5.813953488372093 }, { "timestamp": { "$date": "2010-11-15T10:30:00Z" }, "squared": 0.017956, "inverse": 7.462686567164179 }, { "timestamp": { "$date": "2010-11-15T10:45:00Z" }, "squared": 0.017424, "inverse": 7.575757575757576 }, { "timestamp": { "$date": "2010-11-15T11:00:00Z" }, "squared": 0.023716, "inverse": 6.4935064935064934 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 54 }
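Evaluating the same expression locally shows what each output element contains. The 10:00 reading of 0.132 produces the first element of the response above:

```python
# Sketch: the apply expression "$value * $value, 1 / $value" computed
# locally for one reading, producing the squared and inverse fields
# named by the cast in the request body.
value = 0.132  # the 10:00 reading from the response above
squared = value * value
inverse = 1 / value
```

The number of expressions in the comma-separated list must match the number of non-timestamp fields declared in the cast row type.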
Example 9: Query for TimeSeries data to rollup elements across multiple TimeSeries
The final TimeSeries pipeline stage is rollup. Unlike all of the other operators covered so far (clip, aggregate, running_aggregate, and apply) which operate on one TimeSeries object (i.e. one row) at a time, the rollup operator aggregates across multiple TimeSeries objects (i.e. multiple rows). Let us say that our ts_data table is holding meter data and has a row in the table for each meter we have deployed out in the field. Running an aggregate operator on this data aggregates values for each meter individually. By contrast the rollup operator allows us to aggregate data across multiple meters, for example, computing an average usage across an entire zipcode.
A rollup stage in the query pipeline supports only two fields and both are required: a list of operations and a list of fields to groupBy.
The list of operations takes a similar form to what you have already seen with aggregate and running_aggregate. Each JSON document in the list of operations can include the following properties:
Name | Description |
---|---|
op | Required. The rollup operation to perform. Must be one of the following values: |
key | Required. The field name to rollup. In a BSON TimeSeries table, this must be the name of a field within the BSON document named payload. In a custom TimeSeries table, this must match one of the column names in the row type. |
outKey | Optional. The field name to assign to the result of the rollup operation in the query results. If not specified, the outKey will be the op value concatenated with an underscore and the key value. |
outType | Optional. The data type of the result of the rollup operation. If not specified, the result will be a float. |
The groupBy field in the rollup stage is also required. It is the list of fields to group by when aggregating across multiple TimeSeries objects. To aggregate over all TimeSeries objects in a table, provide an empty list as the groupBy field.
The following REST TimeSeries query example shows a rollup of all TimeSeries objects in the ts_data table whose direction column is "P". The rollup computes the average, sum, minimum, and maximum values across all of the matching TimeSeries objects in the table. You will also notice in this example that we are specifying an outType for the COUNT operation in order to receive the count as an integer instead of a float.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["direction", "raw_reads"],
  "filter": { "key": "direction", "op": "=", "value": "P" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "pipeline": [
      {
        "rollup": {
          "operations": [
            { "op": "AVG", "key": "value" },
            { "op": "SUM", "key": "value" },
            { "op": "MAX", "key": "value" },
            { "op": "MIN", "key": "value" },
            { "op": "COUNT", "key": "value", "outKey": "count", "outType": "int" }
          ],
          "groupBy": ["direction"]
        }
      }
    ]
  }
}
- Response
-
{ "results": [ { "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-11-10T06:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" }, { "duration": 14, "type": "off" } ], "unit": "minute" }, "elements": [ { "timestamp": { "$date": "2010-11-10T06:00:00Z" }, "avg_value": 0.3727857142857143, "sum_value": 10.438, "max_value": 6.276, "min_value": 0.0, "count": 28 }, { "timestamp": { "$date": "2010-11-10T06:15:00Z" }, "avg_value": 0.38957142857142857, "sum_value": 10.908, "max_value": 5.976, "min_value": 0.0, "count": 28 }, { "timestamp": { "$date": "2010-11-10T06:30:00Z" }, "avg_value": 0.32571428571428573, "sum_value": 9.12, "max_value": 5.34, "min_value": 0.0, "count": 28 }, ... ], "elementsTruncated": true } } ], "hasMore": false, "responseTime": 1583 }
You may have noticed that there are no start and end timestamps supported as part of the rollup stage. If you want to limit a rollup to a particular time range, add a clip stage before the rollup in the query pipeline. That makes this a perfect time to move on to the next example, which demonstrates chaining multiple pipeline stages into a single query.
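For instance, a timeseriesFilter that restricts a rollup to a time window by placing a clip stage first could be assembled as in this Python sketch. The timestamps and groupBy column are illustrative, and the shape of the clip stage (start and end timestamps) is assumed to match the clip examples earlier in this topic:

```python
import json

# Hypothetical pipeline: clip to a December window, then roll up
# the clipped data. Timestamps and columns are sample values.
timeseries_filter = {
    "field": "raw_reads",
    "pipeline": [
        {"clip": {
            "start": {"$date": "2010-12-01T00:00:00Z"},
            "end": {"$date": "2010-12-31T00:00:00Z"},
        }},
        {"rollup": {
            "operations": [{"op": "AVG", "key": "value"}],
            "groupBy": ["direction"],
        }},
    ],
}

# Stages run in list order, so the clip is applied before the rollup.
print(json.dumps(timeseries_filter, indent=2))
```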
Example 10: Query for TimeSeries data by linking multiple pipeline operators to create complex query conditions
The clip, aggregate, running_aggregate, apply, and rollup stages of the TimeSeries query pipeline already provide you a lot of different ways to query your TimeSeries data.
But what makes the TimeSeries query pipeline even more powerful is that you can combine as many of these stages as you like to build even more complex query conditions.
In the above examples, you may have noticed that the pipeline field is a list. Let us now try a TimeSeries query where we put more than one stage in that list.
Suppose we want to compute the maximum daily average usage of all meters whose direction property is "P" for each day in the month of December. We create a query pipeline with two stages. The first is an aggregate stage to compute the average usage per day for each meter during the month of December. The second rollup stage then rolls up all of these daily averages to find the maximum. The top-level filter of { "key": "direction", "op": "=", "value": "P" } ensures we are only doing this for meters where the direction is "P".
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["direction", "raw_reads"],
  "filter": { "key": "direction", "op": "=", "value": "P" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "pipeline": [
      {
        "aggregate": {
          "operations": [
            { "op": "AVG", "key": "value", "outKey": "avg" }
          ],
          "calendar": { "frequency": 1, "unit": "day" }
        }
      },
      {
        "rollup": {
          "operations": [
            { "op": "MAX", "key": "avg", "outKey": "max_daily_avg" }
          ],
          "groupBy": ["direction"]
        }
      }
    ]
  }
}
- Response
-
{ "results": [ { "direction": "P", "raw_reads": { "type": "regular", "origin": { "$date": "2010-12-01T00:00:00Z" }, "pattern": { "intervals": [ { "duration": 1, "type": "on" } ], "unit": "day" }, "elements": [ { "timestamp": { "$date": "2010-12-01T00:00:00Z" }, "max_daily_avg": 0.74740625 }, { "timestamp": { "$date": "2010-12-02T00:00:00Z" }, "max_daily_avg": 2.5208541666666666 }, { "timestamp": { "$date": "2010-12-03T00:00:00Z" }, "max_daily_avg": 3.034802083333333 }, ... { "timestamp": { "$date": "2010-12-31T00:00:00Z" }, "max_daily_avg": 1.56 } ], "elementsTruncated": false } } ], "hasMore": false, "responseTime": 59 }
Example 11: Query for the first or last TimeSeries element
Besides the query pipeline, the timeseriesFilter also supports a transform object that can define a transformation for the TimeSeries object returned by the query. Whenever a transform is used, your result will no longer be a TimeSeries object, but some other data type.
The first transform option that we will look at is the ability to return only a single element of the TimeSeries. There are two transform options to accomplish this: first and last.
If you want to retrieve the first TimeSeries element starting at a given timestamp, you can run the following REST request. The request body includes "transform": { "op": "first" } in the timeseriesFilter.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "first" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "first": { "tstamp": { "$date": "2010-12-31T00:00:00Z" }, "value": 0.092 } } ], "hasMore": false, "responseTime": 31 }
Notice in the query result that we did not get a TimeSeries object for raw_reads, but instead just got a single data element from that TimeSeries object.
You can query for the last element in the same way. For the last transform operator, though, the REST API supports an optional allowNulls property. By default, allowNulls is true, so if the last element within the TimeSeries window you are querying is null, you will receive a null in the response. If you want to receive the last non-null element, set allowNulls to false.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "transform": { "op": "last", "allowNulls": false }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "last": { "tstamp": { "$date": "2011-02-08T05:45:00Z" }, "value": 1.412 } } ], "hasMore": false, "responseTime": 3 }
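The allowNulls behavior can be mirrored in a few lines of Python over an in-memory list of (timestamp, value) tuples. This is a sketch of the semantics described above, not of the server's implementation; None stands in for a null element:

```python
# Sketch of the last/allowNulls semantics over sample data.
def last_element(elements, allow_nulls=True):
    if allow_nulls:
        return elements[-1] if elements else None
    # allowNulls=false: scan backward for the last non-null value
    for elem in reversed(elements):
        if elem[1] is not None:
            return elem
    return None

readings = [("06:00", 0.5), ("06:15", 1.2), ("06:30", None)]
print(last_element(readings))                     # ('06:30', None)
print(last_element(readings, allow_nulls=False))  # ('06:15', 1.2)
```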
Example 12: Query for the count of TimeSeries elements
In addition to extracting the first or last element out of a TimeSeries object, the REST API also supports counting the number of elements in a matching TimeSeries object through the transform option.
When using count as the transform operator, you can choose to count all elements, or you can get a count of only the elements that match a specific condition.
To count all elements within a specific TimeSeries object, run a REST query request with "transform": { "op": "count" } as shown here.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-01T00:00:00Z" },
    "end": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "count" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "count": 2881 } ], "hasMore": false, "responseTime": 5 }
To count only the elements that meet a certain condition, add an expression to the transform document. For example, "transform": { "op": "count", "expression": "value > 1" } returns a count of all TimeSeries elements whose value is greater than one. Refer to the documentation on the OneDB CountIf function for more examples of expressions that can be passed to this operation.
- Request
-
POST http://localhost:8080/api/servers/server1/databases/stores_demo/timeseries/tables/ts_data/query
- Request Body
-
{
  "fields": ["loc_esi_id", "raw_reads"],
  "filter": { "key": "loc_esi_id", "op": "=", "value": "4727354321000111" },
  "timeseriesFilter": {
    "field": "raw_reads",
    "start": { "$date": "2010-12-01T00:00:00Z" },
    "end": { "$date": "2010-12-31T00:00:00Z" },
    "transform": { "op": "count", "expression": "value > 1" }
  }
}
- Response
-
{ "results": [ { "loc_esi_id": "4727354321000111", "count": 201 } ], "hasMore": false, "responseTime": 5 }
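Locally, the difference between a plain count and a count with an expression such as value > 1 amounts to counting with a predicate, as this small Python sketch illustrates (the sample values are made up):

```python
# Sketch of count vs. conditional count over sample element values.
values = [0.5, 1.4, 2.2, 0.9, 1.01]

count_all = len(values)                       # "op": "count"
count_gt_one = sum(1 for v in values if v > 1)  # expression "value > 1"

print(count_all)     # 5
print(count_gt_one)  # 3
```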