Aggregate or slice time series data

You can use the MongoDB aggregation pipeline commands to aggregate time series values or return a slice of a time series.

When you run an aggregation query on a time series table, internally the time series Transpose function converts the aggregated or sliced data to tabular format and then the genBSON function converts the results to BSON format. Therefore, the output of the $group or $project stage in the aggregation pipeline is collection-style JSON data. Any subsequent stages of the aggregation pipeline can process the data as JSON documents.

The aggregate and slice operations return JSON documents that include the primary key columns of the time series table. You can remove the primary key columns with the $project operator in the next stage of the aggregation pipeline.

To run the examples of aggregating and slicing time series data, create a JSON time series by following the instructions for loading hybrid data: Example for JSON data: Create and load a time series with JSON documents.

Aggregate: The $group operator syntax

To aggregate time series values, you use the $group operator and include a $calendar object to define the aggregation period, and include one or more aggregation operator expressions to define the type of operation and the data to aggregate. The data to aggregate must be numeric and able to be cast to float values. The $group operator produces the same results as running the time series AggregateBy function. If you have multiple TimeSeries columns in a table, you can aggregate values with the $group operator for only the first TimeSeries column.


1 {$group:  {  $calendar: {
2.1 %Calendar definition
2.1  name: " calendar_name "
1 }
1 ,  + , %Aggregation operator expression } }
Calendar definition

1  interval: number ,  timeunit: "unit" ,  start: "start_time"?  , end: "end_time"?  , discrete:
2.1 true
2.1 false  }
Aggregation operator expression

1  field_name: {
1  operator:
2.1 "$column.field"
2.1 "$column"
1  $nth: [
2.1 "$column.field"
2.1 "$column"
1 ,
1 position
1 ]
2  }
$calendar
The calendar that defines the aggregation period. You can specify the name of an existing calendar with the following document: {name: "calendar_name"}. The calendar must exist in the CalendarTable table.
You can define a calendar for the aggregation operation with a document that contains the following fields:
interval
The number is a positive integer that represents number of time units in the aggregation period. For example, if the interval is 1 and the time unit is DAY, then the values are aggregated for each day.
timeunit
The unit is the size of the time interval for the aggregation period. Can be SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, or YEAR.
start
The start_date is the start date of the aggregation operation in DATETIME YEAR TO FRACTION(3) format.
end
Optional. The end_date is the end date of the aggregation operation in DATETIME YEAR TO FRACTION(3) format. If you omit the end date, the aggregation operation continues through the latest time series element.
discrete
Optional. Controls whether the data remains as discrete values or is smoothed to be continuous.
true = Default. The data remains discrete.
false = The data is smoothed. You might want to smooth your data if you want to treat your data as continuous, for example, temperature data. Smoothing data can accurately compensate for missing data. You can only use the $avg, $min, and $max aggregation operators on smoothed data. You cannot use the $sum, $median, $first, $last, or $nth aggregation operators on smoothed data.
For example, the following calendar definition produces an aggregate value per day for a month:
{ $calendar: { interval: 1,
               timeunit: "DAY",
               start: "2015-07-03 15:40:03.000",
               end: "2015-08-03 15:40:03.000",
               discrete: true }
Aggregation operator expression
The field_name is a descriptive name for the results of the aggregation operation.
The operator can be $sum, $avg, $min, $max, $median, $first, $last, or $nth. The $nth operator requires a position value.
The column is the name of the column to aggregate in the TimeSeries row type. If the column contains BSON data, include a dot followed by the field name to aggregate within the BSON documents. For example, if the column name is sensor_data and the field name is value, the column name is specified as "$sensor_data.value".
The position is an integer that follows the $nth operator to represent the position of the value to return within the aggregation period. Positive integers begin at the first value. A position of 1 is the same as using the $first operator. Negative integers begin at the latest value. A position of -1 is the same as using the $last operator.

Example: Daily average value

The following example returns the daily average of a value over the period of three days for the v1 field in the sensor_data column in the tstable_j table for the sensor 1:

db.tstable_j.aggregate(
      {$match: {id: 1 } },
      {$group: { $calendar: { interval: 1,
                 timeunit: "DAY",
                 start: "2014-03-01 00:00:00.000",
                 end: "2014-03-03 23:59:59.000",
                 discrete: true },
               val_AVG: {$avg: "$sensor_data.v1"} } }
)

{
        "result" : [
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-01T00:00:00Z"),
                        "val_avg" : 1.416666666666667
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-02T00:00:00Z"),
                        "val_avg" : 1.4437500000000003
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-03T00:00:00Z"),
                        "val_avg" : 1.4447916666666671
                }
        ],
        "ok" : 1
}

Example: Get the maximum value for each month

The following example returns the maximum value for each month over a six-month period for the v2 field in the sensor_data column in the tstable_j table for the sensor 1:

db.tstable_j.aggregate(
      {$match: {id: 1 } },
      {$group: { $calendar: { interval: 1,
                 timeunit: "MONTH",
                 start: "2014-01-01 00:00:00.000",
                 end: "2014-6-30 23:59:59.000",
                 discrete: true },
               maximum: {$max: "$sensor_data.v2"} } }
)
{
        "result" : [
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-01-01T00:00:00Z"),
                        "maximum" : 22.9
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-02-01T00:00:00Z"),
                        "maximum" : 23.4
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-01T00:00:00Z"),
                        "maximum" : 23.1
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-04-01T00:00:00Z"),
                        "maximum" : 22.9
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-05-01T00:00:00Z"),
                        "maximum" : 24.0
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-06-01T00:00:00Z"),
                        "maximum" : 24.8
                }
        ],
        "ok" : 1
}

Slice: The $slice operator syntax

To slice a time series, you use the $project operator to identify the time series and include a document with a $slice operator to specify the time range of the time series elements to return. The $slice operator produces the same results as running the time series Clip or ClipCount functions.


1  { $project: { time_series: { $slice:
1 N
1  [ N , flag ]
1  [ tstamp , N?  , flag ]
1  [ begin_tstamp , end_tstamp?  , flag ]
2  } } }
$project
The time_series is the name of the time series column.
$slice
The N is an integer that represents the number of elements to return. Positive values return elements from the beginning of the time series or starting at the specified time stamp. Negative values return elements from the end of the time series or ending with the specified time stamp.
The tstamp is a DATETIME value that represents the start or end time stamp of the elements to return.
The begin_tstamp is the beginning time stamp of the elements to return.
The end_tstamp is the ending time stamp of the elements to return.
The flag controls the configuration of the resulting time series. For values, see the Clip function.

Example: Get the next five elements

The following example returns the first five elements, beginning at March 14, 2014, at 9:30 AM, from the tstable_j table for the sensor with the ID of 1:

db.tstable_j.aggregate(
    { $match: { id: 1}},
    { $project: { sensor_data: { $slice: ["2014-03-14 09:30:00.000", 5] }
} }
)

{
        "result" : [
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:30:00Z"),
                        "v1" : 1.7,
                        "v2" : 20.9
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:45:00Z"),
                        "v1" : 1.6,
                        "v2" : 17.4
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:00:00Z"),
                        "v1" : 1.6,
                        "v2" : 20.3
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:15:00Z"),
                        "v1" : 1.8,
                        "v2" : 20.4
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:30:00Z"),
                        "v1" : 1.3,
                        "v2" : 17.1
                }
        ],
        "ok" : 1
}

Example: Get the previous three elements

The following example returns the previous three elements, ending at March 14, 2014, at 9:30 AM, from the tstable_j table for the sensor with the ID of 1:

db.tstable_j.aggregate(
    { $match: { id: 1}},
    { $project: { sensor_data: { $slice: ["2014-03-14 09:30:00.000", -3] }
} }
)
{
        "result" : [
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:00:00Z"),
                        "v1" : 1,
                        "v2" : 22.8
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:15:00Z"),
                        "v1" : 1.8,
                        "v2" : 21.6
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:30:00Z"),
                        "v1" : 1.7,
                        "v2" : 20.9
                }
        ],
        "ok" : 1
}

Example: Get elements in a range

The following example returns the elements between March 14, 2014, at 9:30 AM and March 14, 2014, at 10:30 AM, from the tstable_j table for the sensor with ID 1:

db.tstable_j.aggregate(
  { $match: { id: 1 }},
  { $project: { sensor_data: { $slice: ["2014-03-14 09:30:00.000", 
                                        "2014-03-14 10:30:00.000"] } } } 
)

{
        "result" : [
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:30:00Z"),
                        "v1" : 1.7,
                        "v2" : 20.9
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T09:45:00Z"),
                        "v1" : 1.6,
                        "v2" : 17.4
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:00:00Z"),
                        "v1" : 1.6,
                        "v2" : 20.3
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:15:00Z"),
                        "v1" : 1.8,
                        "v2" : 20.4
                },
                {
                        "id" : "1",
                        "tstamp" : ISODate("2014-03-14T10:30:00Z"),
                        "v1" : 1.3,
                        "v2" : 17.1
                }
        ],
        "ok" : 1
}