Compressed numeric time series
If your time series data is recorded at a regular frequency and all the time series values are numeric, you can define a compressed time series to store the data efficiently.
In each element, a compressed time series stores an 11-byte timestamp for the first record and a 2-byte timestamp for each of the other records. The compression ratio of the rest of the time series data varies depending on the type of data and the compression definitions. For example, you can compress an 8-byte BIGINT value down to 1 byte, with some loss of precision.
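As a rough illustration of the timestamp sizes quoted above, the following sketch adds up the bytes spent on timestamps in one element. The function name and the `records_in_element` parameter are hypothetical, and the sketch counts timestamp bytes only, because the size of the remaining columns depends on the compression definitions.

```python
FIRST_TIMESTAMP_BYTES = 11  # first record in an element (figure from the text)
NEXT_TIMESTAMP_BYTES = 2    # each of the other records in the same element


def timestamp_bytes(records_in_element: int) -> int:
    """Return the bytes used by timestamps for one element.

    Hypothetical helper: it only applies the 11-byte / 2-byte rule
    and ignores the bytes used by the data columns.
    """
    if records_in_element < 0:
        raise ValueError("records_in_element must not be negative")
    if records_in_element == 0:
        return 0
    return FIRST_TIMESTAMP_BYTES + NEXT_TIMESTAMP_BYTES * (records_in_element - 1)


if __name__ == "__main__":
    print(timestamp_bytes(1))
    print(timestamp_bytes(100))
```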
Time series definition
You define a compressed time series by running the TSCreateIrr function with the compression parameter.
You must include a compression definition for every column in the TimeSeries subtype, except the first timestamp column. The compression definitions are associated with the columns in the same order. If you do not want to compress a particular column, include a compression definition of no compression for that column. If you specify that none of the columns are compressed, only the first timestamp column is compressed.
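The ordering rule can be pictured with a small sketch that pairs each non-timestamp column with its compression definition. The helper and the column names are hypothetical and are not part of the product. The definition strings reuse the `q(...)` and `qlb(...)` forms that appear later in this topic.

```python
def pair_definitions(columns, definitions):
    """Pair each column after the first timestamp column with a definition.

    columns:     all column names of the subtype, timestamp column first
    definitions: one compression definition per remaining column, in order
    """
    data_columns = columns[1:]  # the first timestamp column takes no definition
    if len(definitions) != len(data_columns):
        raise ValueError(
            f"expected {len(data_columns)} definitions, got {len(definitions)}"
        )
    return dict(zip(data_columns, definitions))


if __name__ == "__main__":
    # Hypothetical subtype with one timestamp column and two data columns.
    pairs = pair_definitions(
        ["tstamp", "temperature", "load"],
        ["q(1,1,100)", "qlb(1,1,100,100000)"],
    )
    print(pairs)
```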
Besides the first timestamp column, the TimeSeries subtype columns must have only the following data types: SMALLINT, INTEGER, BIGINT, SMALLFLOAT, and FLOAT.
- LVARCHAR
- VARCHAR
- NVARCHAR
- CHAR
- NCHAR
Note: The CHAR and NCHAR data types can be treated in two ways: as a variable-length string or as a fixed-length, space-padded string.
- BOOLEAN
- INT8
- DECIMAL
- MONEY
- DATE
- DATETIME
- INTERVAL
- BSON
- Any fixed-length UDT (with the same restrictions that apply to UDT columns in a TimeSeries subtype)
- Any variable-length UDT (with the same restrictions that apply to UDT columns in a TimeSeries subtype)
The calendar that you specify in the time series definition defines the size of the interval; however, off periods are not allowed. Only one record per interval is accepted, and the timestamp must be on the interval boundary. For example, if the calendar has an interval of minute, a timestamp that has a seconds value other than 00.00000, such as 2013-01-01 01:52:15.00000, is rejected.
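The boundary rule for a calendar with an interval of minute can be sketched as a simple check. This is only an illustration of the rule as stated here, not the validation that the database server performs, and the function name is hypothetical.

```python
from datetime import datetime


def on_minute_boundary(ts: datetime) -> bool:
    """Return True when the timestamp has no seconds or fractional seconds."""
    return ts.second == 0 and ts.microsecond == 0


if __name__ == "__main__":
    # The timestamp from the example in the text is off the boundary.
    print(on_minute_boundary(datetime(2013, 1, 1, 1, 52, 15)))
    # The same minute with a seconds value of 00.00000 is on the boundary.
    print(on_minute_boundary(datetime(2013, 1, 1, 1, 52, 0)))
```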
Compressed records must be stored in containers. However, a compressed time series cannot be stored in rolling window containers.
Compression types
You compress data with the following types of compression algorithms:
- Quantization
- The quantization compression algorithm divides continuous values into discrete grids. Each grid represents a range of values. Fewer bytes are needed to represent a grid than a numeric value. The quantization algorithm can be lossy. The quantization algorithm allows NULL values.
- Linear
- The linear compression algorithm represents values as line segments, which are defined by two end points. If the values are within the supplied deviation, the values are not recorded. The linear compression algorithm records a value only when a new value deviates too much from the last recorded value. The linear compression algorithm does not allow NULL values.
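The two ideas can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithms that the product implements: `quantize` maps a value to an equal-width grid index, and `linear_filter` keeps a value only when it differs from the last kept value by more than the deviation. All function names are hypothetical.

```python
def quantize(value, lower_bound, upper_bound, compress_size):
    """Map a value to a grid index that fits in compress_size bytes."""
    if not lower_bound <= value <= upper_bound:
        raise ValueError("value is outside the bounds")
    grids = 2 ** (compress_size * 8)
    width = (upper_bound - lower_bound) / grids
    index = int((value - lower_bound) / width)
    return min(index, grids - 1)  # the upper bound falls into the last grid


def dequantize(index, lower_bound, upper_bound, compress_size):
    """Return the midpoint of the grid, so precision can be lost."""
    grids = 2 ** (compress_size * 8)
    width = (upper_bound - lower_bound) / grids
    return lower_bound + (index + 0.5) * width


def linear_filter(values, maximum_deviation):
    """Keep (position, value) pairs that deviate from the last kept value."""
    kept = []
    for position, value in enumerate(values):
        if not kept or abs(value - kept[-1][1]) > maximum_deviation:
            kept.append((position, value))
    return kept


if __name__ == "__main__":
    index = quantize(50.3, 1, 100, 1)
    print(index, dequantize(index, 1, 100, 1))
    print(linear_filter([20.0, 20.05, 19.95, 20.3, 20.35], 0.1))
```

In the sketch, the restored value is a grid midpoint rather than the original value, which is where the loss of precision comes from. The filter drops readings that stay within the deviation of the last kept reading.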
You can combine compression types and choose the quantization linear boxcar or quantization linear swing door compression algorithm.
You can choose not to compress a column. Columns that are not compressed allow NULL values.
Lossiness
The following equations describe the margin of error that is allowed between the original and the compressed values for the different compression types:
Quantization type:
margin of error = (upper_bound - lower_bound) / 2^(compress_size*8)
Linear types:
margin of error = maximum_deviation
Combination of quantization and linear types:
margin of error = (upper_bound - lower_bound) / 2^(compress_size*8) + maximum_deviation
- compress_size
- The size of the compressed data, in bytes.
- lower_bound
- The lowest acceptable value.
- maximum_deviation
- The absolute value of the margin of error.
- upper_bound
- The highest acceptable value.
For example, if the compression definition for quantization is q(1,1,100), the compression size is 1 byte, the lower boundary is 1, and the upper boundary is 100. The following equation calculates the margin of error:
(100-1)/256 = 0.387
The maximum difference between the original value and the compressed value is plus or minus 0.387.
For the linear compression type, the margin of error is equal to the maximum deviation value. For example, if the original value is 20 and the maximum deviation value is 0.1, then the compressed value is in the range 19.9 - 20.1.
If the compression definition for quantization linear boxcar is qlb(1,1,100,100000), the compression size is 1 byte, the maximum deviation is 1, the lower boundary is 100, and the upper boundary is 100000. The following equation calculates the margin of error:
(100000-100)/256 + 1 = 390.234 + 1 = 391.234
The maximum difference between the original value and the compressed value is plus or minus 391.234.
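The margin-of-error arithmetic for the three cases can be checked with a short sketch. The function names are hypothetical. The argument order follows the q(1,1,100) and qlb(1,1,100,100000) definitions in the examples, and the combined margin is taken as the quantization margin plus the maximum deviation, which is how the worked example computes it.

```python
def quantization_margin(compress_size, lower_bound, upper_bound):
    """Margin of error for a quantization definition."""
    return (upper_bound - lower_bound) / 2 ** (compress_size * 8)


def linear_margin(maximum_deviation):
    """Margin of error for a linear definition."""
    return abs(maximum_deviation)


def combined_margin(compress_size, maximum_deviation, lower_bound, upper_bound):
    """Margin of error for a combined quantization and linear definition."""
    return (
        quantization_margin(compress_size, lower_bound, upper_bound)
        + linear_margin(maximum_deviation)
    )


if __name__ == "__main__":
    print(quantization_margin(1, 1, 100))        # q(1,1,100)
    print(linear_margin(0.1))                    # maximum deviation of 0.1
    print(combined_margin(1, 1, 100, 100000))    # qlb(1,1,100,100000)
```

Unrounded, the quantization example evaluates to 99/256 = 0.38671875, and the combined example evaluates to 99900/256 + 1 = 391.234375.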