TimeSeries solution
Database administrators and applications developers use the TimeSeries solution to store and analyze time series data.
A time series is a set of time-stamped data. Types of time series data vary enormously, for example, electricity usage that is collected from smart meters, stock price and trading volumes, ECG recordings, seismograms, and network performance records. The types of queries performed on time series data typically include a time criteria and often include aggregations of data over a longer period of time. For example, you might want to know which day of the week your customers use the most electricity.
The TimeSeries solution provides the following capabilities to store and analyze time series data:
- Define the structure of the data
- Control when and how often data is accepted:
- Set the frequency for regularly spaced records
- Handle arbitrarily spaced records
- Control data storage:
- Specify where to store data
- Change where data is stored
- Monitor storage usage
- Load data from a file or individually
- Query data:
- Extract values for a time range
- Find null data
- Modify data
- Display data in standard relational format
- Analyze data:
- Perform statistical and arithmetic calculations
- Aggregate data over time
- Make data visible or invisible
- Find the intersection or union of data
The TimeSeries solution stores time series data in a special format within a relational database in a way that takes advantage of the benefits of both non-relational and standard relational implementations of time series data.
The TimeSeries solution is more flexible than non-relational time series implementations because the TimeSeries solution is not specific to any industry, is easily customizable, and can combine time series data with information in relational databases.
The TimeSeries solution loads and queries time stamped data faster, requires less storage space, and provides more analytical capability than a standard relational table implementation. Although relational database management systems can store time series data for standard types by storing one row per time-stamped data entry, performance is poor and storage is inefficient. The TimeSeries solution saves disk space by not storing duplicate information from the columns that do not contain the time-based data. The TimeSeries solution loads and queries time series data quickly because the data is stored on disk in order by time stamp and by source.
For example, the following table shows a relational table that contains time-based information for two sources, or customers, whose identifiers are 1000111 and 1046021.
Customer | Time | Value |
---|---|---|
1000111 | 2011-1-1 00:00:00.00000 | 0.092 |
1000111 | 2011-1-1 00:15:00.00000 | 0.082 |
1000111 | 2011-1-1 00:30:00.00000 | 0.090 |
1000111 | 2011-1-1 00:45:00.00000 | 0.085 |
1046021 | 2011-1-1 00:00:00.00000 | 0.041 |
1046021 | 2011-1-1 00:15:00.00000 | 0.041 |
1046021 | 2011-1-1 00:30:00.00000 | 0.040 |
1046021 | 2011-1-1 00:45:00.00000 | 0.041 |
The following table shows a representation of the same data stored in the TimeSeries table. The information about the customer is stored once. All the time-based information for a customer is stored together in a single row.
Customer | Time | Value |
---|---|---|
1000111 | 2011-1-1 00:00:00.00000 | 0.092 |
2011-1-1 00:15:00.00000 | 0.082 | |
2011-1-1 00:30:00.00000 | 0.090 | |
2011-1-1 00:45:00.00000 | 0.085 | |
1046021 | 2011-1-1 00:00:00.00000 | 0.041 |
2011-1-1 00:15:00.00000 | 0.041 | |
2011-1-1 00:30:00.00000 | 0.040 | |
2011-1-1 00:45:00.00000 | 0.041 |
The following table summarizes the advantages of using the TimeSeries solution for time-based data over using a standard relational table.
Standard relational table issue | TimeSeries table benefit | |
---|---|---|
Storage space | Stores one row for every record. Duplicates the information in non-time series columns. Stores timestamps. Null data takes as much space as actual data. The index typically includes the time stamp column and several other columns. | Significant reduction in disk space needed to store the same data. The index
size on disk is also smaller. Stores all time series data for a single source in the same row. No duplicate information. Calculates instead of stores the time stamp. Null data does not require any space. The index does not include the time stamp column. |
Query speed | Data for a single source can be intermixed on multiple data pages in no particular order. | Queries that use a time criteria require many fewer disk reads and
significantly less I/O. Data is loaded very efficiently. Data for a single source is stored together in time stamp order. |
Query complexity | Queries that aggregate data or apply an expression can be difficult or impossible to perform with SQL. Much of the query logic must be provided by the application. | Less application coding and faster queries. Allows complex SQL queries and analysis. Allows custom analytics written using the TimeSeries API. |