How it works
Periodically, the Data Collector scans the dimension values for
each dimension in the DC_REPORTS database. If the number of stored
values for any dimension exceeds the globally defined limit for a
dimension, the Data Collector trims the oldest values in the
database, based on the timestamp when each value was last captured.
The oldest values are trimmed until the number of values for the
dimension no longer exceeds the specified limit. For example, if
the defined global limit is 1,000,000 values per dimension and Dimension
A contains 1,002,000 values, the next trimming run by the Data
Collector removes the 2,000 oldest values.
- The oldest values are determined by the timestamps that are associated
with each value. These timestamps are updated whenever the dimension
values are updated in a separate process.
- Suppose Dimension A is captured before Dimension B, and then Dimension A is captured again. In this case, Dimension B is considered to be older than Dimension A, since the timestamp for Dimension A occurred more recently.
- The time when the dimension values were last updated is available through the Data Collector log in the Portal. See "Portal Logs" in the Unica Discover Administration Manual.
- In the reporting data, all references to the dimension values above the global limit may be remapped to the [others] category as part of the dimension trimming run. This step in the process is resource-intensive. See "Updating counts for trimmed dimension values in report data."
- Whitelisted values are not removed during dimension trimming.
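The trimming rules in the list above can be sketched in a few lines of Python. This is an illustration only, not the Data Collector's actual implementation: the function names `trim_dimension` and `remap_counts`, the in-memory dictionaries, and the assumption that whitelisted values count toward the limit are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

OTHERS = "[others]"  # category name taken from the documentation above


def trim_dimension(
    last_captured: Dict[str, float],
    limit: int,
    whitelist: Iterable[str] = (),
) -> Tuple[Dict[str, float], Set[str]]:
    """Trim the oldest non-whitelisted values until the count fits the limit.

    last_captured maps each dimension value to its last-captured timestamp.
    Assumption: whitelisted values count toward the limit but are never removed.
    """
    protected = set(whitelist)
    excess = len(last_captured) - limit
    if excess <= 0:
        return dict(last_captured), set()

    # Oldest first: the smallest timestamp was captured least recently.
    candidates = sorted(
        (value for value in last_captured if value not in protected),
        key=lambda value: last_captured[value],
    )
    trimmed = set(candidates[:excess])
    kept = {v: ts for v, ts in last_captured.items() if v not in trimmed}
    return kept, trimmed


def remap_counts(counts: Dict[str, int], trimmed: Set[str]) -> Dict[str, int]:
    """Fold the report counts of trimmed values into the [others] category."""
    remapped: Dict[str, int] = {}
    for value, count in counts.items():
        key = OTHERS if value in trimmed else value
        remapped[key] = remapped.get(key, 0) + count
    return remapped


if __name__ == "__main__":
    stored = {"a": 100.0, "b": 200.0, "c": 300.0, "d": 400.0}
    kept, trimmed = trim_dimension(stored, limit=2, whitelist=["a"])
    print(sorted(kept), sorted(trimmed))
    print(remap_counts({"a": 5, "b": 2, "c": 3, "d": 1}, trimmed))
```

In this sketch, value "a" has the oldest timestamp but survives because it is whitelisted, so the next-oldest values "b" and "c" are trimmed and their counts are folded into [others].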
Note: Except for calendar-related dimensions, all dimensions that
are visible to Discover users, including dimensions that are
provided by Discover, are analyzed and, if necessary, trimmed.
Note: If the number of values that are stored for a dimension exceeds
the defined global limit, the number of values is trimmed to the global
limit. However, if new values are detected for the dimension, they
are stored until the next time the Data Collector trims dimension
values, which pushes the count above the global limit again. In this
manner, a dimension that is trimmed once can be trimmed each time the
Data Collector runs, which further impedes system performance. As more
dimensions reach the global limit, the trimming process takes
progressively longer. Discover recommends that any dimension that was
trimmed be converted to a whitelist if possible.
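The repeated-trimming effect described in this note can be illustrated with a small simulation. It is a sketch under stated assumptions: the function name `simulate_trimming` and the fixed number of new values arriving between runs are hypothetical, chosen only to show why a dimension at the limit is trimmed on every run.

```python
from typing import List


def simulate_trimming(
    initial_count: int,
    limit: int,
    new_values_per_interval: int,
    runs: int,
) -> List[int]:
    """Return how many values each trimming run removes.

    Assumption: a constant number of new values is stored between runs.
    """
    trimmed_per_run: List[int] = []
    count = initial_count
    for _ in range(runs):
        excess = max(0, count - limit)     # values above the global limit
        trimmed_per_run.append(excess)
        count -= excess                    # trimmed back down to the limit
        count += new_values_per_interval   # new values stored until the next run
    return trimmed_per_run


if __name__ == "__main__":
    # Limit and starting count from the example above; 5,000 new values
    # per interval is an invented figure.
    print(simulate_trimming(1_002_000, 1_000_000, 5_000, 3))
```

With these inputs the first run removes 2,000 values and every later run removes 5,000, because the dimension starts each interval already at the limit.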