Recommended workflow
About this task
This section provides a recommended workflow for populating a
high-volume dimension with a data set that maintains data integrity
while limiting database growth.
Procedure
- If possible, validate the data before creating the dimension.
- For some dimensions, the data is already recorded in
the request.
- For example, the TLT_URL value is automatically inserted by the Discover Reference
session agent, which is included and enabled in the default pipeline configuration.
URL normalization is also enabled by default. See "Discover Reference Session Agent"
in the Unica Discover Configuration Manual.
- For other high-volume dimensions that extract from request or response data, you may want
to verify that the data is being appropriately captured in a session through replay before
you create the dimension. For example, you can search for specific event values or indexed
request/response data. See "Searching Session Data" in the Unica Discover
Manuals.
- If the values do not appear to be recorded properly:
- Ascertain if they are being inserted by Discover or your web application:
- If the data is being inserted by Discover, verify that the appropriate
component is inserting the data. Data may be inserted by the DNCA, the Canister, or an event
that is defined in the Event Manager.
- If the data is inserted by your web application, verify the data
with your web development team.
- Create the dimension.
- Make sure to set Values to Record to Whitelist Only.
- You may want to adjust the Max Values Per Hour as needed.
- Processed values include whitelisted values, which also count
against this limit. Blacklisted values do not count.
Note: For testing
purposes, you may want to add this dimension to a report group that
is associated with an event that occurs in each session. Later, through
the Discover Report Builder,
you can create a simple report with the event + dimension combination
to review the captured values.
- Enable logging of values for the dimension. Dimension
logging enables the capture of observed values for purposes of downloading
and creating your whitelist. These values are captured in logs that
are stored in the database, which are automatically cleared after
a period of days. See Manage Events - Dimensions Tab.
- Let the log fill with a sufficient volume of values to
be a meaningful cross-section of activity. For a high-volume dimension,
a single hour may be enough to collect a representative data set.
Note: A downloaded log file can contain up to the top 250,000
values by occurrence over the duration that they were collected in
the logs.
- Edit the log values to be your first pass at the whitelist.
- Download the logged values to your local desktop.
- Load the values into Microsoft™ Excel.
Sort them based on the occurrences.
- You can decide the top number of values to insert into
your whitelist. You should copy and paste these values to a separate
XLS sheet.
Note: A whitelist can contain up to 5,000 values.
- Retain the file that you uploaded for your records.
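The sorting and trimming described in this step can also be scripted instead of done by hand in a spreadsheet. The following is a minimal Python sketch, not a Discover feature. It assumes the downloaded log can be read as CSV text with `value` and `count` columns; those column names, the sample data, and the helper names `read_log` and `top_values` are hypothetical and should be adapted to the actual layout of your download.

```python
import csv
import io

# Maximum whitelist size, as stated in the note above.
MAX_WHITELIST_SIZE = 5000


def read_log(text):
    """Parse CSV text with hypothetical 'value' and 'count' columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["value"], int(row["count"])) for row in reader]


def top_values(rows, limit=MAX_WHITELIST_SIZE):
    """Return up to `limit` values, ordered by occurrences (highest first).

    `rows` is an iterable of (value, occurrences) pairs. Ties are broken
    alphabetically so that the result is reproducible.
    """
    limit = min(limit, MAX_WHITELIST_SIZE)
    ranked = sorted(rows, key=lambda row: (-row[1], row[0]))
    return [value for value, _ in ranked[:limit]]


# Made-up sample data standing in for a downloaded log.
sample = "value,count\n/home,120\n/cart,45\n/search,300\n/rare,1\n"
whitelist = top_values(read_log(sample), limit=3)
print(whitelist)  # ['/search', '/home', '/cart']
```

The cutoff is a judgment call: a smaller `limit` keeps the dimension compact, while a larger one reduces how many observed values fall outside the whitelist.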
- Load the values into your whitelist through the Dimension
editor.
- Monitor the captured values.
- After you load the dimension values into the whitelist,
all subsequent observed values are checked against the whitelist.
- If the Max Values Per Hour limit is exceeded, an instance of the
[Limit] value is recorded for the dimension.
- If an observed value does not appear in the whitelist
and the Max Values Per Hour limit was not exceeded, an instance
of the [Others] value is recorded for the dimension.
- Through the Discover Report
Builder, create a report:
- Add an event that occurs each session.
- Add the dimension, which should be available if you added it to
a report group associated with the event.
- Each hour, you can track the count of occurrences of the
[Others] and [Limit] values.
- Periodically, you should download a new set of log values
and compare it to the set that you saved.
- Look for logged values that have a number of occurrences
greater than 1 and that do not appear in the whitelist. These values
should be added.
- Look for values in the whitelist that do not appear
in the set of logged values. These values should be removed.
- In Microsoft™ Excel, the VLOOKUP function can be used to check the contents
of one worksheet against another. For more information, see the documentation
available inside Microsoft™ Excel.
Note: If there are significant changes to your web application,
your dimension whitelists are likely to need rebuilding. Contact your
web application development team for details on the changes.
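As an alternative to a VLOOKUP comparison, the periodic check of logged values against the whitelist can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `whitelist_changes`, its parameters, and the sample data are hypothetical, and it assumes you have already read the whitelist into a list and the newly downloaded log into a mapping of value to occurrences.

```python
def whitelist_changes(whitelist, logged, threshold=1):
    """Suggest whitelist additions and removals.

    whitelist: iterable of values currently in the whitelist.
    logged:    dict mapping each logged value to its number of occurrences.
    threshold: a logged value is suggested for addition only when its
               occurrences are greater than this number.
    """
    current = set(whitelist)
    # Logged more than `threshold` times but missing from the whitelist.
    to_add = sorted(
        value for value, occurrences in logged.items()
        if occurrences > threshold and value not in current
    )
    # Whitelisted but absent from the new set of logged values.
    to_remove = sorted(current - set(logged))
    return to_add, to_remove


# Made-up sample data for illustration.
saved_whitelist = ["/home", "/cart", "/old-promo"]
new_log = {"/home": 90, "/cart": 30, "/new-promo": 12, "/typo": 1}

to_add, to_remove = whitelist_changes(saved_whitelist, new_log)
print(to_add)     # ['/new-promo']
print(to_remove)  # ['/old-promo']
```

Treat the output as suggestions to review rather than changes to apply blindly; a value that is missing from one log sample may simply be seasonal.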
- When the values appear to stabilize, you can turn off logging
of values.