Data masking concepts, methods, and types

Use Test Integrations and APIs to add a data mask to particular fields in recorded messages that may contain sensitive data. When you add a mask, you prevent that sensitive data from appearing in any assets created from those messages, such as tests, test data, stubs, and data models.

Data masking in Test Integrations and APIs is a form of data substitution. You replace data in specified fields with non-sensitive versions of that data. Replacement values can be generated in several ways, and Test Integrations and APIs ensures that these replacement values are used consistently within your project.

Consider the example of an airline booking system that passes credit card numbers among different services. If you are using dummy data to test the booking system, visibility of certain data may not matter. However, if you are creating a test data set from data recorded in a production system, the inclusion of certain real data, such as credit card numbers, could be a problem.

Note: Data masking applies only when recording data in the Recording Studio perspective. It does not apply in other parts of the Test Integrations and APIs user interface where the affected fields may be displayed, for example, the Test Factory, Test Lab, and Results Gallery perspectives, and the Message Differences window. Further, if you set up a schema specific data mask and you run a test that receives data using that schema, or if you set up watch mode testing to receive data using that schema, Test Integrations and APIs will not mask that data.

Data masking concepts

The following table outlines key data masking concepts in Test Integrations and APIs.

Concept Description

Integrity

Integrity provides you with the capability to replace data in a masked field consistently across multiple messages and message types within a Test Integrations and APIs user interface session.

Consider the following scenario:

  • You have a series of messages passing among four systems. The messages form a chain and, for testing purposes, a specific credit card number must be the same in all four systems.
  • If masking replaces the same credit card number in the four different message types with four different credit card numbers, the integrity of the message flow is broken.
  • To resolve this problem in your Test Integrations and APIs project, you can take the following two steps:
    1. Specify that the "credit card" field in each message type is to be masked.
    2. Assign the same "label" to each of them.

Integrity map

When the Recording Studio perspective is used with data masks, the value in a message and its corresponding label are examined to determine whether this combination has already been masked:

  • If a specific value/label combination has been masked already, the previous masked value is reused.
  • If a specific value/label combination has not been masked, a masked value is generated and stored against this value/label so that it can be retrieved later. This is the "integrity map". The integrity map is held in memory and is valid only for a single Test Integrations and APIs user interface session.
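
The lookup described above can be sketched as a small in-memory map. This is only an illustration of the concept, not the product's implementation: the `IntegrityMap` class, its `mask` method, and the `MASKED-nnnn` value format are all hypothetical names chosen for this example.

```python
import itertools


class IntegrityMap:
    """Hypothetical in-memory map from (label, original value) to masked value."""

    def __init__(self, generate):
        self._generate = generate  # assumed callable that returns a new masked value
        self._masked = {}          # lives only as long as this object (one "session")

    def mask(self, label, value):
        key = (label, value)
        if key not in self._masked:
            # First time this value/label combination is seen: generate and store.
            self._masked[key] = self._generate()
        # Otherwise the previously stored masked value is reused.
        return self._masked[key]


counter = itertools.count(1)
imap = IntegrityMap(lambda: f"MASKED-{next(counter):04d}")

first = imap.mask("credit card", "4111111111111111")
again = imap.mask("credit card", "4111111111111111")  # same value, same label
other = imap.mask("credit card", "5500000000000004")  # different value
```

Because the key combines the label with the original value, fields in different message types that share a label receive the same replacement for the same original value.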

Data masking methods

Test Integrations and APIs supports the following methods of replacing data in fields.

Method Description

Fixed value substitution

This is a simple and straightforward method where you enter a value to mask the "real value" of a field. The value will replace any data recorded for that field.

Data source substitution

This method uses a reference to a data source, such as a Microsoft Excel spreadsheet or a .csv file, to supply a list of values that can be used to replace recorded data.

This method can be configured to allow substitutions in either of the following two ways:
  • Random substitutions from the test data set each time that the substitution is performed
  • The same substitution for a given value each time that the substitution is performed
The default behavior of this method is as follows:
  • Until the "end of data" is reached, duplicate values in the test data set are not used, in order to maintain integrity. Each value in the data set is checked to determine whether it has already been read by Test Integrations and APIs.
  • If a duplicate value is identified, a different data mask value is used.
  • However, after the end of data is reached and the data set has looped, Test Integrations and APIs reverts to taking the next value because the looping has broken the integrity.
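
One possible reading of this default behavior is sketched below. It is an interpretation under stated assumptions, not the product's algorithm: the `DataSourceMask` class and the inline CSV content are hypothetical, and only the "same substitution for a given value" mode is modeled (random substitution is omitted).

```python
import csv
import io


class DataSourceMask:
    """Hypothetical masker that draws replacements from a list of values."""

    def __init__(self, values):
        self._values = list(values)
        if not self._values:
            raise ValueError("the data source must contain at least one value")
        self._pos = 0
        self._looped = False  # True once the end of data has been reached
        self._used = set()    # values already handed out as masks
        self._assigned = {}   # original value -> masked value

    def _next_value(self):
        while True:
            if self._pos >= len(self._values):
                # End of data: wrap around and stop skipping duplicates.
                self._pos = 0
                self._looped = True
            candidate = self._values[self._pos]
            self._pos += 1
            if self._looped or candidate not in self._used:
                self._used.add(candidate)
                return candidate

    def mask(self, original):
        # The same original value always receives the same replacement.
        if original not in self._assigned:
            self._assigned[original] = self._next_value()
        return self._assigned[original]


# Assumed data source: a one-column CSV that contains a duplicate ("Alice").
source = io.StringIO("name\nAlice\nBob\nAlice\nCarol\n")
names = [row["name"] for row in csv.DictReader(source)]

masker = DataSourceMask(names)
results = [masker.mask(v) for v in ["x1", "x2", "x3", "x1", "x4"]]
# results == ["Alice", "Bob", "Carol", "Alice", "Alice"]
```

In this run the second "Alice" row is skipped for `x3`, the repeated `x1` reuses its earlier replacement, and `x4` receives "Alice" again only because the data set has looped.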

Automatic value creation

This method involves creating a regular expression (Regex) or using an existing one, which Test Integrations and APIs uses to create new values that match that expression.

For example, to replace a credit card number, you can specify a Regex that generates a new 16-digit number.

As with data source substitution, this method can be set up to maintain data integrity by ensuring that the same original values are always replaced with the same substitutions every time.
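
The idea can be illustrated with a minimal sketch. It assumes a deliberately tiny pattern subset (literal characters plus `\d` and `\d{n}`) because the Python standard library has no general "generate a string from a regex" function; the `generate` function and `RegexMask` class are hypothetical and do not represent how Test Integrations and APIs creates values.

```python
import random
import re

# Tokenizer for the assumed subset: "\d", "\d{n}", or any single literal character.
TOKEN = re.compile(r"\\d(?:\{(\d+)\})?|(.)")


def generate(pattern, rng):
    """Build a string that matches a pattern from the tiny supported subset."""
    out = []
    for m in TOKEN.finditer(pattern):
        if m.group(2) is not None:
            out.append(m.group(2))  # literal character, copied as-is
        else:
            count = int(m.group(1) or 1)
            out.extend(rng.choice("0123456789") for _ in range(count))
    value = "".join(out)
    # Reject patterns outside the supported subset instead of returning bad data.
    if not re.fullmatch(pattern, value):
        raise ValueError(f"unsupported pattern: {pattern!r}")
    return value


class RegexMask:
    """Hypothetical masker: same original value, same generated replacement."""

    def __init__(self, pattern, seed=None):
        self._pattern = pattern
        self._rng = random.Random(seed)
        self._cache = {}

    def mask(self, original):
        if original not in self._cache:
            self._cache[original] = generate(self._pattern, self._rng)
        return self._cache[original]


masker = RegexMask(r"\d{16}", seed=1)
masked = masker.mask("4111111111111111")
masked_again = masker.mask("4111111111111111")
```

The cache is what provides the integrity behavior: the generator is random, but each original value is generated for only once.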

Data mask types

In Test Integrations and APIs, a data mask can be either schema specific or non-schema specific:
  • A schema specific data mask applies when recording messages that are known to be from a particular schema.
  • In contrast, creating a non-schema specific data mask involves defining a rule or path that is matched across all message elements.

The following table outlines which perspectives and views in Test Integrations and APIs are used to create, view, modify, and delete schema specific and non-schema specific data masks.

Perspective, view                      Schema specific                 Non-schema specific

Architecture School, Schema Library    Create, view, modify, delete    (Not applicable)

Architecture School, Rule Cache        (Not applicable)                View, modify, delete

Recording Studio                       Create, view, delete            Create, view, delete

Note: If Test Integrations and APIs uses permissions to control which users can create, modify, or delete data masks, you may need to request these permissions.