Masking

The Masking node is used to encrypt a value so that it can't be seen. Masking is an important part of the data flow pipeline for many organizations. Sometimes, due to things like government regulations or privacy concerns, it's necessary to hide protected data in test environments, so that this data is not seen by non-production users like developers, partners, customers, etc. Examples of data that often need to be hidden include names, phone numbers, addresses, credit card details, insurance details, and more.

Pyramid employs deterministic data masking, meaning that if the same value is masked from different columns or tables, it will always be replaced by the same mask value. This means that the masked values can still be analyzed logically, even if the actual value isn't seen. For instance, say you have two columns listing staff names in 2 different tables, called 'Staff' and 'Managers'. Some employees are listed in both these tables; one such employee is 'Jane Smith'. In both tables, Jane Smith is masked as '12345abcde', meaning this employee can be analyzed logically from both tables.

To perform data masking, configure the Masking node on the relevant table; for all masked columns, the original string will be replaced by a random string of letters and numbers.

How to Configure a Masking Node

Connect the Masking node to the Select node representing the relevant table. Go to the Properties panel and from the Masking Node window select the relevant columns form the drop-down list. Each row in the given column(s) will be replaced by a random string.