After you create an environment, connection, rule set, and inventory, you mask data.

Masking Overview

To maintain referential integrity (RI), Delphix masks each field independently, based only on that field's own value. Because this masking is repeatable, the same input always produces the same masked output, so RI is maintained automatically (for verbatim matches), even between applications or platforms. For example, to keep values matching between a parent table and its child tables, select the same algorithm to mask both. This maintains referential integrity within the same database. Delphix also preserves integrity across database platforms (between SQL Server and DB2, for example) and between files (such as tab-delimited files) and relational data (such as a column in a SQL Server database); just select the same masking algorithm.

As a practical example, assume you have an SSN column in a Microsoft SQL Server database, an SSN column in a DB2 database, and an SSN field in a tab-delimited file. If the SSN value is 111111111 in both databases and in the file, and you use the same SSN algorithm for all three, the masked value (for example, 801-01-0838) is the same in all three.
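The repeatability described above can be illustrated with a minimal Python sketch. It is not the Delphix SSN algorithm: the function mask_ssn, the key SECRET_KEY, and the keyed-hash approach are assumptions chosen only to show that a deterministic function applied to the same input yields the same masked value, regardless of which source the value came from.

```python
import hashlib
import hmac

# Hypothetical secret key; an assumption for this sketch, not a Delphix setting.
SECRET_KEY = b"example-masking-key"


def mask_ssn(ssn: str, key: bytes = SECRET_KEY) -> str:
    """Deterministically map an SSN to a masked value in NNN-NN-NNNN form."""
    # Normalize so "111-11-1111" and "111111111" are treated as the same value.
    digits = "".join(ch for ch in ssn if ch.isdigit())
    digest = hmac.new(key, digits.encode("ascii"), hashlib.sha256).digest()
    # Reduce the digest to nine decimal digits.
    number = int.from_bytes(digest[:8], "big") % 10**9
    text = f"{number:09d}"
    return f"{text[:3]}-{text[3:5]}-{text[5:]}"


# The same value read from three different sources (hypothetical data).
sql_server_value = mask_ssn("111111111")
db2_value = mask_ssn("111-11-1111")
file_value = mask_ssn("111111111")

print(sql_server_value == db2_value == file_value)
```

Because the output depends only on the input value and the key, no lookup table has to be shared between the three sources to keep them consistent.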

There are two ways to mask data. You can mask data in place, after it has been provisioned to a nonproduction environment, or you can mask it on the fly, as it is moved from a source to a target. The following sections explain these two options.

Masking In Place

With in-place masking, production data that already exists in a nonproduction environment is masked where it resides.

Advantages/Disadvantages:

The main advantage of in-place masking applies when you have already provisioned data to a nonproduction environment and that data contains some production data. Delphix can mask the data in those existing environments. In-place masking masks only the columns you flag in the inventory and leaves the other columns unchanged.

The main disadvantage is that production data might already have been copied into a nonproduction environment when the masking takes place, so sensitive data might exist in the nonproduction environment until the masking is complete.
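The in-place behavior described above can be sketched in Python with an in-memory SQLite database. This is an illustration under stated assumptions, not how the Delphix Engine is implemented: the customers table, the flagged_columns list standing in for the inventory, and the mask_value helper are all hypothetical.

```python
import hashlib
import sqlite3


def mask_value(value: str) -> str:
    """Hypothetical repeatable mask: a truncated SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:9]


# A nonproduction database that already holds unmasked data (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", "111111111"), (2, "Bob", "222222222")],
)

# Stand-in for the inventory: only these columns are masked.
flagged_columns = ["ssn"]

for column in flagged_columns:
    # Column names come from the trusted list above, not from user input.
    rows = conn.execute(f"SELECT id, {column} FROM customers").fetchall()
    for row_id, value in rows:
        # In-place masking rewrites existing rows with UPDATE statements.
        conn.execute(
            f"UPDATE customers SET {column} = ? WHERE id = ?",
            (mask_value(value), row_id),
        )
conn.commit()

result = conn.execute("SELECT id, name, ssn FROM customers ORDER BY id").fetchall()
print(result)
```

Until the loop finishes and the transaction commits, the unmasked values are still present in the table, which is the exposure window noted above.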

On-the-Fly Masking


With on-the-fly masking, you specify the source of the information to be masked, and where the masked data will be loaded. On-the-fly masking is an Extract Transform Load (ETL) process.

The Delphix Engine extracts the data from a source environment, such as a production copy, gold copy, or disaster recovery copy. The source must be a database, not an archived file.

The Delphix Engine transforms, or masks, the data in the memory of the application server on which it resides, and then loads the masked data to the target environment. Delphix does not modify the original source data; only the target data changes.
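The extract, transform, and load steps described above can be sketched in Python with two in-memory SQLite databases. This is a simplified illustration, not the Delphix Engine's implementation: the source and target connections, the customers table, and the mask_value helper are hypothetical.

```python
import hashlib
import sqlite3


def mask_value(value: str) -> str:
    """Hypothetical repeatable mask: a truncated SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:9]


SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT)"

# Source environment, for example a production copy (hypothetical data).
source = sqlite3.connect(":memory:")
source.execute(SCHEMA)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", "111111111"), (2, "Bob", "222222222")],
)
source.commit()

# Empty nonproduction target environment.
target = sqlite3.connect(":memory:")
target.execute(SCHEMA)

# Extract: read rows from the source only.
extracted = source.execute("SELECT id, name, ssn FROM customers ORDER BY id").fetchall()

# Transform: mask the sensitive column in memory.
masked = [(row_id, name, mask_value(ssn)) for row_id, name, ssn in extracted]

# Load: write the masked rows to the target with INSERT statements.
target.executemany("INSERT INTO customers VALUES (?, ?, ?)", masked)
target.commit()

source_rows = source.execute("SELECT ssn FROM customers ORDER BY id").fetchall()
target_rows = target.execute("SELECT ssn FROM customers ORDER BY id").fetchall()
print(source_rows, target_rows)
```

The source is only ever read, and the unmasked values exist only in the extracted list in memory, never in the target database.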

Advantages/Disadvantages:

One advantage of on-the-fly masking is that sensitive production data is never persisted in a nonproduction environment. This method requires only a production source and a nonproduction target environment. Because on-the-fly masking uses only insert statements, it typically performs better than in-place masking, which uses updates.

The main disadvantage of on-the-fly masking is that it requires an active connection to a source production environment or copy.

Masking a Primary Key Column

Because primary keys require unique values, you must mask those columns using a Delphix algorithm that can guarantee uniqueness. Apply the same mapping algorithm to both the primary key column and the foreign key column so that the values in the two columns still match after masking. For information about creating algorithms, see the Delphix Administrator's Guide.
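The requirement above can be illustrated with a small Python sketch of a one-to-one mapping applied to both a primary key column and a foreign key column. This is not the Delphix mapping algorithm: the build_unique_mapping function, the replacement range, and the sample key values are assumptions made for illustration.

```python
import random


def build_unique_mapping(values, seed=42):
    """Map each distinct original value to a distinct replacement value."""
    originals = sorted(set(values))
    rng = random.Random(seed)
    # random.sample draws without replacement, so no replacement repeats.
    replacements = rng.sample(range(100000, 1000000), len(originals))
    return dict(zip(originals, replacements))


# Hypothetical parent primary keys and child foreign keys.
parent_ids = [101, 102, 103]
child_parent_ids = [101, 101, 103]

mapping = build_unique_mapping(parent_ids)

# Apply the same mapping to the primary key and the foreign key columns.
masked_parent_ids = [mapping[value] for value in parent_ids]
masked_child_parent_ids = [mapping[value] for value in child_parent_ids]

print(masked_parent_ids, masked_child_parent_ids)
```

Because the mapping is one-to-one, the masked primary keys stay unique, and because both columns use the same mapping, every masked foreign key still points to an existing masked primary key.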
