Delphix will mask a number of different file types and formats. These include fixed-width, delimited, Excel, Mainframe/VSAM, and XML. The purpose of this document is to provide an overview of general guidelines on how to successfully mask files using Delphix.
Delphix supports 2 masking methodologies, In-Place and On-The-Fly. In-Place requires a single file connection and Delphix will read from that file, mask data in memory, and update the file with the masked data. On-The-Fly requires 2 file connections. One connection for the source file, and one connection to the target where the masked file will be placed. The target file path specified in the connector of the target rule set must point to an existing file. In this scenario, Delphix will read the file from the source connection, mask in memory, and write the masked data to the target file.
Unlike databases files for the most part do not have built in metadata to describe the format of the fields in the file. You must provide this to Delphix so it can update the file appropriately. This is done through the settings tab where you will see a menu item on the left for File Format. Select File Format and you will see options to create a file format or input a file format. This will depend on the type of file and how you want to let Delphix know the format of the file.
Mainframe and XML Files
For Mainframe/VSAM files, you can specify the file format via Input Format option which will import the copybook directly into Delphix. You can input this file from SFTP or FTP. Please select Copybook as the Import Format Type.
For XML files you can also input the file format with the input format option. You can use the file you want to mask as the format. Delphix will input the format of the file directly. You can input this file from SFTP or FTP. Please select XML as the Import Format Type.
Delimited, Excel, Fixed Files
For Delimited, Excel, and Fixed files you can either manually create the format of the file yourself, or you can input a text file which describes the structure of the file to Delphix. To input the file format for delimited or Excel files create a text document with the column names each on its own line. For example:
To input the file format for fixed files create a text document with the column names and the length of each column on its own line. For example:
Then input this file as the file format. The name of the text file will be the name of the file format.
To create a format manually, you can just click the create format button and give the format a name. We will input the details of the format a little later in this document.
Create a File Connector
In your environment you will need to create a connector to access the files. Each file type will require a separate connector. When you create the connector, scroll down to the particular file type you desire and select it. After you select this you will need to select the connection mode. This can be either SFTP or FTP, depending on which protocol you'd like to use to connect to the server where the files are stored.
These connection modes will require additional information. We provide a test connection button to test the validity of the connection.
If you are doing in-place masking the file(s) will be masked and updated in the directory pointed to by the connector. If you plan to do on-the-fly masking then you will need to create a separate environment and connector to be the source for the files to be masked. The masked files will get put into the directory pointed to by the connector you created previously (the target). However, the file path specified in the connector of the target rule set must point to an existing file the target directory. It does not have to be a copy of the file, just an entry in the directory with the same name. It will be replaced by the masked file.
Create a Ruleset
Once you create a connector, you can click on the ruleset tab and create a ruleset. Click on create ruleset, give it a name, and provide the file connector you previously created. Once you do this you will see a list of files that the connector points to. You can select a single file, multiple files, or all the files. Once you click save the ruleset with the file or files will be saved.
Once you create a ruleset with a file or set of files, you will need to assign those files to their appropriate file format. This is accomplished by editing the ruleset. When you click on the edit button for the file a popup screen called edit file will appear with the file name. There will be a dropdown for the format so you can select the proper format for the file. If the file is a Mainframe/VSAM file with a copybook you will see a checkbox to signify if the file is variable length. For all other file types, select the end-of-record to let Delphix know whether the file is in windows/dos format (CR+LF) or Linux format (LF). If the file is a delimited file you will have a space to put in the delimiter. If there are multiple files in the ruleset you will have to edit each one individually and assign it to the appropriate file format.
For Excel, Delimited, or Fixed files, if you created the file format by importing it, then the format for the file will be set. When you go to the inventory page and select the rule set and file, you will see a line showing all the records which you can expand to see the inventory. If your file has a header and/or footer, you will need to click Record Type, click Add Record Type, and select Header and/or Footer from the drop-down menu. Then enter a name for this and the number of rows or lines. Now you can assign the appropriate masking algorithms either by running the profiler or setting them manually on the all records section.
If you did not import a format and just created a file format with the Create Format button, you must enter the actual layout of the file into Delphix. You can do this for Excel, Delimited, and Fixed files.
- Navigate to the inventory screen.
- Select the appropriate rule set and file.
- Click Record Type and add the appropriate record types.
- Select Body from the drop-down menu and give it a name, such as "Body."
- If you the columns are defined in your file, you can import the file using the Import button. If not, then just save.
- If your file has a header or footer, you can add those next.
- If you did not import the format, you must enter it manually.
- Click Define Fields.
- When the screen pops up, enter the field name.
- Select the record type (body) and position in the record.
- If the file is a fixed-length file, you must also enter the length of the field.
- Optional: Set the masking here. Alternatively, you can set the masking with the profiler.
For XML or VSAM files, once you select the ruleset and the file, you will see the inventory for the file, and you can edit this inventory with the appropriate masking settings like any Delphix data source by either using the profiler or setting this manually.
For VSAM files, the inventory also allows for the entry of Redefine Conditions, which are used to handle any occurrences of COBOL's REDEFINES construct that might appear in the Copybook. In COBOL, the REDEFINES keyword allows an area of a record to be interpreted in multiple different ways. In the example below, for instance, each record can hold either the details of a person (PERSON-DET) or the details of a company (COMP-DET).
01 CS-CUSTOMER-RECORD. 05 CUST-TYPE PIC X(1). 05 PERSON-DET. 10 PERSON-FIRSTNAME PIC X(20). 10 PERSON-LASTNAME PIC X(40). 10 PERSON-ADDRESS1 PIC X(50). 10 PERSON-CITY PIC X(20). 10 PERSON-STATE PIC X(5). 10 PERSON-ZIP PIC X(10). 10 PERSON-SSN PIC S9(9) COMP-3. 05 COMP-DET REDEFINES PERSON-DET. 10 COMP-ENTITYNM PIC X(53). 10 COMP-ADDRESS1 PIC X(50). 10 COMP-CITY PIC X(20). 10 COMP-STATE PIC X(5). 10 COMP-ZIP PIC X(10). 10 COMP-PHONE PIC X(12).
Depending on which group is present, different masking algorithms may need to be applied. Below is the inventory corresponding to this copybook, which allows algorithms to be selected separately for each group.
In order to do any masking however, the masking engine must be able to determine, for each record, which fields should be read, so that the correct algorithms can be applied. In order to do this, the masking engine uses Redefine Conditions, which are specified in the inventory. Redefine Conditions are boolean expressions which can reference any fields in the record when they are evaluated.
In the example copybook above, the field CUST-TYPE is used to indicate which group is present. If CUST-TYPE holds a 'P', a PERSON-DET group is present, and if it holds a 'C', COMP-DET is present. This can be expressed in the inventory by specifying a Redefine Condition with the value
[CUST-TYPE]='P' . This expression indicates that, for each record read from the source file during the masking job, the value of the field CUST-TYPE should be read and compared against the string 'P'. If it is equal, the masking engine will read from the record the fields subordinate to PERSON-DET, and will apply any masking algorithms specified on those fields. Similarly, a Redefine Condition with the value
[CUST-TYPE]='C' should be applied to the COMP-DET field.
Exactly one of the conditions should evaluate to 'true' for each group of redefined fields. For example, a copybook might have fields A, B REDEFINES A, and C REDEFINES A. Of the Redefine Conditions attached to A, B, and C, one and only one should evaluate to true for each record.
Entering a Redefine Condition
- Click on the orange REDEFINED or REDEF button next to the redefined or redefining field
- Enter a condition in the dialog box which appears. This is the expression, which, when it evaluates to true, causes the subordinate fields to be read and, if they have algorithms assigned, masked.
- Click 'Submit'
Format of Redefine Conditions
Redefine Conditions allow fields to be compared against either number or string literals. Square brackets enclosing a field name indicate a variable, which takes on the value of the named field:
[Field1] = 'An example String'
String literals can be enclosed in either single or double quotes. For fields that are numeric (e.g. PIC S99V9), the operators
>= can be used in addition to the
= operator, e.g.
[Field2] <= -10.5
Also, conditions can be joined using AND, OR, and NOT to form more complex conditions:
([Field3] > 2.5 AND [Field3] < 10) OR NOT [FIELD4] = 'Z'