An integral part of the data masking process is to use algorithms to mask each data element. You specify which algorithm to use on each individual data element (domain) on the Masking tab. There, you define a unique domain for each element and then associate the classification and algorithm you want to use for each domain. Use the Algorithm Settings tab to create or delete algorithms.

Algorithm Settings Tab

The Algorithm tab displays algorithm Names along with Type and Description. This is where you add (or create) new algorithms. The default Masking Engine algorithms and any algorithms you have defined appear on this tab.
  • All algorithm values are stored encrypted. These values are only decrypted during the masking process.

Adding New Masking Engine Algorithms

You might want to create a new algorithm if none of the default Masking Engine algorithms meet your needs.
Masking Engine Algorithm Frameworks give you the ability to quickly and easily define the algorithms you want, directly on the Settings page. You can then immediately propagate them, so that anyone in your organization who has access to the Masking Engine can use them.

Administrators can update system defined algorithms. User-defined algorithms can be accessed by all users and updated by the owner/user who created the algorithm.


To add an algorithm:

  1. Click Add Algorithm at the top right of the Algorithm tab.
  2. Select an algorithm type.
  3. Complete the form to the right (corresponding to your selected algorithm).
  4. Click Save.

Secure Lookup Algorithm

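A secure lookup algorithm is a proprietary encrypt/hash/modulus algorithm that is repeatable (the same input always masks to the same output) and designed so that the original value cannot be derived from the masked value. It lets you assign a realistic value from a list of predefined values. Because different inputs can receive the same masked value, use a secure lookup algorithm when you do not need unique values.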
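The engine's actual algorithm is proprietary, so the following is only a conceptual sketch of the hash-and-modulus idea. It assumes an HMAC-SHA256 keyed hash; the function name `secure_lookup`, the key, and the input value are hypothetical and are not part of the Masking Engine.

```python
import hashlib
import hmac

def secure_lookup(value: str, lookup_values: list[str], key: bytes) -> str:
    """Pick a replacement from lookup_values, repeatably for the same input."""
    # Keyed hash of the original value (an illustrative choice, not the engine's).
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    # The modulus maps the hash onto an index in the lookup list.
    index = int.from_bytes(digest, "big") % len(lookup_values)
    return lookup_values[index]

cities = ["Smallville", "Clarkville", "Farmville", "Townville"]
key = b"example-key"  # hypothetical secret

masked = secure_lookup("Boston", cities, key)
# The same input always yields the same masked value.
assert masked == secure_lookup("Boston", cities, key)
```

Because the list is usually much shorter than the set of possible inputs, different inputs can land on the same index, which is why this style of lookup does not produce unique values.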

To add a secure lookup algorithm:

  1. In the top right of the Algorithm tab, click Add Algorithm.
  2. Choose Secure Lookup Algorithm. The Create SL Rule pane appears.
  3. Enter a Rule Name.

    This name MUST be unique.

  4. Enter a Description.
  5. Specify a Lookup File.
    This file is a single list of values. It does not require a header. Make sure there are no spaces or returns at the end of the last line in the file. The following is sample file content:

    Example Lookup File

    Smallville
    Clarkville
    Farmville
    Townville
    Cityname
    Citytown
    Towneaster
  6. When you are finished, click Save.
  7. Before you can use the algorithm (specify it in a profiling or masking job), you must add it to a domain.

Note

The Masking Engine supports lookup files saved in ASCII or UTF-8 format only. If the lookup file contains non-ASCII characters, save it as UTF-8 with no BOM (byte order mark) so that the Masking Engine reads the Unicode text correctly. Some applications, such as Notepad on Windows, write a BOM at the beginning of Unicode files. The Masking Engine cannot handle the BOM, and a masking job that applies a Secure Lookup algorithm created from such a file fails with SQL update or insert errors.
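One way to prepare a lookup file before uploading it is to strip the BOM and any trailing whitespace yourself. The following is a minimal sketch using only the Python standard library; the helper name `clean_lookup_bytes` and the file path in the usage comment are hypothetical.

```python
import codecs

def clean_lookup_bytes(raw: bytes) -> bytes:
    """Return UTF-8 bytes with no BOM and no trailing spaces or blank lines."""
    # Drop a leading UTF-8 BOM (bytes EF BB BF) if present.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8")
    # Strip trailing whitespace from every line.
    lines = [line.rstrip() for line in text.splitlines()]
    # Remove empty lines at the end of the file.
    while lines and not lines[-1]:
        lines.pop()
    # Join without a final newline so the last line has no trailing return.
    return "\n".join(lines).encode("utf-8")

# Usage (hypothetical path):
# from pathlib import Path
# path = Path("cities.txt")
# path.write_bytes(clean_lookup_bytes(path.read_bytes()))
```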

Segmented Mapping Algorithm

Segmented mapping algorithms let you create unique masked values by dividing a target value into separate segments and masking each segment individually. Optionally, you can preserve the semantically rich part of a value while providing a unique value for the remainder. This is especially useful for primary keys or columns that need to be unique because they are part of a unique index.

When using segmented mapping algorithms for primary and foreign keys, in order to make sure they match, you must use the same segmented mapping algorithm for each.

Segmented Mapping Example

Perhaps you have an account number for which you need to create a segmented mapping algorithm. You can separate the account number into segments, preserving the first two-character segment, replacing a segment with a specific value, and preserving a hyphen. The following is a sample value for this account number:

NM831026-04

Where:

  • NM is a plan code number that you want to preserve, always a two-character alphanumeric code.
  • 831026 is the uniquely identifiable account number. To ensure that you do not inadvertently create actual account numbers, you can replace the first two digits with a sequence that never appears in your account numbers in that location. (For example, you can replace the first two digits with 98 because 98 is never used as the first two digits of an account number.) To do that, you want to split these six digits into two segments.
  • -04 is a location code. You want to preserve the hyphen and you can replace the two digits with a number within a range (in this case, a range of 1 to 77).

Procedure for Defining Segments

  1. Choose 3 for No. of Segment. Remember, you do NOT count the segment(s) you want to preserve.
  2. Preserve the first two characters ("NM" in the sample value). Under Preserve Original Values:
    1. For Starting position, enter 1.
    2. For Length, enter 2.
  3. Define the next two-digit segment ("83" in sample value) to always be 98 or 99.
    1. For Segment 1, select Type > Numeric.
    2. For Length, select 2.
    3. For Mask Values Range#, specify 98,99.
  4. Define the next four-digit segment ("1026" in sample value).
    1. For Segment 2, select Type > Numeric.
    2. For Length, select 4.
    3. Leave range fields empty.
  5. Preserve the hyphen.
    1. Click Add to the right of Preserve Original Values.
    2. For Starting position, enter 9.
    3. For Length, enter 1.
  6. Define the last two-digit segment ("04" in sample value).
    1. For Segment 3, select Type > Numeric.
    2. For Length, select 2.
    3. For Mask Values Min#, enter 1.
    4. For Mask Values Max#, enter 77.

The sample value NM831026-04 might be masked to NM981291-77.
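To make the segment layout concrete, the sketch below splits the sample value into the same preserved and masked parts and masks each one. The mapping functions are illustrative assumptions (a modulus over the allowed values, and an affine permutation for the full-range segment); they are not the Masking Engine's actual mapping, so the output differs from the masked value shown above.

```python
def mask_from_list(segment: str, allowed: list[int]) -> str:
    """Segment 1: map onto one of the listed Range# values (98 or 99)."""
    return str(allowed[int(segment) % len(allowed)]).zfill(len(segment))

def mask_full_range(segment: str) -> str:
    """Segment 2: no range given, so use the full range and keep values unique.

    x -> (7x + 3) mod 10^n is a one-to-one mapping because 7 and 10^n
    share no common factor.
    """
    modulus = 10 ** len(segment)
    return str((7 * int(segment) + 3) % modulus).zfill(len(segment))

def mask_min_max(segment: str, low: int, high: int) -> str:
    """Segment 3: map into the inclusive Min#..Max# range (1 to 77)."""
    return str(low + int(segment) % (high - low + 1)).zfill(len(segment))

def mask_account(value: str) -> str:
    # Positions follow the procedure above (1-based): 1-2 preserved,
    # 3-4 segment 1, 5-8 segment 2, 9 preserved hyphen, 10-11 segment 3.
    plan, seg1, seg2, hyphen, seg3 = (
        value[0:2], value[2:4], value[4:8], value[8:9], value[9:11]
    )
    return (
        plan
        + mask_from_list(seg1, [98, 99])
        + mask_full_range(seg2)
        + hyphen
        + mask_min_max(seg3, 1, 77)
    )

print(mask_account("NM831026-04"))  # NM997185-05
```

Only the full-range segment is one-to-one in this sketch; the two range-limited segments can map different inputs to the same output.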

Segmented Mapping Procedure

  1. In the upper right-hand region of the Algorithm tab, click Add Algorithm.
  2. Select Segmented Mapping Algorithm. The Segmented Mapping pane appears.
  3. Enter a Rule Name.
  4. Enter a Description.
  5. From the No. of Segment drop-down menu, select how many segments you want to mask.

    This number does NOT include the values you want to preserve.

    The minimum number of segments is 2; the maximum is 9.
    A box appears for each segment.

  6. For each segment, choose the Type of segment from the dropdown: Numeric or Alphanumeric.

    Numeric segments are masked as whole segments. Alphanumeric segments are masked by individual character.

  7. For each segment, select its Length (number of characters) from the drop-down menu. The maximum is 4.
  8. Optionally, for each segment, specify range values. You might need to specify range values to satisfy particular application requirements, for example. See details below.
  9. Preserve Original Values by entering Starting position and length values. (Position starts at 1.) For example, to preserve the second, third, and fourth values, enter Starting position 2 and length 3.
    1. If you need additional value fields, click Add.
  10. When you are finished, click Save.
  11. Before you can use the algorithm (specify it in a profiling or masking job), you must add it to a domain. If you are not using the Masking Engine Profiler to create your inventory, you do not need to associate the algorithm with a domain. 

Specifying Range Values

You can specify ranges for Real Values and Mask Values. With Real Values ranges, you can specify all the possible real values to map to the ranges of masked values. Any values NOT listed in the Real Values ranges would then mask to themselves.

Specifying range values is optional. If you need unique values (for example, masking a unique key column), you MUST leave the range values blank. If you plan to certify your data, you must specify range values.

When determining a numeric or alphanumeric range, remember that a narrow range will likely generate duplicate values, which will cause your job to fail.
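A quick way to judge whether a range is too narrow is to count how many distinct masked values the segments can produce and compare that with the number of unique input values. The sketch below does this arithmetic for numeric segments; the function names are hypothetical, and the check is only a lower bound (duplicates may still occur below this limit, depending on how the engine maps values).

```python
import math

def numeric_capacity(length, min_value=None, max_value=None, listed=None):
    """Number of distinct masked values one numeric segment can produce."""
    if listed is not None:
        return len(set(listed))            # explicit Range# values
    if min_value is not None and max_value is not None:
        return max_value - min_value + 1   # inclusive Min#..Max# range
    return 10 ** length                    # no range: full range, e.g. 0-9999

def duplicates_unavoidable(unique_inputs: int, capacity: int) -> bool:
    """True when there are more unique inputs than possible masked values."""
    return unique_inputs > capacity

# Segments from the account number example: 98 or 99, a full 4-digit
# range, and 1 to 77.
capacities = [
    numeric_capacity(2, listed=[98, 99]),
    numeric_capacity(4),
    numeric_capacity(2, min_value=1, max_value=77),
]
total = math.prod(capacities)
print(total)  # 1540000
```

With these settings, a column holding more than 1,540,000 unique account numbers could not be masked without duplicates.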

  1. To ignore specific characters, enter one or more characters in the Ignore Character List box. Separate values with a comma.
  2. To ignore the comma character (,), select the Ignore comma (,) check box.
  3. To ignore control characters, select Add Control Characters.
    The Add Control Characters window appears.
  4. Select the individual control characters that you would like to ignore, or choose Select All or Select None.
  5. When you are finished, click Save.
    You are returned to the Segmented Mapping pane.

Numeric segment type

  • Min# — A number; the first value in the range. Value can be 1 digit or up to the length of the segment. For example, for a 3-digit segment, you can specify 1, 2, or 3 digits. Acceptable characters: 0-9.
  • Max# — A number; the last value in the range. Value should be the same length as the segment. For example, for a 3-digit segment, you should specify 3 digits. Acceptable characters: 0-9.
  • Range# — A range of numbers; separate values in this field with a comma (,). Value should be the same length as the segment. For example, for a 3-digit segment, you should specify 3 digits. Acceptable characters: 0-9.

    If you do not specify a range, the Masking Engine uses the full range. For example, for a 4-digit segment, the Masking Engine uses 0-9999.

Alphanumeric segment type

  • Min# — A number from 0 to 9; the first value in the range.
  • Max# — A number from 0 to 9; the last value in the range.
  • MinChar — A letter from A to Z; the first value in the range.
  • MaxChar — A letter from A to Z; the last value in the range.
  • Range# — A range of alphanumeric characters; separate values in this field with a comma (,). Individual values can be a number from 0 to 9 or an uppercase letter from A to Z. (For example, B,C,J,K,Y,Z or AB,DE.)

    If you do not specify a range, the Masking Engine uses the full range (A-Z, 0-9). If you do not know the format of the input, leave the range fields empty. If you know the format of the input (for example, always alphanumeric followed by numeric), you can enter range values such as A2 and S9.

Mapping Algorithm

A mapping algorithm sequentially maps original data values to masked values that are pre-populated to a lookup table through the Masking Engine user interface. With the mapping algorithm, you must supply AT MINIMUM the same number of values as the number of unique values you are masking; supplying more is acceptable. For example, if there are 10,000 unique values in the column you are masking, you must give the mapping algorithm AT LEAST 10,000 values.
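The sequential assignment and the minimum-size requirement can be illustrated with a short sketch. It assumes values are assigned in order of first appearance, which the text above does not specify; the function name `build_mapping` and the sample data are hypothetical.

```python
def build_mapping(original_values, lookup_values):
    """Assign lookup values to unique originals in order of first appearance."""
    # De-duplicate while keeping order (dict keys preserve insertion order).
    unique_originals = list(dict.fromkeys(original_values))
    if len(lookup_values) < len(unique_originals):
        raise ValueError(
            f"need at least {len(unique_originals)} lookup values, "
            f"got {len(lookup_values)}"
        )
    # Extra lookup values are simply left unused.
    return dict(zip(unique_originals, lookup_values))

column = ["Boston", "Austin", "Boston"]
lookup = ["Smallville", "Clarkville", "Farmville"]
mapping = build_mapping(column, lookup)
print([mapping[v] for v in column])  # ['Smallville', 'Clarkville', 'Smallville']
```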

To add a mapping algorithm:

  1. In the upper right-hand corner of the Algorithm tab, click Add Algorithm.
  2. Select Mapping Algorithm.
    The Create Mapping Algorithm pane appears.
  3. Enter a Rule Name. This name MUST be unique.
  4. Enter a Description.
  5. Specify a Lookup File (.txt).
    The value file must have NO header. Make sure there are no spaces or returns at the end of the last line in the file. The following is sample file content. Notice that there is no header and only a list of values.

    Smallville
    Clarkville
    Farmville
    Townville
    Cityname
    Citytown
    Towneaster
  6. To ignore specific characters, enter one or more characters in the Ignore Character List box. Separate values with a comma.
  7. To ignore the comma character (,), select the Ignore comma (,) check box.
  8. When you are finished, click Save.

Before you can use the algorithm by specifying it in a profiling or masking job, you must add it to a domain. If you are not using the Masking Engine Profiler to create your inventory, you do not need to associate the algorithm with a domain.

See Adding New Domains.

Binary Lookup Algorithm

A Binary Lookup Algorithm is much like the Secure Lookup Algorithm, but is used when entire files are stored in a specific column. This is useful for masking binary columns – for example, blob, image, varbinary, and so forth.

To add a binary lookup algorithm:

  1. At the top right of the Algorithm tab, click Add Algorithm.
  2. Select Binary Lookup Algorithm.
    The Binary SL Rule pane appears.
  3. Enter a Rule Name.
  4. Enter a Description.
  5. Select a Binary Lookup File on your filesystem.
  6. Click Save.

Tokenization Algorithm

Tokenization uses reversible algorithms so that the data can be returned to its original state. Tokenization is a form of encryption in which the actual data (for example, names and addresses) is converted into tokens that have similar properties to the original data – such as text and length – but no longer convey any meaning.
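The text does not describe how the engine generates tokens, so the sketch below swaps in a simple in-memory token vault to show the two properties mentioned: tokens keep the shape of the original value, and the original can be recovered. The class `TokenVault` and its methods are hypothetical.

```python
import secrets
import string

class TokenVault:
    """Illustrative reversible tokenization backed by a lookup table."""

    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def _random_token(self, value: str) -> str:
        # Keep length and character class: letters stay letters, digits stay
        # digits, everything else (spaces, hyphens) is copied unchanged.
        return "".join(
            secrets.choice(string.ascii_letters) if ch.isalpha()
            else secrets.choice(string.digits) if ch.isdigit()
            else ch
            for ch in value
        )

    def tokenize(self, value: str) -> str:
        if value in self._to_token:
            return self._to_token[value]  # repeatable for the same input
        for _ in range(1000):  # bounded retries to avoid an endless loop
            token = self._random_token(value)
            if token not in self._to_value:
                break
        else:
            raise RuntimeError("could not find an unused token")
        self._to_token[value] = token
        self._to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._to_value[token]

vault = TokenVault()
token = vault.tokenize("Bob Jones")
assert vault.detokenize(token) == "Bob Jones"
```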

To add a Tokenization algorithm:

  1. Enter algorithm Name.
  2. Enter a Description.
  3. Click Save.

Once you have created an algorithm, you will need to associate it with a domain.

  1. Navigate to the Home>Settings>Domains page and click Add Domain. A popup appears.
  2. Enter a domain name.
  3. From the Tokenization Algorithm Name drop-down menu, select your algorithm.

Create a Tokenization Environment

  1. On the home page, click Environments.
  2. Click Add Environment.
  3. For Purpose, select Tokenize/Re-Identify.
  4. Click Save.

    This environment will be used to re-identify your data when required.

  5. Set up a Tokenize job using the tokenization method, and then execute the job.
      


Figure: Snapshot of the data before and after tokenization.

Min Max Algorithm

The Masking Engine provides a "Min Max Algorithm" to normalize data within a range – for example, 10 to 400. This algorithm lets you make sure that all the values in the database fall within a specified range. It prevents unique identification of individuals by characteristics that are outside the normal range – for example, age over 99.

If the Out of range Replacement Values checkbox is selected, a default value is used when the input cannot be evaluated.

  1. Enter the Algorithm Name.
  2. Enter a Description.
  3. Enter Min Value and Max Value.
  4. Click Out of range Replacement Values.
  5. Click Save.

Example: To keep ages within 18 years or less, enter Min Value 0 and Max Value 18.
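The following sketch shows one plausible reading of this behavior. It assumes that out-of-range values are moved to the nearest bound and that input that cannot be evaluated as a number receives a default replacement; neither detail is confirmed by the text above, and the function name `min_max` is hypothetical.

```python
def min_max(value, min_value, max_value, default=None):
    """Normalize value into [min_value, max_value] (assumed clamping behavior)."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Input cannot be evaluated: fall back to the replacement value.
        return default
    # Assumption: values outside the range are moved to the nearest bound.
    return min(max(number, min_value), max_value)

print(min_max("105", 0, 18))             # 18
print(min_max("7", 0, 18))               # 7.0
print(min_max("n/a", 0, 18, default=0))  # 0
```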

Data Cleansing Algorithm

The Masking Engine provides a data-based lookup algorithm for data cleansing. Use it when the target data needs to be put in a standard format prior to masking. For example, "Ariz," "Az," and "Arizona" can all be cleansed to "AZ."

  1. Enter Algorithm Name.
  2. Enter a Description.
  3. Select Lookup File location.
  4. Enter the Delimiter. The default key and value separator is =. You can change this to match the lookup file.
  5. Click Save.

Below is an example of a lookup input file. It does not require a header. Make sure there are no spaces or returns at the end of the last line in the file.

Example Lookup File

NYC=NY
NY City=NY
New York=NY
Manhattan=NY
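The lookup file format above can be read into a key-to-value table with a few lines of code. This is a minimal sketch; the function names are hypothetical, and passing unmatched values through unchanged is an assumption rather than documented engine behavior.

```python
def parse_cleansing_file(text: str, delimiter: str = "=") -> dict[str, str]:
    """Parse 'key<delimiter>value' lines into a lookup table."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, separator, value = line.partition(delimiter)
        if not separator:
            raise ValueError(f"missing delimiter {delimiter!r} in line: {line!r}")
        mapping[key.strip()] = value.strip()
    return mapping

def cleanse(value: str, mapping: dict[str, str]) -> str:
    # Assumption: values with no entry in the lookup file are left as-is.
    return mapping.get(value, value)

lookup_text = "NYC=NY\nNY City=NY\nNew York=NY\nManhattan=NY"
table = parse_cleansing_file(lookup_text)
print(cleanse("NY City", table))  # NY
print(cleanse("Boston", table))   # Boston
```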

Free Text Algorithm

The Masking Engine can mask free text or comment fields in flat files and database sources. This algorithm is for masking or redacting free text columns or files. Masking can be performed on the basis of either a Black List, which specifies the words to mask, or a White List, which specifies the words to exclude from being masked. For either option, you can import a list of words from an external text file. Alternatively, you can use Profiler Sets to match words based on regular expressions defined within Profiler Expressions. You can also specify the redaction value that replaces the masked words.

Regular expressions defined via Profiler Sets will match individual words within the input text, rather than phrases.

  1. Enter Algorithm Name.
  2. Enter a Description.
  3. Select the Black List or White List radio button.
  4. Select Lookup File and enter Redaction Value, and/or select Profiler Sets from the drop-down menu and enter Redaction Value.
  5. Click Save.

Free Text Redaction Example

  1. Create an input file.
    1. Use Notepad to enter the following text:
      "The customer Bob Jones is satisfied with the terms of the sales agreement. Please call to confirm at 718-223-7896."
    2. Save the file as a TXT file.
  2. Create a lookup file.
    1. Use Notepad to create a file and save it as a TXT file. Be sure to press Return after each entry. The lookup file contains the following data:
      Bob
      Jones
      Agreement

Create an Algorithm

You will be prompted for the following information:

  1. For Algorithm Name, enter Blacklist_Test1.
  2. For Description, enter Blacklist Test.
  3. Select the Black List radio button.
  4. Select LookUp File.
  5. Enter redaction value XXXX.
  6. Click Save.

Create Rule Set

  1. From the job page, go to Rule Set and click Create Rule Set.
  2. For Rule Set Name, enter Free_Text_RS.
  3. From the Connector drop-down menu, select Free Text.
  4. Select the Input File by clicking the box next to your input file.
  5. Click Save.

Create Masking Job

  1. Use the Free_Text_RS rule set.
  2. Execute the masking job.

The results of the masking job will show the following:

Redacted Input File: The customer xxxx xxxx is satisfied with the terms of the sales xxxx. Please call to confirm at 718-223-7896.
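The Black List behavior in this example can be approximated with a short sketch. It assumes word-by-word, case-insensitive matching (inferred from "Agreement" in the lookup file matching "agreement" in the text) and inserts the redaction value exactly as given; the function name `redact` is hypothetical and this is not the engine's implementation.

```python
import re

def redact(text: str, black_list: list[str], redaction: str) -> str:
    """Replace every word found in black_list with the redaction value."""
    # Normalize the list once for case-insensitive comparison.
    words = {word.strip().lower() for word in black_list if word.strip()}

    def replace(match: re.Match) -> str:
        word = match.group(0)
        return redaction if word.lower() in words else word

    # \w+ matches individual words, so punctuation and spacing are preserved.
    return re.sub(r"\w+", replace, text)

source = (
    "The customer Bob Jones is satisfied with the terms of the sales "
    "agreement. Please call to confirm at 718-223-7896."
)
print(redact(source, ["Bob", "Jones", "Agreement"], "XXXX"))
# The customer XXXX XXXX is satisfied with the terms of the sales XXXX. Please call to confirm at 718-223-7896.
```

A White List variant would invert the test and redact every word that is not in the list.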

"Bob," "Jones," and "agreement" are redacted.