Introduction and Use Cases

Your organization may have more than one masking engine, and in certain circumstances it may want to coordinate the operation of those engines. There are two specific scenarios in which an organization will want to have some level of interaction and orchestration between multiple masking engines:
  • Software Development Life Cycle (SDLC)

  • Distributed Execution

For both of these use cases, you will need to be able to move various objects between masking engines. These objects may include the following:

  • Algorithms
  • Masking Jobs
  • Connectors
  • Rulesets
  • Inventory
  • Domains
  • Profile Expressions
  • Profile Sets

You can move a subset of these objects between engines using the Masking V5 APIs.

See the following sections for instructions.

Software Development Life Cycle (SDLC)

Using an SDLC process often requires setting up multiple masking engines, each for a different part of the cycle (Development, QA, Production).

Distributed Execution

For many organizations, the size of the profiling and masking workloads requires more than one production masking engine, or the data sources are in different administrative or geographic areas, making multiple engines more appropriate. These masking engines can be identical in configuration or only partially equivalent, depending on the organization's needs.

Concepts

Syncable Object

Syncable objects are external representations of objects within the Delphix Masking Engine that can be exported from one engine and imported into another. The Delphix Masking Engine supports the concept of EngineSync, which is the ability to coordinate the use of masking algorithms across multiple Delphix Engines. EngineSync currently supports exporting a subset of algorithms and the encryption key as syncable objects.

Dependencies

Most objects within the Delphix Masking Engine are compositional. In order to properly capture the behavior of a syncable object, you must export its dependencies along with the object itself. For example, when exporting a lookup algorithm, you must also export the encryption key to capture the behavior of the algorithm properly.

Object Identifiers and Types

EngineSync uses object identifiers to name unique objects within the engine. The following object types are currently supported:

  • KEY
  • Certain algorithms:
    • BINARYLOOKUP - Refers to Binary Lookup algorithms.
    • LOOKUP - Refers to secure lookup algorithms, the most commonly used type of algorithm. When this algorithm replaces real, sensitive data with fictional data, it can create repeating data patterns, known as “collisions.” For example, the names “Tom” and “Peter” could both be masked as “Matt.” Because names and addresses naturally recur in real data, this mimics an actual data set.
    • SEGMENT - Refers to segment mapping algorithms, which produce no overlaps or repetitions in the masked data. They let you create unique masked values by dividing a target value into separate segments and masking each segment individually.
    • TOKENIZATION - Refers to tokenization algorithms, the only type of algorithm that allows you to reverse its masking. For example, you can use a tokenization algorithm to mask data before you send it to an external vendor for analysis.
    • MAPPLET - Refers to custom algorithms. Custom algorithms (mapplets) are syncable between masking engines if they are self-contained in the mapplet implementation file.

Object Revision Tracking

Every syncable object in the Delphix Masking Engine has a revision hash associated with it. The revision hash is provided in calls to GET /syncable-objects and POST /export endpoints. The revision hash field provides a way to detect when an object has changed and thus, in some workflows, the object might need to be resynchronized. The revision hash of an object will change whenever the object itself changes, or whenever one of its dependent objects changes.

For example, if the object is a secure lookup algorithm, the revision hash will change if the algorithm itself is updated. It will also change if the encryption key, which the algorithm depends on, is changed.

Revision hashes are generated randomly when objects are created and are preserved when an object is exported from one engine and imported into another. Therefore, if two objects have the same revision hash, the objects are identical. The converse does not hold: two independently created objects can have different revision hashes and otherwise be identical.

For example, it is possible to manually create secure lookup algorithms on two engines that behave identically (assuming the engines already have the same encryption key) but since they were created independently and not synchronized, their revision hashes may be different.
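One way to use revision hashes in a synchronization workflow is to compare the GET /syncable-objects listings of two engines. The following Python sketch assumes both listings have already been fetched and parsed from JSON. The helper names object_key and find_out_of_sync are illustrative and are not part of the Masking API.

```python
import json


def object_key(obj):
    # Object identifiers are small JSON objects; serialize them with sorted
    # keys so that (type, identifier) can be used as a dictionary key.
    return (obj["objectType"], json.dumps(obj["objectIdentifier"], sort_keys=True))


def find_out_of_sync(source_objects, target_objects):
    """Return source entries that are missing on the target or whose
    revision hash differs from the target's."""
    target_hashes = {object_key(o): o["revisionHash"] for o in target_objects}
    return [o for o in source_objects
            if target_hashes.get(object_key(o)) != o["revisionHash"]]


# Sample listings in the shape returned by GET /syncable-objects.
engine_a = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"},
    {"objectIdentifier": {"algorithmName": "AccNoLookup"}, "objectType": "LOOKUP",
     "revisionHash": "485343f1a68698497946f4f70d1cfdd76d516fd8"},
]
engine_b = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"},
]

stale = find_out_of_sync(engine_a, engine_b)
print([o["objectIdentifier"] for o in stale])
```

An object reported here is only a candidate for resynchronization. Independently created objects can differ in revision hash and still behave identically.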

Export Document

You can export one or more syncable objects that are listed in the /syncable-objects endpoint as a binary object called an export document. The export document will include the set of objects that you requested for export and all dependencies that are required to properly import those objects into another engine.

The export document is exported as an opaque blob. Do not edit it outside of the Delphix Masking Engine.

Export Document Encryption

You can request that the export document be encrypted using a passphrase. Once the document is encrypted with the passphrase, the engine forgets the passphrase. You will need to provide the same passphrase during import to decrypt the document.

Digital Signature

In order to detect accidental or malicious modification of the export document, each document is digitally signed. If the export document does not match its expected digital signature, a Masking Engine will not import the document.

Endpoints

For more detailed API documentation, please refer to the Masking API Cookbook.

GET /syncable-objects

This endpoint lists all objects in an engine that are syncable and can be exported. Any object that can be exported can be imported into another engine. The endpoint has no parameters. Each object is listed with its revision hash.

POST /export

This endpoint exports one or more objects in a single batch. It returns an export document and a set of metadata that describes what was exported. Specify which objects to export by copying their object identifiers from the output of the /syncable-objects endpoint.

The endpoint has a single optional header, a passphrase. If you provide the passphrase, the export document will be encrypted using it.
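Copying identifiers by hand is error-prone when many objects are involved. The following Python sketch shows one way to build an export request body from a parsed GET /syncable-objects listing. The function name select_for_export is hypothetical, and selecting by algorithmName is a choice made for this example only.

```python
def select_for_export(syncable_objects, algorithm_names):
    """Pick entries from a GET /syncable-objects listing by algorithm name.

    The selected entries are returned unchanged, so they can be sent as the
    body of POST /export.
    """
    wanted = set(algorithm_names)
    selected = [o for o in syncable_objects
                if o["objectIdentifier"].get("algorithmName") in wanted]
    found = {o["objectIdentifier"]["algorithmName"] for o in selected}
    missing = wanted - found
    if missing:
        # Fail early instead of sending a request for objects the engine
        # did not list as syncable.
        raise KeyError(f"not listed as syncable: {sorted(missing)}")
    return selected


listing = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"},
    {"objectIdentifier": {"algorithmName": "AccNoLookup"}, "objectType": "LOOKUP",
     "revisionHash": "485343f1a68698497946f4f70d1cfdd76d516fd8"},
    {"objectIdentifier": {"algorithmName": "AddrLine2Lookup"}, "objectType": "LOOKUP",
     "revisionHash": "f397c46a97bddacf4203e35d7a538fda4bba6b12"},
]

export_body = select_for_export(listing, ["AddrLine2Lookup"])
```

The key does not need to be selected explicitly here, because the export document includes the dependencies of the requested objects.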

Note that while export operations are in progress, import operations and attempts to use the "Generate New Encryption Key" feature will fail. Wait until all export operations have completed to perform those operations.

Error handling

If an error occurs while exporting one or more elements in the export document, the entire export will abort.

POST /import

This endpoint allows you to import a document exported from another engine. The result of import is a list of objects that were imported and whether the import was successful.

The endpoint has one required parameter, force_overwrite, which dictates how to deal with conflicting objects. It also accepts an optional HTTP header, passphrase. If the passphrase is provided, the engine will attempt to decrypt the document using it.

Note that only one import operation can be in progress at a time. When an import is in progress, both calls to the export endpoint and attempts to use the "Generate New Encryption Key" feature will fail. Wait until the import is complete to perform those operations.

Import Logic Flow Diagram


Error Handling

Export documents often have multiple objects to be imported at once. For example, when exporting a lookup algorithm, you will export both the algorithm and encryption key since lookup algorithms depend on the Masking Engine’s encryption key.

The engine will import one object at a time. If there is an error importing an object, the import process will abort. However, any objects that were imported before the error are left as-is. For example, say you are importing objects A, B, and C. The engine successfully imports A. During the import of B, the engine encounters an error. The import will report that A was successfully imported, B failed to import, and C was skipped.
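Because a partially completed import leaves earlier objects in place, a client should inspect every entry of the import response rather than only the first. The Python sketch below splits a parsed response into imported and not-imported objects. "SUCCESS" is the only importStatus value shown in this document, so the strings "FAILED" and "SKIPPED" in the sample data are placeholders for illustration, and the function name summarize_import is hypothetical.

```python
def summarize_import(results):
    """Split a POST /import response into imported and not-imported entries."""
    imported, not_imported = [], []
    for entry in results:
        # Assumption: any status other than "SUCCESS" means the object was
        # not imported (it either failed or was skipped after a failure).
        bucket = imported if entry.get("importStatus") == "SUCCESS" else not_imported
        bucket.append(entry)
    return imported, not_imported


# Sample response for the A, B, C scenario described above.
# "FAILED" and "SKIPPED" are placeholder status strings, not documented values.
results = [
    {"objectIdentifier": {"algorithmName": "A"}, "objectType": "LOOKUP",
     "importStatus": "SUCCESS"},
    {"objectIdentifier": {"algorithmName": "B"}, "objectType": "LOOKUP",
     "importStatus": "FAILED"},
    {"objectIdentifier": {"algorithmName": "C"}, "objectType": "LOOKUP",
     "importStatus": "SKIPPED"},
]

imported, not_imported = summarize_import(results)
for entry in not_imported:
    print("needs attention:", entry["objectIdentifier"], entry["importStatus"])
```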

Notes

Specifying force_overwrite=false will always fail to import the encryption key unless the encryption key has been previously synchronized using force_overwrite=true.

Specifying force_overwrite=true will always overwrite the engine’s encryption key with the contents of the encryption key in the export document.

Example User Workflow

The following steps provide an example of how to export one or more objects from Masking Engine A and import them into Masking Engine B.

  1. On Masking Engine A, get the Authorization from the /login API.

    POST http://masking-engine-A:8282/masking/api/login
    
    HEADER
    Content-Type : application/json
    
    BODY (raw)
    {"username": "user123", "password": "pw123" }
    
    EXPECTED RESULT
    { "Authorization": "dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a" }
  2. On Masking Engine A, call GET /syncable-objects to get a list of syncable objects.

    GET http://masking-engine-A:8282/masking/api/syncable-objects
    
    HEADER
    Authorization : dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a (whatever you get from login)
    Content-Type : application/json
    
    EXPECTED RESULT
    [
        {
            "objectIdentifier": {
                "keyId": "global"
            },
            "objectType": "KEY",
            "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"
        },
        {
            "objectIdentifier": {
                "algorithmName": "AccNoLookup"
            },
            "objectType": "LOOKUP",
            "revisionHash": "485343f1a68698497946f4f70d1cfdd76d516fd8"
        },
        {
            "objectIdentifier": {
                "algorithmName": "AddrLine2Lookup"
            },
            "objectType": "LOOKUP",
            "revisionHash": "f397c46a97bddacf4203e35d7a538fda4bba6b12"
        }
    …
    ]
  3. On Masking Engine A, call /export on objects you want to export.

    POST http://masking-engine-A:8282/masking/api/export
    
    HEADER
    Authorization : dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a (whatever you get from login)
    Content-Type : application/json
    passphrase (Optional): password to encrypt the export document
    
    BODY
    [
       {
          "objectIdentifier": {
            "algorithmName": "msuh_test_demo"
          },
          "objectType": "LOOKUP",
          "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317" (Optional)
        }
    ]
    
    EXPECTED RESULT
    {
        "exportResponseMetadata": {
            "exportHost": "masking-engine-A:8282",
            "exportDate": "Tue Jun 13 14:58:25 UTC 2017",
            "exportedObjectList": [
                {
                    "objectIdentifier": {
                        "keyId": "global"
                    },
                    "objectType": "KEY",
                    "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"
                },
                {
                    "objectIdentifier": {
                        "algorithmName": "msuh_test_demo"
                    },
                    "objectType": "LOOKUP",
                    "revisionHash": "f397c46a97bddacf4203e35d7a538fda4bba6b12"
                }
            ]
        },
        "blob": "ChgyMDE3LTA2LTEzVDE0OjU4OjI1LjMwMloawgEKPwgHEjsKJ3R5cGUuZ29vZ2xlYXBpcy5jb20vQWxnb3JpdGhtSWRlbnRpZmllchIQCg5tc3VoX3Rlc3RfZGVtbxp/CiR0eXBlLmdvb2dsZWFwaXMuY29tL1Rva2VuaXphdGlvbkRhdGESVwpKCg5tc3VoX3Rlc3RfZGVtbxIOTVNVSF9URVNUX0RFTU8aCjIwMTctMDYtMTIiDWRlbHBoaXhfYWRtaW4qDWFzZGxrZmphbHNkamYaAzI3NyIBNSoBNxqCAQo7CAkSNwordHlwZS5nb29nbGVhcGlzLmNvbS9FbmNyeXB0aW9uS2V5SWRlbnRpZmllchIICgZnbG9iYWwaQwoldHlwZS5nb29nbGVhcGlzLmNvbS9FbmNyeXB0aW9uS2V5RGF0YRIaChhvd203U2JkazJWdlJkbWJ3Y0p3b2dRPT0=",
        "signature": "MCwCFHRaXz98fnhTARQq3/WWa/bZvt/aAhRCgYQjBqkxo9iA9/ohEU5ajNXQEQ==",
        "publicKey": "MIHxMIGoBgcqhkjOOAQBMIGcAkEA/KaCzo4Syrom78z3EQ5SbbB4sF7ey80etKII864WF64B81uRpH5t9jQTxeEu0ImbzRMqzVDZkVG9xD7nN1kuFwIVAJYu3cw2nLqOuyYO5rahJtk0bjjFAkBnhHGyepz0TukaScUUfbGpqvJE8FpDTWSGkx0tFCcbnjUDC3H9c9oXkGmzLik1Yw4cIGI1TQ2iCmxBblC+eUykA0QAAkEA3fAdC2zBB7zpIhPyf1c6na0I1Cp188Gcdaqk8uGZTiOUIh3FgISNlD0ZYRGAH39Uep8+KTkJJU+DB1Vsm23qZA=="
    }

    exportedObjectList returns a list of all exported objects including the dependencies

    Example:

    1. Export: lookupAlg A
      exportedObjectList: Key, lookupAlg A

    2. Export: lookupAlg A, lookupAlg B
      exportedObjectList: Key, lookupAlg A, lookupAlg B
  4. On Masking Engine B, call /import to import the exported objects.

    POST http://masking-engine-B:8282/masking/api/import?force_overwrite=true
    POST http://masking-engine-B:8282/masking/api/import?force_overwrite=false
    
    PARAMETER
    force_overwrite can either be true or false. See the discussion in /import. 
    
    HEADER
    (same as export)
    
    BODY
    (Whatever gets returned from export)
    
    EXPECTED RESULT
    [
        {
            "objectIdentifier": {
                "keyId": "global"
            },
            "objectType": "KEY",
            "importStatus": "SUCCESS"
        },
        {
            "objectIdentifier": {
                "algorithmName": "msuh_test_demo"
            },
            "objectType": "LOOKUP",
            "importStatus": "SUCCESS"
        }
    ]
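The four steps above can be scripted. The following Python sketch uses only the standard library and the URLs, headers, and bodies shown in the steps. The function names are hypothetical. It also assumes that each engine issues its own Authorization token, so it logs in to Engine B before importing. Treat it as a starting point, not as a supported client.

```python
import json
import urllib.request

API_ROOT = ":8282/masking/api"  # port and base path taken from the steps above


def build_request(host, path, body=None, token=None, passphrase=None, method="POST"):
    """Build (but do not send) a JSON request for one of the endpoints above."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = token
    if passphrase:
        headers["passphrase"] = passphrase
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}{API_ROOT}{path}", data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def login(host, username, password):
    """Step 1: obtain the Authorization token for one engine."""
    body = {"username": username, "password": password}
    return send(build_request(host, "/login", body))["Authorization"]


def sync(source, target, objects, force_overwrite, passphrase=None):
    """Steps 3 and 4: export `objects` from `source`, import into `target`.

    `source` and `target` are (host, username, password) tuples.
    """
    src_token = login(*source)
    document = send(build_request(source[0], "/export", objects, src_token, passphrase))
    dst_token = login(*target)  # assumption: tokens are not shared between engines
    flag = "true" if force_overwrite else "false"
    path = f"/import?force_overwrite={flag}"
    # The import body is whatever the export call returned, unmodified.
    return send(build_request(target[0], path, document, dst_token, passphrase))
```

A call such as sync(("masking-engine-A", "user123", "pw123"), ("masking-engine-B", "user123", "pw123"), export_body, True) would run the export and import in sequence, where export_body is a list of object identifiers in the format shown in Step 3.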

Attempting to Import an Existing Object

During the import of an object, the Delphix Masking Engine checks for the existence of the same object contents. If the engine and the document being imported contain the same content, a result of SUCCESS will be returned without repeating the work of a full import. For example, re-encrypting an engine can be very time consuming, and this should not be repeated if the encryption keys already match. If the object content matches and the Delphix Masking Engine skips the full import, it will be noted in the application log.

Below is an example log statement when an identical encryption key was imported:

2017-07-19 10:17:06,075  [http-nio-8282-exec-4] INFO  c.d.s.marshallers.SyncableMarshaller - Skipping import process for {
  "objectType": "KEY",
  "id": {
    "@type": "type.googleapis.com/EncryptionKeyIdentifier",
    "id": "global"
  }
}, due to no discrepancy between the existing and importing object

When Encryption Keys Change

If the encryption key on Delphix Masking Engine A is regenerated after the algorithm export in Step 3 above, the masking results for Delphix Masking Engine A and Delphix Masking Engine B will differ. To synchronize the results, the export in Step 3 on Delphix Masking Engine A and the import in Step 4 on Delphix Masking Engine B would need to be repeated.

Managing Encryption Keys

One important piece of data used by many masking algorithms is a shared encryption key. This key is used to encrypt data that is stored in the application repository within the masking engine. It is also used by most of the masking algorithms, such as secure lookup algorithms. Changing the key changes the output of these algorithms. For example, if the FIRST NAME algorithm masked “Joe” to “Matt,” changing the key might cause it to mask “Joe” to “George.” This allows each masking engine to have unique masking algorithm output. A user with Administrator privileges can change the key by clicking the Generate New Key button in the Admin tab.

Other actions are not allowed during the key generation process. Wait for the Generate New Key process to complete and a success dialogue to display in the user interface before performing additional actions on the Delphix Masking Engine such as running a masking job.

Delphix Masking Engine Admin Tab

Synchronizing Keys between Multiple Engines

In order for an algorithm to behave the same way across several engines, all of those engines must have the same key. Changing an engine’s key alters the behavior of all of the algorithms on that engine that use the key.

When an algorithm that requires the encryption key is exported, the export payload will include the key from that engine. When that payload is then imported into another engine, if the importing engine is not already using the key in the payload, its key will be changed. Therefore, it is best to synchronize the keys of all engines at deployment time.

You may want to change the key from time to time as a security management practice. If so, change it on all of the engines at the same time. That is, generate a new key on one engine, export that key, and import it to all of the other engines in the deployment.

Keys can be imported and exported independently of algorithms. To export the key from an engine, log in to the engine through the login endpoint and then call export with the body shown below. As with all objects, you can encrypt the payload by supplying a passphrase header.

[
 {
   "objectIdentifier": {
     "keyId": "global"
   },
   "objectType": "KEY"
 }
]

The API will return a JSON payload containing an encoded form of the key that you can install on other engines through the import endpoint. Like all exported objects, it is encoded in an opaque blob.
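To keep the keys of several engines aligned, the same export document can be imported into every other engine in the deployment. The Python sketch below illustrates that loop. The callables export_fn and import_fn are hypothetical wrappers around POST /export and POST /import that you would supply, and force_overwrite is set to true because, as noted in the POST /import section, a key import with force_overwrite=false fails unless the key was synchronized previously.

```python
# Request body for exporting only the encryption key, as shown above.
KEY_EXPORT_BODY = [{"objectIdentifier": {"keyId": "global"}, "objectType": "KEY"}]


def distribute_key(export_fn, import_fn, target_hosts):
    """Export the key once and import the same document into every target.

    export_fn(body) -> export document returned by POST /export
    import_fn(host, document, force_overwrite) -> list returned by POST /import
    Both callables are hypothetical and must be provided by the caller.
    """
    document = export_fn(KEY_EXPORT_BODY)
    results = {}
    for host in target_hosts:
        # force_overwrite=True replaces the target engine's key with the
        # key in the export document.
        results[host] = import_fn(host, document, True)
    return results
```

Passing the HTTP calls in as functions keeps the loop independent of any particular client library and makes it easy to dry-run against stubs before touching real engines.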

Best Practice Guide and Example Architectures for Synchronizing

Algorithm synchronization provides a general and flexible way to move masking algorithms between engines. It is recommended that algorithms move in only one direction. That is, algorithms should be exported from one engine and imported into others but should not go in the other direction. This recommendation is primarily intended to simplify management of the engines and make it easier to keep track of which algorithms exist on which engines.

For the reasons described in the Managing Encryption Keys section above, the first step in deploying any multi-engine configuration should be to synchronize the key among all of the engines involved. This reduces concerns about unexpected key changes causing algorithm masking results to change.

Two example architectures are described below. Note that the two architectures could be combined by having multiple production engines instead of a single one.

Horizontal Scale

The first architecture aims to address the problem of horizontal scale -- that is, achieving consistent masking across a large data estate by deploying multiple masking engines. In this architecture, algorithms are authored on one engine, labeled “Control Masking Engine” in the diagram below. Those algorithms are then distributed to “Compute Masking Engines” using the algorithm synchronization APIs. The synchronized algorithms will produce the same masked output on all of the engines, thus enabling large data estates to be masked consistently.

Horizontal Scale Use Case Diagram

SDLC

The second architecture addresses the desire to author algorithms on one engine, to test and certify them on another, and finally to deploy them to a production engine. Here, algorithms are authored on the first engine, labeled “Dev Engine” in the diagram below. When the developer is satisfied, the algorithms are exported from the Dev Engine and imported to the QA Engine, where they can be tested and certified. Finally, they are exported from the QA Engine and imported to the Production Engine. As seen in the diagram below, to ensure that your masked results in Production are secure, it is advised that the Production Engine use a different key than the Dev or QA Engines.

SDLC Use Case Diagram

To maintain consistent masking results in the SDLC use case and continue producing the same masked output on a production engine:

  1. Export the production engine's encryption key.
  2. Import the newly developed algorithm from non-production. This will force a time-consuming rekey of the production engine.
  3. Import the production engine's encryption key from step #1. This will force another time-consuming rekey of the production engine.
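The three steps above can be sketched as a small Python function. The callables export_from_prod and import_to_prod are hypothetical wrappers around POST /export and POST /import on the production engine, and the use of force_overwrite=true in both imports is an assumption based on the notes in the POST /import section.

```python
# Request body for exporting only the encryption key.
KEY_BODY = [{"objectIdentifier": {"keyId": "global"}, "objectType": "KEY"}]


def promote_to_production(export_from_prod, import_to_prod, algorithm_document):
    """Import a non-production algorithm while keeping production's key.

    export_from_prod(body) -> export document from the production engine
    import_to_prod(document, force_overwrite) -> import result list
    Both callables are hypothetical and must be provided by the caller.
    """
    # Step 1: save the production key before it is overwritten.
    prod_key_document = export_from_prod(KEY_BODY)
    # Step 2: the algorithm document carries the non-production key, so this
    # import rekeys the production engine.
    algorithm_result = import_to_prod(algorithm_document, True)
    # Step 3: restore the production key, which triggers a second rekey.
    key_result = import_to_prod(prod_key_document, True)
    return algorithm_result, key_result
```

The order matters: if the production key is not exported before the algorithm import, the original key is overwritten and can no longer be restored from the engine itself.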

Algorithm Syncability

The following tables specify which algorithms are syncable between masking engines (in addition to the masking engine key).

Only users with masking admin privilege are able to export and import algorithms.

User-Defined Algorithms


Type                     | Syncable | Alternative
-------------------------+----------+-----------------------------------------------
Lookup                   | Yes      | N/A
Binary Lookup            | Yes      | N/A
Segment Mapping          | Yes      | N/A
Mapping                  | No       | None
Tokenization             | Yes      | N/A
Minmax                   | No       | Enter the same min, max and replacement values
Cleansing                | No       | Upload the same cleansing rules file
Free Text Redaction      | No       | Enter the same redaction rules
Custom Algorithm/Mapplet | Yes      | N/A

Built-In Algorithms

Algorithm API Name        | Algorithm UI Name        | Type         | Syncable | Alternative
--------------------------+--------------------------+--------------+----------+-----------------------
AccNoLookup               | ACCOUNT SL               | lookup       | Yes      | NA
AccountTK                 | ACCOUNT_TK               | tokenization | Yes      | NA
AddrLine2Lookup           | ADDRESS LINE 2 SL        | lookup       | Yes      | NA
AddrLookup                | ADDRESS LINE SL          | lookup       | Yes      | NA
BusinessLegalEntityLookup | BUSINESS LEGAL ENTITY SL | lookup       | Yes      | NA
CommentLookup             | COMMENT SL               | lookup       | Yes      | NA
CreditCard                | CREDIT CARD              | calculated   | No       | None
DateShiftDiscrete         | DATE SHIFT(DISCRETE)     | calculated   | No       | Sync the EncryptionKey
DateShiftFixed            | DATE SHIFT(FIXED)        | calculated   | No       | Nothing to synchronize
DateShiftVariable         | DATE SHIFT(VARIABLE)     | calculated   | No       | None
DrivingLicenseNoLookup    | DR LICENSE SL            | lookup       | Yes      | NA
DummyHospitalNameLookup   | DUMMY_HOSPITAL_NAME_SL   | lookup       | Yes      | NA
EmailLookup               | EMAIL SL                 | lookup       | Yes      | NA
FirstNameLookup           | FIRST NAME SL            | lookup       | Yes      | NA
FullNMLookup              | FULL_NM_SL               | lookup       | Yes      | NA
LastNameLookup            | LAST NAME SL             | lookup       | Yes      | NA
LastCommaFirstLookup      | LAST_COMMA_FIRST_SL      | lookup       | Yes      | NA
NameTK                    | NAME_TK                  | tokenization | Yes      | NA
NullValueLookup           | NULL SL                  | lookup       | Yes      | NA
TelephoneNoLookup         | PHONE SL                 | lookup       | Yes      | NA
RandomValueLookup         | RANDOM_VALUE_SL          | lookup       | Yes      | NA
SchoolNameLookup          | SCHOOL NAME SL           | lookup       | Yes      | NA
SecureShuffle             | SECURE SHUFFLE           | calculated   | No       | None
SsnTK                     | SSN_TK                   | tokenization | Yes      | NA
USCountiesLookup          | US_COUNTIES_SL           | lookup       | Yes      | NA
USCitiesLookup            | USCITIES_SL              | lookup       | Yes      | NA
USstatecodesLookup        | USSTATE_CODES_SL         | lookup       | Yes      | NA
USstatesLookup            | USSTATES_SL              | lookup       | Yes      | NA
WebURLsLookup             | WEB_URLS_SL              | lookup       | Yes      | NA
RepeatFirstDigit          | ZIP+4                    | calculated   | No       | Nothing to synchronize

Custom Algorithms

Custom algorithms (mapplets) are syncable between masking engines if they are self-contained in the mapplet implementation file. Any other dependencies outside the implementation file, including the masking encryption key, will not be exported from one masking engine and imported into another unless you explicitly manage them. You can manage dependencies on the masking engine encryption key by explicitly requesting the export of the encryption key along with the custom algorithm. Other dependencies, such as data on local file systems or databases (including MDS), must be manually copied from one Delphix Masking Engine to another.
