Introduction and Use Cases
- Software Development Life Cycle (SDLC)
- Distributed Execution
For both of these use cases, you will need to be able to move various objects between masking engines. These objects may include the following:
- Algorithms
- Masking Jobs
- Connectors
- Rulesets
- Inventory
- Domains
- Profile Expressions
- Profile Sets
You can move a subset of these objects between engines using the Masking V5 APIs.
See the following sections for instructions.
Software Development Life Cycle (SDLC)
Using an SDLC process often requires setting up multiple masking engines, each for a different part of the cycle (Development, QA, Production).
Distributed Execution
For many organizations, the size of the profiling and masking workloads requires more than one production masking engine, or the data sources are located in different administrative or geographic areas, which makes multiple engines more appropriate. These masking engines can be identical in configuration or be partially equivalent depending on the organization's needs.
Concepts
Syncable Object
Syncable objects are external representations of objects within the Delphix Masking Engine that can be exported from one engine and imported into another. The Delphix Masking Engine supports the concept of EngineSync, which is the ability to coordinate the use of masking algorithms across multiple Delphix Engines. EngineSync currently supports exporting a subset of algorithms and the encryption key as syncable objects.
Dependencies
Most objects within the Delphix Masking Engine are compositional. In order to properly capture the behavior of a syncable object, you must export its dependencies along with the object itself. For example, when exporting a lookup algorithm, you must also export the encryption key to capture the behavior of the algorithm properly.
Object Identifiers and Types
EngineSync uses object identifiers to name unique objects within the engine. The following object types are currently supported:
- KEY
- Certain algorithms:
  - BINARYLOOKUP - Refers to Binary Lookup Algorithms.
  - LOOKUP - Secure lookup is the most commonly used type of algorithm. When this algorithm replaces real, sensitive data with fictional data, it is possible that it will create repeating data patterns, known as “collisions.” For example, the names “Tom” and “Peter” could both be masked as “Matt.” Because names and addresses naturally recur in real data, this mimics an actual data set.
  - SEGMENT - Segment mapping algorithms produce no overlaps or repetitions in the masked data. They let you create unique masked values by dividing a target value into separate segments and masking each segment individually.
  - TOKENIZATION - A tokenization algorithm is the only type of algorithm that allows you to reverse its masking. For example, you can use a tokenization algorithm to mask data before you send it to an external vendor for analysis.
  - MAPPLET - Refers to Custom algorithms. Custom algorithms (mapplets) are syncable between masking engines if they are self-contained in the mapplet implementation file.
Object Revision Tracking
Every syncable object in the Delphix Masking Engine has a revision hash associated with it. The revision hash is provided in calls to GET /syncable-objects and POST /export endpoints. The revision hash field provides a way to detect when an object has changed and thus, in some workflows, the object might need to be resynchronized. The revision hash of an object will change whenever the object itself changes, or whenever one of its dependent objects changes.
For example, if the object is a secure lookup algorithm, the revision hash will change if the algorithm itself is updated. It will also change if the encryption key, which the algorithm depends on, is changed.
Revision hashes are generated randomly when objects are created and are preserved when an object is exported from one engine and imported into another. Therefore, if two objects have the same revision hash, the objects are identical. The converse does not hold: two independently created objects can have different revision hashes yet otherwise be identical.
For example, it is possible to manually create secure lookup algorithms on two engines that behave identically (assuming the engines already have the same encryption key) but since they were created independently and not synchronized, their revision hashes may be different.
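As an illustration of how revision hashes can drive a resynchronization check, the following minimal sketch compares the GET /syncable-objects listings of two engines. The function names are hypothetical and the listings are assumed to be already parsed from JSON. A differing hash only signals that an object might need to be resynchronized; as noted above, independently created objects can behave identically despite different hashes.

```python
import json


def object_key(obj):
    """Build a hashable key from an object's type and identifier."""
    # Serialize the identifier with sorted keys so dict ordering does not matter.
    return (obj["objectType"], json.dumps(obj["objectIdentifier"], sort_keys=True))


def find_out_of_sync(source_objects, target_objects):
    """Return source objects that are missing on the target or whose hash differs."""
    target_hashes = {object_key(o): o["revisionHash"] for o in target_objects}
    return [
        o for o in source_objects
        if target_hashes.get(object_key(o)) != o["revisionHash"]
    ]


# Sample listings shaped like the /syncable-objects output shown in this document.
engine_a = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"},
    {"objectIdentifier": {"algorithmName": "AccNoLookup"}, "objectType": "LOOKUP",
     "revisionHash": "485343f1a68698497946f4f70d1cfdd76d516fd8"},
]
engine_b = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"},
]

for candidate in find_out_of_sync(engine_a, engine_b):
    print(candidate["objectType"], candidate["objectIdentifier"])
```

The objects returned are candidates for export from the source engine; whether to resynchronize them remains a workflow decision.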
Export Document
You can export one or more syncable objects that are listed in the /syncable-objects endpoint as a binary object called an export document. The export document will include the set of objects that you requested for export and all dependencies that are required to properly import those objects into another engine.
The export document is exported as an opaque blob. Do not edit it outside of the Delphix Masking Engine.
Export Document Encryption
You can request that the export document be encrypted using a passphrase. Once the document is encrypted with the passphrase, the engine forgets the passphrase. You will need to provide the same passphrase during import to decrypt the document.
Digital Signature
In order to detect accidental or malicious modification of the export document, each document is digitally signed. If the export document does not match its expected digital signature, a Masking Engine will not import the document.
Endpoints
For more detailed API documentation, please refer to the Masking API Cookbook.
GET /syncable-objects
This endpoint lists all objects in an engine that are syncable and can be exported. Any object that can be exported can be imported into another engine. The endpoint has no parameters. Each object is listed with its revision hash.
POST /export
This endpoint allows you to export one or more objects in batch fashion. This endpoint returns an export document and a set of metadata that describes what was exported. You are expected to specify which objects to export by copying their object identifiers from the output of the /syncable-objects endpoint.
The endpoint has a single optional header, a passphrase. If you provide the passphrase, the export document will be encrypted using it.
Note that while export operations are in progress, import operations and attempts to use the "Generate New Encryption Key" feature will fail. Wait until all export operations have completed to perform those operations.
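The request described above can be sketched as a small helper that assembles, but does not send, a POST /export call. This is an illustrative sketch only: the function name and the returned dictionary layout are hypothetical, and the base URL follows the form used in the example workflow later in this document.

```python
import json


def build_export_request(base_url, authorization, objects, passphrase=None):
    """Describe a POST /export request for entries copied from /syncable-objects."""
    # Copy only the identifier and type; the example workflow marks revisionHash as optional.
    body = [
        {"objectIdentifier": o["objectIdentifier"], "objectType": o["objectType"]}
        for o in objects
    ]
    headers = {"Authorization": authorization, "Content-Type": "application/json"}
    if passphrase is not None:
        # Optional header: when present, the export document is encrypted with it.
        headers["passphrase"] = passphrase
    return {
        "method": "POST",
        "url": base_url + "/export",
        "headers": headers,
        "body": json.dumps(body),
    }
```

Keeping request construction separate from sending makes it easy to inspect exactly which objects are being requested before anything is exported.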
Error handling
If an error occurs while exporting one or more elements in the export document, the entire export will abort.
POST /import
This endpoint allows you to import a document exported from another engine. The result of import is a list of objects that were imported and whether the import was successful.
The endpoint has one required parameter, force_overwrite, and one optional HTTP header, passphrase. The force_overwrite parameter dictates how to deal with conflicting objects. If the passphrase header is provided, the engine will attempt to decrypt the document using the specified passphrase.
Note that only one import operation can be in progress at a time. When an import is in progress, both calls to the export endpoint and attempts to use the "Generate New Encryption Key" feature will fail. Wait until the import is complete to perform those operations.
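A matching sketch for the import side is shown below. It assembles, without sending, a POST /import call with the required force_overwrite query parameter and the optional passphrase header. The function name and the returned dictionary layout are hypothetical; the request body is assumed to be the JSON document returned by the export endpoint.

```python
import json
from urllib.parse import urlencode


def build_import_request(base_url, authorization, export_document,
                         force_overwrite, passphrase=None):
    """Describe a POST /import request that carries a previously exported document."""
    # force_overwrite is required, so it is always placed in the query string.
    query = urlencode({"force_overwrite": "true" if force_overwrite else "false"})
    headers = {"Authorization": authorization, "Content-Type": "application/json"}
    if passphrase is not None:
        # Must be the same passphrase that was supplied at export time.
        headers["passphrase"] = passphrase
    return {
        "method": "POST",
        "url": f"{base_url}/import?{query}",
        "headers": headers,
        "body": json.dumps(export_document),
    }
```

Because only one import can be in progress at a time, a caller would typically issue these requests sequentially rather than in parallel.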
Import Logic Flow Diagram
Error Handling
Export documents often have multiple objects to be imported at once. For example, when exporting a lookup algorithm, you will export both the algorithm and encryption key since lookup algorithms depend on the Masking Engine’s encryption key.
The engine will import one object at a time. If there is an error importing an object, the import process will abort. However, any objects that were imported before the error are left as-is. For example, say you are importing objects A, B, and C. Import successfully imports A. During the import of B, the engine encounters an error. Import will report that A was successfully imported, B failed to import, and C was skipped.
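Since a partially completed import leaves earlier objects in place, it can help to split the import result into objects that succeeded and objects that did not. The sketch below assumes the result is the list of entries with an importStatus field shown in the example workflow. Only the SUCCESS value appears in this document; the FAILED and SKIPPED strings in the sample data are hypothetical placeholders, so the helper treats any status other than SUCCESS as not imported.

```python
def summarize_import(results):
    """Split an import result list into succeeded and not-succeeded entries."""
    succeeded = [r for r in results if r.get("importStatus") == "SUCCESS"]
    # Any other status (failed, skipped, or missing) is treated as not imported.
    not_succeeded = [r for r in results if r.get("importStatus") != "SUCCESS"]
    return succeeded, not_succeeded


# Sample mirroring the A, B, C scenario above.
# "FAILED" and "SKIPPED" are placeholder values, not confirmed API output.
sample = [
    {"objectIdentifier": {"keyId": "global"}, "objectType": "KEY",
     "importStatus": "SUCCESS"},
    {"objectIdentifier": {"algorithmName": "B"}, "objectType": "LOOKUP",
     "importStatus": "FAILED"},
    {"objectIdentifier": {"algorithmName": "C"}, "objectType": "LOOKUP",
     "importStatus": "SKIPPED"},
]

imported, pending = summarize_import(sample)
print(len(imported), "imported;", len(pending), "still to be imported")
```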
Notes
Specifying force_overwrite=false will always fail to import the encryption key unless the encryption key has been previously synchronized using force_overwrite=true.
Specifying force_overwrite=true will always overwrite the engine’s encryption key with the contents of the encryption key in the export document.
Example User Workflow
The following steps provide an example of how to export one or more objects from Masking Engine A and import them into Masking Engine B.
1. On Masking Engine A, get the Authorization from the /login API.

        POST http://masking-engine-A:8282/masking/api/login

        HEADER
        Content-Type : application/json

        BODY (raw)
        { "username": "user123", "password": "pw123" }

        EXPECTED RESULT
        { "Authorization": "dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a" }

2. On Masking Engine A, call GET /syncable-objects to get a list of syncable objects.

        GET http://masking-engine-A:8282/masking/api/syncable-objects

        HEADER
        Authorization : dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a (whatever you get from login)
        Content-Type : application/json

        EXPECTED RESULT
        [
          {
            "objectIdentifier": { "keyId": "global" },
            "objectType": "KEY",
            "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"
          },
          {
            "objectIdentifier": { "algorithmName": "AccNoLookup" },
            "objectType": "LOOKUP",
            "revisionHash": "485343f1a68698497946f4f70d1cfdd76d516fd8"
          },
          {
            "objectIdentifier": { "algorithmName": "AddrLine2Lookup" },
            "objectType": "LOOKUP",
            "revisionHash": "f397c46a97bddacf4203e35d7a538fda4bba6b12"
          }
          …
        ]

3. On Masking Engine A, call /export on objects you want to export.

        POST http://masking-engine-A:8282/masking/api/export

        HEADER
        Authorization : dc2cff8b-e20d-4e28-8b7e-5d7c4aad0e2a (whatever you get from login)
        Content-Type : application/json
        passphrase (Optional): password to encrypt the export document

        BODY
        [
          {
            "objectIdentifier": { "algorithmName": "msuh_test_demo" },
            "objectType": "LOOKUP",
            "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317" (Optional)
          }
        ]

        EXPECTED RESULT
        {
          "exportResponseMetadata": {
            "exportHost": "masking-engine-A:8282",
            "exportDate": "Tue Jun 13 14:58:25 UTC 2017",
            "exportedObjectList": [
              {
                "objectIdentifier": { "keyId": "global" },
                "objectType": "KEY",
                "revisionHash": "68eaffef400e426520a5fcbb683419db3be53317"
              },
              {
                "objectIdentifier": { "algorithmName": "msuh_test_demo" },
                "objectType": "LOOKUP",
                "revisionHash": "f397c46a97bddacf4203e35d7a538fda4bba6b12"
              }
            ]
          },
          "blob": "ChgyMDE3LTA2LTEzVDE0OjU4OjI1LjMwMloawgEKPwgHEjsKJ3R5cGUuZ29vZ2xlYXBpcy5jb20vQWxnb3JpdGhtSWRlbnRpZmllchIQCg5tc3VoX3Rlc3RfZGVtbxp/CiR0eXBlLmdvb2dsZWFwaXMuY29tL1Rva2VuaXphdGlvbkRhdGESVwpKCg5tc3VoX3Rlc3RfZGVtbxIOTVNVSF9URVNUX0RFTU8aCjIwMTctMDYtMTIiDWRlbHBoaXhfYWRtaW4qDWFzZGxrZmphbHNkamYaAzI3NyIBNSoBNxqCAQo7CAkSNwordHlwZS5nb29nbGVhcGlzLmNvbS9FbmNyeXB0aW9uS2V5SWRlbnRpZmllchIICgZnbG9iYWwaQwoldHlwZS5nb29nbGVhcGlzLmNvbS9FbmNyeXB0aW9uS2V5RGF0YRIaChhvd203U2JkazJWdlJkbWJ3Y0p3b2dRPT0=",
          "signature": "MCwCFHRaXz98fnhTARQq3/WWa/bZvt/aAhRCgYQjBqkxo9iA9/ohEU5ajNXQEQ==",
          "publicKey": "MIHxMIGoBgcqhkjOOAQBMIGcAkEA/KaCzo4Syrom78z3EQ5SbbB4sF7ey80etKII864WF64B81uRpH5t9jQTxeEu0ImbzRMqzVDZkVG9xD7nN1kuFwIVAJYu3cw2nLqOuyYO5rahJtk0bjjFAkBnhHGyepz0TukaScUUfbGpqvJE8FpDTWSGkx0tFCcbnjUDC3H9c9oXkGmzLik1Yw4cIGI1TQ2iCmxBblC+eUykA0QAAkEA3fAdC2zBB7zpIhPyf1c6na0I1Cp188Gcdaqk8uGZTiOUIh3FgISNlD0ZYRGAH39Uep8+KTkJJU+DB1Vsm23qZA=="
        }

   exportedObjectList returns a list of all exported objects, including the dependencies. Example:
   - Export: lookupAlg A
     exportedObjectList: Key, lookupAlg A
   - Export: lookupAlg A, lookupAlg B
     exportedObjectList: Key, lookupAlg A, lookupAlg B

4. On Masking Engine B, call /import to import the exported objects.

        POST http://masking-engine-B:8282/masking/api/import?force_overwrite=true
        or
        POST http://masking-engine-B:8282/masking/api/import?force_overwrite=false

        PARAMETER
        force_overwrite can either be true or false. See the discussion in /import.

        HEADER
        (same as export)

        BODY
        (Whatever gets returned from export)

        EXPECTED RESULT
        [
          {
            "objectIdentifier": { "keyId": "global" },
            "objectType": "KEY",
            "importStatus": "SUCCESS"
          },
          {
            "objectIdentifier": { "algorithmName": "msuh_test_demo" },
            "objectType": "LOOKUP",
            "importStatus": "SUCCESS"
          }
        ]
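The four calls in this workflow could be scripted roughly as follows. This is a sketch rather than a supported client: the helper names are hypothetical, the HTTP transport is passed in as a replaceable function so the logic can be exercised without a live engine, and it assumes that each engine issues its own Authorization value through its own /login call.

```python
import json
import urllib.request


def http_json(method, url, headers, body=None):
    """Send one JSON request with urllib and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def login(base_url, username, password, send=http_json):
    """Call /login and return the Authorization value from the response."""
    reply = send("POST", base_url + "/login",
                 {"Content-Type": "application/json"},
                 {"username": username, "password": password})
    return reply["Authorization"]


def sync_algorithms(source_url, target_url, source_token, target_token,
                    algorithm_names, force_overwrite, send=http_json):
    """List objects on the source, export the named algorithms, import on the target."""
    def headers(token):
        return {"Authorization": token, "Content-Type": "application/json"}

    # List everything that is syncable on the source engine.
    listing = send("GET", source_url + "/syncable-objects", headers(source_token))

    # Select the requested algorithms; dependencies are added by the export itself.
    selected = [
        {"objectIdentifier": o["objectIdentifier"], "objectType": o["objectType"]}
        for o in listing
        if o["objectIdentifier"].get("algorithmName") in algorithm_names
    ]

    # Export from the source; the whole response is the import body.
    export_document = send("POST", source_url + "/export",
                           headers(source_token), selected)

    # Import on the target with the required force_overwrite parameter.
    flag = "true" if force_overwrite else "false"
    return send("POST", f"{target_url}/import?force_overwrite={flag}",
                headers(target_token), export_document)
```

A caller would obtain a token from each engine with login and then pass both tokens, the algorithm names, and the desired force_overwrite value to sync_algorithms. The passphrase header is omitted here for brevity.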
Attempting to Import an Existing Object
During the import of an object, the Delphix Masking Engine checks for the existence of the same object contents. If the engine and the document being imported contain the same content, a result of SUCCESS will be returned without repeating the work of a full import. For example, re-encrypting an engine can be very time consuming, and this should not be repeated if the encryption keys already match. If the object content matches and the Delphix Masking Engine skips the full import, it will be noted in the application log.
Below is an example log statement when an identical encryption key was imported:
2017-07-19 10:17:06,075 [http-nio-8282-exec-4] INFO c.d.s.marshallers.SyncableMarshaller - Skipping import process for { "objectType": "KEY", "id": { "@type": "type.googleapis.com/EncryptionKeyIdentifier", "id": "global" } }, due to no discrepancy between the existing and importing object
When Encryption Keys Change
If the encryption key on Delphix Masking Engine A is regenerated after the algorithm export in Step 3 above, the masking results for Delphix Masking Engine A and Delphix Masking Engine B will differ. To synchronize the results, the export in Step 3 on Delphix Masking Engine A and the import in Step 4 on Delphix Masking Engine B would need to be repeated.
Managing Encryption Keys
One important piece of data used by many masking algorithms is a shared encryption key. This key is used to encrypt data that is stored in the application repository within the masking engine. It is also used by most of the masking algorithms, such as secure lookup algorithms. Changing the key changes the output of these algorithms. For example, if the FIRST NAME algorithm masked “Joe” to “Matt,” changing the key might cause it to mask “Joe” to “George”. This allows each masking engine to have unique masking algorithm output. A user with Administrator privileges can change the key by clicking the Generate New Key button in the Admin tab.
Other actions are not allowed during the key generation process. Wait for the Generate New Key process to complete and a success dialogue to display in the user interface before performing additional actions on the Delphix Masking Engine such as running a masking job.
Delphix Masking Engine Admin Tab
Synchronizing Keys between Multiple Engines
In order for an algorithm to behave the same way across several engines, all of those engines must have the same key. Changing an engine’s key alters the behavior of all of the algorithms on that engine that use the key.
When an algorithm that requires the encryption key is exported, the export payload will include the key from that engine. When that payload is then imported into another engine, if the importing engine is not already using the key in the payload, its key will be changed. Therefore, it is best to synchronize the keys of all engines at deployment time.
You may want to change the key from time to time as a security management practice. If so, change it on all of the engines at the same time. That is, generate a new key on one engine, export that key, and import it to all of the other engines in the deployment.
Keys can be imported and exported independently of algorithms. To export the key from an engine, log in to the engine through the login endpoint and then call export with the body shown below. As with all objects, you can encrypt the payload by supplying a passphrase header.
[ { "objectIdentifier": { "keyId": "global" }, "objectType": "KEY" } ]
The API will return a JSON payload containing an encoded form of the key that you can install on other engines through the import endpoint. Like all exported objects, it is encoded in an opaque blob.
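The key rotation practice described above (generate a new key on one engine, export it, and import it into every other engine) might be scripted along these lines. The function names are hypothetical, and the export and import operations are passed in as callables so that the sketch stays independent of any particular HTTP client. It assumes each importer calls the import endpoint with force_overwrite=true, which always overwrites the engine's key as noted earlier.

```python
def key_export_body(key_id="global"):
    """Return the export request body that selects only the encryption key."""
    return [{"objectIdentifier": {"keyId": key_id}, "objectType": "KEY"}]


def distribute_key(export_key, importers):
    """Export the key once and import the same document into every other engine.

    export_key: callable taking the export body and returning the export document.
    importers:  mapping of engine name to a callable that imports a document.
    """
    document = export_key(key_export_body())
    # Reuse the single exported document so that all engines receive the same key.
    return {name: import_document(document)
            for name, import_document in importers.items()}
```

Exporting once and reusing the document avoids any chance of the key being regenerated between exports to different engines.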
Best Practice Guide and Example Architectures for Synchronizing
Algorithm synchronization provides a general and flexible way to move masking algorithms between engines. It is recommended that algorithms move in only one direction. That is, algorithms should be exported from one engine and imported into others but should not go in the other direction. This recommendation is primarily intended to simplify managing the engines and tracking which algorithms exist on which engines.
For the reasons described in the Managing Encryption Keys section above, the first step to deploying any multi-engine configuration should be to synchronize the key among all of the engines involved. This reduces concerns about unexpected key changes causing algorithm masking results to change.
Two example architectures are described below. Note that the two architectures could be combined by having multiple production engines instead of a single one.
Horizontal Scale
The first architecture aims to address the problem of horizontal scale -- that is, achieving consistent masking across a large data estate by deploying multiple masking engines. In this architecture, algorithms are authored on one engine, labeled “Control Masking Engine” in the diagram below. Those algorithms are then distributed to “Compute Masking Engines” using the algorithm synchronization APIs. The synchronized algorithms will produce the same masked output on all of the engines, thus enabling large data estates to be masked consistently.
Horizontal Scale Use Case Diagram
SDLC
The second architecture addresses the desire to author algorithms on one engine, to test and certify them on another, and finally to deploy them to a production engine. Here, algorithms are authored on the first engine, labeled “Dev Engine” in the diagram below. When the developer is satisfied, the algorithms are exported from the Dev Engine and imported to the QA Engine where they can be tested and certified. Finally, they are exported from the QA engine and imported to the production engine. As seen in the diagram below, to ensure that masked results in Production are secure, it is advised that the Production engine use a different key than the Dev or QA engines.
SDLC Use Case Diagram
To maintain consistent masking results in the SDLC use case and continue producing the same masked output on a production engine:
1. Export the production engine's encryption key.
2. Import the newly developed algorithm from non-production. This will force a time-consuming rekey of the production engine.
3. Import the production engine's encryption key from step #1. This will force another time-consuming rekey of the production engine.
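The order of these three operations matters, so it can be useful to capture it in a single routine. The following sketch is illustrative only: the function and parameter names are hypothetical, and the two callables are assumed to wrap the export and import endpoints of the production engine, with imports using force_overwrite=true so that the key is actually replaced.

```python
def promote_to_production(export_production_key, import_into_production,
                          algorithm_document):
    """Import a non-production algorithm while preserving the production key."""
    # Step 1: save the production key before anything can overwrite it.
    saved_key_document = export_production_key()

    # Step 2: import the algorithm; this brings the non-production key with it.
    algorithm_result = import_into_production(algorithm_document)

    # Step 3: restore the production key saved in step 1.
    key_result = import_into_production(saved_key_document)

    return algorithm_result, key_result
```

Because steps 2 and 3 each trigger a rekey, a real script would wait for each import to complete before starting the next.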
Algorithm Syncability
The following tables specify which algorithms are syncable between masking engines (in addition to the masking engine key).
Only users with masking admin privilege are able to export and import algorithms.
User-Defined Algorithms
Type | Syncable | Alternative |
---|---|---|
Lookup | Yes | N/A |
Binary Lookup | Yes | N/A |
Segment Mapping | Yes | N/A |
Mapping | No | None |
Tokenization | Yes | N/A |
Minmax | No | Enter the same min, max and replacement values |
Cleansing | No | Upload the same cleansing rules file |
Free Text Redaction | No | Enter the same redaction rules |
Custom Algorithm/Mapplet | Yes | N/A |
Built-In Algorithms
Algorithm API Name | Algorithm UI Name | Type | Syncable | Alternative |
---|---|---|---|---|
AccNoLookup | ACCOUNT SL | lookup | Yes | NA |
AccountTK | ACCOUNT_TK | tokenization | Yes | NA |
AddrLine2Lookup | ADDRESS LINE 2 SL | lookup | Yes | NA |
AddrLookup | ADDRESS LINE SL | lookup | Yes | NA |
BusinessLegalEntityLookup | BUSINESS LEGAL ENTITY SL | lookup | Yes | NA |
CommentLookup | COMMENT SL | lookup | Yes | NA |
CreditCard | CREDIT CARD | calculated | No | None |
DateShiftDiscrete | DATE SHIFT(DISCRETE) | calculated | No | Sync the EncryptionKey |
DateShiftFixed | DATE SHIFT(FIXED) | calculated | No | Nothing to synchronize |
DateShiftVariable | DATE SHIFT(VARIABLE) | calculated | No | None |
DrivingLicenseNoLookup | DR LICENSE SL | lookup | Yes | NA |
DummyHospitalNameLookup | DUMMY_HOSPITAL_NAME_SL | lookup | Yes | NA |
EmailLookup | EMAIL SL | lookup | Yes | NA |
FirstNameLookup | FIRST NAME SL | lookup | Yes | NA |
FullNMLookup | FULL_NM_SL | lookup | Yes | NA |
LastNameLookup | LAST NAME SL | lookup | Yes | NA |
LastCommaFirstLookup | LAST_COMMA_FIRST_SL | lookup | Yes | NA |
NameTK | NAME_TK | tokenization | Yes | NA |
NullValueLookup | NULL SL | lookup | Yes | NA |
TelephoneNoLookup | PHONE SL | lookup | Yes | NA |
RandomValueLookup | RANDOM_VALUE_SL | lookup | Yes | NA |
SchoolNameLookup | SCHOOL NAME SL | lookup | Yes | NA |
SecureShuffle | SECURE SHUFFLE | calculated | No | None |
SsnTK | SSN_TK | tokenization | Yes | NA |
USCountiesLookup | US_COUNTIES_SL | lookup | Yes | NA |
USCitiesLookup | USCITIES_SL | lookup | Yes | NA |
USstatecodesLookup | USSTATE_CODES_SL | lookup | Yes | NA |
USstatesLookup | USSTATES_SL | lookup | Yes | NA |
WebURLsLookup | WEB_URLS_SL | lookup | Yes | NA |
RepeatFirstDigit | ZIP+4 | calculated | No | Nothing to synchronize |
Custom Algorithms
Custom algorithms (mapplets) are syncable between masking engines if they are self-contained in the mapplet implementation file. Any other dependencies outside the implementation file, including the masking encryption key, will not be exported from one masking engine and imported into another unless you explicitly manage them. You can manage dependencies on the masking engine encryption key by explicitly requesting the export of the encryption key along with the custom algorithm. Other dependencies, such as data on local file systems or databases (including MDS), must be manually copied from one Delphix Masking Engine to another.
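Explicitly requesting the encryption key alongside a custom algorithm could look like the following export body builder. This is a sketch under assumptions: the function name is hypothetical, and it assumes that MAPPLET objects are identified by algorithmName in the same way as the LOOKUP objects shown in the example workflow. The actual identifier should be copied from the /syncable-objects output.

```python
def custom_algorithm_export_body(algorithm_name, include_key=True):
    """Build an export body for a custom algorithm, optionally adding the key."""
    body = [
        # Assumed identifier shape; copy the real one from /syncable-objects.
        {"objectIdentifier": {"algorithmName": algorithm_name},
         "objectType": "MAPPLET"},
    ]
    if include_key:
        # The key is not exported automatically for custom algorithms,
        # so it is requested explicitly here.
        body.append({"objectIdentifier": {"keyId": "global"}, "objectType": "KEY"})
    return body
```

Dependencies outside the engine's syncable objects, such as files or database content, are not covered by this body and still need to be copied manually.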