This page provides definitions of major concepts.
Compatible Platforms with which to Use Delphix
System | Explanation |
---|---|
Adaptive Server Enterprise (ASE) | Proprietary RDBMS from SAP. Sometimes known by its former name, Sybase ASE. For more information on SAP ASE and Delphix, see SAP ASE Environments and Data Sources. |
Customer Relationship Management (CRM) system | A database of customer data that is tied into applications which deliver information about the data. |
DB2 | Proprietary RDBMS from IBM. For more information on DB2 and Delphix, see DB2 on Delphix: An Overview. |
EBS | Oracle E-Business Suite. For more information on EBS and Delphix, see Oracle E-Business Suite and Delphix Conceptual Overview. |
MySQL | An open source RDBMS from Oracle. It runs on both Linux and Microsoft Windows operating systems. |
Oracle Database Server | Proprietary RDBMS from Oracle that runs on various operating systems such as Linux and Microsoft Windows. There are several editions including Standard and Enterprise. |
SQL Server | Proprietary RDBMS product from Microsoft that runs on Microsoft Windows operating systems. |
Terms for Using the Delphix Engine
Ways to Access the Delphix Engine
Term | Explanation |
---|---|
Application Programming Interface (API) | A set of tools and protocols that lets you access a Delphix Engine programmatically through software calls. |
Command Line Interface (CLI) | A method by which you can access a Delphix Engine using SSH, which supports input of text commands. |
Graphical User Interface (GUI) | The browser-based method to direct the operations of a Delphix Engine. |
Delphix Concepts
Term | Explanation |
---|---|
Automation Engine | A generic name for third-party tools that call Delphix APIs in response to external events. |
Blocks or data blocks | The mapped subsets of the entire data set, which can be individually addressed, refreshed, compressed, and accessed. |
Data source | The system, typically an RDBMS, that feeds information to the Delphix Engine, and from which virtual objects are derived. A data source can be a database, an application, or a set of unstructured files; a VDB can also serve as a data source for other Delphix Engines. Not to be confused with a dSource, which is a virtualized, compressed duplicate of this database. (See below.) |
DataVisor | Orchestrates tasks such as synchronization, synthesis and recording of changes, data movement (across copies), and replication. One of the three tiers of the Delphix technology stack, which also includes the Delphix Filesystem (DxFS) and self-service management. |
Delphix Connector | A service that runs on a Windows proxy host and enables communication between the Delphix Engine and the Windows target environment where it is installed. |
Delphix Engine | A virtual machine containing a Delphix installation. Leverages existing SAN storage to store compressed copies of source data. Supplies data to remote servers over NFS and iSCSI. |
dSource | An object on the Delphix Engine that outlines how data should be imported from a data source and managed on the Delphix Engine. It is the virtualized representation of a database that the Delphix Engine creates. As a virtualized representation, it cannot be managed, manipulated, or examined by database tools. Because dSources are simply source data, you must use a VDB in order to distribute, clone, or test the data being pulled in. VDBs can also later be refreshed from the dSource's data as it is pulled in. |
Filesystem (DxFS) | The filesystem used by the Delphix Engine. Stores and manages application data and is responsible for optimization of storage and performance. One of the three tiers of the Delphix technology stack, along with DataVisor and self-service management. |
Delphix OS (DxOS) | The underlying operating system running on a Delphix Engine |
Domain | Collective name for data objects, such as dSources, virtual databases (VDBs), users, groups, and related policies and resources. |
Environment | An umbrella term for a host or a cluster. In order to mask or provision databases and files within Delphix, you first need to create an Environment in which Delphix will store the connection information and masking and provisioning rules for those data stores. An environment can contain multiple database connections and multiple file connections. |
Hooks | Delphix initiated calls to external scripts used to automate tasks, primarily on VDBs and dSources. |
Host | The physical or logical machine that contains database instances. A host can be distinguished from an environment because the host has a physical reference point in its IP address. For example, you can specify a host (by referring to its host name or IP address) where an environment is located. |
HostChecker | A standalone program which validates that host machines are configured correctly before the Delphix Engine uses them as data sources and provision targets. HostChecker is a Delphix script that you should run before adding any environment. It is available to download from the HostChecker subfolder at download.delphix.com. |
LogSync | A Delphix Engine feature that enables the ingestion and retention of a more granular (log-based) source TimeFlow, at the cost of additional storage. The granular TimeFlow allows a VDB to be provisioned or refreshed from a point in time. |
Object Groups | Arbitrary collections of dSources or VDBs used for organization. (User groups are not supported in the current version.) |
Replicas | Copies of the source Delphix Engine information on the target Delphix Engine, which can include objects such as dSources, VDBs, and vFiles. Replicas preserve object relationships and naming conventions. |
Replication | You can replicate data objects between Delphix Engines. Replication consists of a profile-replica pair. It is configured on the source Delphix Engine and copies a subset of dSources and VDBs to a target Delphix Engine. The source engine then sends incremental updates manually or according to a schedule. In addition, you can provision VDBs from replicated objects, allowing for geographical distribution of data and remote provisioning. |
Replication profile | Replication on the source. Formerly called "Replication spec." |
Snapshots | Snapshots represent the state of a VDB at a specific moment in time. They accumulate over time or are generated by your input. Snapshots appear as cards in the TimeFlow section of a VDB or vFiles, allowing you to choose a point in time from which to provision. If you have LogSync enabled, you can provision from a point in time between the snapshot cards if you also have the archive logs available. |
SnapSync | The standard process for importing data from a linked source into the Delphix Engine. An initial SnapSync is performed to create a copy of data on the Delphix Engine. Incremental SnapSyncs are performed to update the copy of data on the Delphix Engine. |
Source database | The original (sometimes physical) database that is usually the production database at a site, although it could be any database that the user designates as a source. Delphix creates a dSource from the source database. |
Source environment | An environment from which the Delphix Engine can capture data. |
Staging environment | An environment suitable for facilitating resource-intensive portions of the linking process and SnapSync. |
Target environment | A host (or cluster) on which the Delphix Engine will create VDBs. |
TimeFlow | The collection of snapshots created by SnapSync policies or, in the case of Microsoft SQL Server, the pre-provisioning process. When you provision a VDB, you pick a point in the TimeFlow from which to provision. |
Unstructured files | Data stored in a filesystem that is NOT usually accessed by a DBMS or similar software. Unstructured files can consist of anything from a simple directory to the root of a complex application like Oracle E-Business Suite. They are a dataset that is treated as simply a directory tree full of files. Like with other data types, you can configure a dSource to sync periodically with a set of unstructured files external to the Delphix Engine. Virtualized unstructured files are called vFiles (see below). |
Validated Sync | The process that runs on a staging database within a staging environment, and which executes either before a snapshot is taken (SQL Server) or after the snapshot is taken (Oracle). |
vFiles | Virtual data files. A virtual copy of data files created and managed by Delphix. Virtual data files are fully functional read/write copies of the original data files source. They may be managed by an AppData toolkit. You can mount vFiles across one target environment or many. |
vFiles (Empty) | Creating an empty vFiles places an initially-empty mount on target environments. You can then create data directly on Delphix. This is useful when you have no existing files to copy into the Delphix Engine, but you do have files which you will generate, track, and copy with vFiles. For more information, see Creating Empty vFiles from the Delphix Engine. |
Virtual dataset | Comprehensive term that includes VDBs and vFiles. |
V2P | Virtual to Physical. This refers to the process of moving a VDB to a physical database, for example in a disaster recovery situation. |
VDB (virtual database) | A database provisioned from either a dSource or another VDB which is a full read/write copy of the source data. A VDB is created and managed by the Delphix Engine. |
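
The relationship between snapshots, LogSync, and provisioning points described in the table above can be illustrated with a small sketch. This is a toy model, not Delphix code: the function `can_provision` and its parameters are hypothetical, and it assumes that a point between snapshots is usable only when retained logs cover it.

```python
from datetime import datetime
from typing import Optional, Sequence, Tuple


def can_provision(
    point: datetime,
    snapshots: Sequence[datetime],
    log_range: Optional[Tuple[datetime, datetime]] = None,
) -> bool:
    """Return True if `point` is a valid provisioning point in this toy TimeFlow.

    snapshots: times of the snapshot cards.
    log_range: (start, end) of retained logs when LogSync is enabled, else None.
    """
    if point in snapshots:
        return True  # Snapshots are always valid provisioning points.
    if log_range is None or not snapshots:
        return False  # Without LogSync, only snapshots can be chosen.
    start, end = log_range
    # Assumption: a point is usable when it follows the earliest snapshot
    # and the retained logs cover it.
    return min(snapshots) <= point and start <= point <= end
```

With snapshots at midnight on two consecutive days, a request for noon on the first day would be rejected without a log range and accepted when the log range spans both snapshots.
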
Data Operations
The terms below describe actions you can perform on a Delphix Engine.
Term | Explanation |
---|---|
Linking | The process of establishing a relationship between a data source and the Delphix Engine. After linking a data source, the Delphix Engine can import data periodically and manage it as it evolves over time. In the GUI, synonymous with "Add dSource." |
Masking | Masking replaces sensitive data with fictitious data, which you can then move out of your production environment and into non-prod environments. It provides realistic data with which to work while reducing security risks. For more details about masking, see Masking Terms below. |
Migrating a VDB | Moving a VDB to a new target environment. |
Provision | Create a new physical or virtual database. |
Refresh | Refreshing a VDB re-provisions it from the dSource. As with the normal provisioning process, you can choose to refresh the VDB from a snapshot or a specific point in time. Refreshing a VDB deletes any changes that have been made to it over time; you are essentially resetting it to the state you select during the refresh process. Although the "lost" history of the VDB is not visible in the UI, it is still stored in the Delphix Engine and is available via the CLI. Refreshing is a more expensive process than a rewind because during the refresh, the parent TimeFlow is made available so that the user can choose a point in time (using a LogSync point) or a snapshot. |
Rewind | Rewinding a VDB rolls it back to a previous point in its TimeFlow and re-provisions the VDB. The VDB will no longer contain changes after the rewind point. Although the VDB no longer contains changes after the rewind point, the rolled over Snapshots and TimeFlow still remain in Delphix and are accessible through the Command Line Interface (CLI). For instructions on how to use these snapshots to refresh a VDB to one of its later states after it has been rewound, see the topic CLI Cookbook: Rolling Forward a VDB. |
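
The difference between Refresh and Rewind can be sketched with a toy model. Everything here is hypothetical (the `ToyVDB` class, its fields, and the string "states"); it only illustrates that a rewind moves along the VDB's own TimeFlow, a refresh draws from the parent TimeFlow, and in both cases the hidden history is retained rather than destroyed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVDB:
    """Toy model only; not a Delphix object."""

    parent_timeflow: dict  # label -> data state captured from the dSource
    state: str             # current contents of the VDB
    own_timeflow: list = field(default_factory=list)  # (label, state) pairs
    retained: list = field(default_factory=list)      # history hidden from the UI

    def snapshot(self, label: str) -> None:
        """Record the current state on the VDB's own TimeFlow."""
        self.own_timeflow.append((label, self.state))

    def rewind(self, label: str) -> None:
        """Roll back to an earlier point on the VDB's own TimeFlow."""
        index = [name for name, _ in self.own_timeflow].index(label)
        # Later points are hidden but kept.
        self.retained.extend(self.own_timeflow[index + 1:])
        self.own_timeflow = self.own_timeflow[: index + 1]
        self.state = self.own_timeflow[index][1]

    def refresh(self, label: str) -> None:
        """Re-provision from a point on the parent (dSource) TimeFlow."""
        self.retained.extend(self.own_timeflow)
        self.own_timeflow = []
        self.state = self.parent_timeflow[label]
```

In this model, rewinding to snapshot `s1` discards later edits from view, while refreshing to the parent's `tue` point replaces the VDB contents entirely with newer source data.
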
Users and Privileges
Object | User Privileges | Group Privileges |
---|---|---|
Reader | Can access statistics on the dSource, VDB, or snapshot such as usage, history, and space consumption | Can access statistics on all dSources, VDBs, or snapshots in the group such as usage, history, and space consumption |
Provisioner | | |
Owner | | |
Data operator | | |
delphix_admin | Manages all data objects: dSources, virtual databases (VDBs), users, groups, and related policies and resources, all collectively referred to as the Delphix Engine "domain." The delphix_admin user manages the Delphix Engine domain using either the browser-based Delphix Admin application or the Command Line Interface (CLI). | |
sysadmin user | Can perform typical system administration duties such as: modifying NTP, SNMP, SMTP settings; managing storage; downloading support logs for the Delphix Engine; and performing upgrades and patches. The sysadmin user launches the initial Server Setup configuration application and has access to the Command Line Interface (CLI). | |
Types of Notification
Type | Notification |
---|---|
Event | Completion of some action in the Delphix Engine. |
Alert | Caused by a single event on a Delphix Engine. Also known as a System Event, and viewable through the System Event Viewer. Alert levels: Informational, Warning, Critical. |
Fault | A persistent event on a Delphix Engine that remains until the issue is resolved. The fault may be marked resolved automatically or require that it be resolved manually. System faults describe states and configurations that may negatively impact the functionality of the Delphix Engine and which can only be resolved through active user intervention. Examples: Delphix Engine storage failure; communication failures between the Delphix Engine and a source or target environment/host. Fault levels: Warning, Critical. |
Useful Metrics for Monitoring the Impact of your Delphix Engine
Metric | Explanation |
---|---|
Capacity metrics | Common metrics for a host include CPU, RAM, disk, and network utilization. In addition to the utilization of these resources, response times (latency) are also critical, especially for disk and network. |
Consolidation ratio | The amount of space that dSources and VDBs occupy compared to the amount that would be occupied by a traditional physical database |
Granularity | |
Retention ratio | |
RPO (Recovery Point Objective) | The acceptable amount of data that can be lost in the event of a failure. For example, if backups are taken once a day, then at most 24 hours of data will be lost if the system fails immediately before a regularly scheduled backup. |
RTO (Recovery Time Objective) | The time required to restore the system to an operational state after a failure. For example, a recovery may require restoring data from a backup, followed by some number of manual steps to recreate the configuration in the new system. RTO is equivalent to the downtime experienced. |
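
The arithmetic behind the RPO example and the consolidation ratio can be written out as a short sketch. The function names are hypothetical, and the orientation of the ratio (physical size divided by the space Delphix occupies) is an assumption, since the definition above only says the two amounts are compared.

```python
def worst_case_data_loss_hours(backup_interval_hours: float) -> float:
    """A failure just before the next scheduled backup loses one full interval."""
    return backup_interval_hours


def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    """True when the worst-case loss stays within the acceptable loss (RPO)."""
    return worst_case_data_loss_hours(backup_interval_hours) <= rpo_hours


def consolidation_ratio(physical_bytes: float, delphix_bytes: float) -> float:
    """Space physical copies would need, divided by space actually occupied.

    Assumed orientation: a larger result means more space saved.
    """
    if delphix_bytes <= 0:
        raise ValueError("delphix_bytes must be positive")
    return physical_bytes / delphix_bytes
```

For daily backups, `worst_case_data_loss_hours(24)` gives the 24 hours mentioned in the RPO row, so a 4-hour RPO would not be met by that schedule.
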
Jet Stream Terms
Term | Explanation |
---|---|
Jet Stream administrator | Has full access to all report data and can configure Jet Stream. |
Bookmark | A logical reference to a point in time on a branch. You can use it as a point from which to fork new branches. It can also be the target of policies – for example, you can arrange to keep this bookmark for two years. Bookmarks are a way to mark and name a particular moment of data on a timeline. You can restore the active branch's timeline to the moment of data marked with a bookmark. You can also share bookmarks with other Jet Stream users, which allows them to restore their own active branches to the moment of data in your container. The data represented by a bookmark is protected and will not be deleted until the bookmark is deleted. |
Branches | A time-ordered collection of timelines. They are task-specific groupings you can create within a data container. A branch is used to track a logical task, and contains a timeline of the historical data for that task. As you work within your data container, you can create more branches over time to run or complete separate tasks. They represent a logical sequence of activity, separate from the underlying data lineage. This is the main concept introduced in the core platform and forms the basis of many higher-level primitives. |
Branch group / target group | A collection of multiple branches or targets that are treated as a single entity. The system can determine compatibility automatically, or a template can be used to create more complex orchestration. |
Branch timeline | A dynamic point-in-time interface for user actions within the branch. Common activities include re-setting data sources to run a test, refreshing the data container with the most current source data, and bookmarking data to share or track interesting moments of time along the branch timeline. |
Data container | Consists of one or more data sources, such as databases, application binaries, or other application data. |
Data template | Created by the Delphix administrator, data templates consist of the data sources users need in order to manage their data playground and their testing and/or development environments. Data templates serve as the parent for a set of data containers that the administrator assigns to Jet Stream users. Additionally, data templates enforce the boundaries for how data is shared. Data can only be shared directly with other users whose containers were created from the same parent data template. |
Jet Stream data user | Jet Stream data users have access to production data provided in a data container. The data container provides these users with a playground in which to work with data using the Self-Service Toolbar. |
Mission Control Terms
Term | Explanation |
---|---|
Admin user | Admin users have full access to all report data and can configure the Mission Control appliance. |
Auditor user | Auditor users can only view report data. Admin users can also assign auditor users a set of tags (arbitrary text strings) to restrict which report data they can view. There is no default auditor account. The first Delphix Administrator will need to create the auditor users and will be responsible for creating their User IDs and Passwords. |
Reports | Reports present aggregated data across all connected Delphix Engines. Interactive reports such as Storage Breakdown and History display interactive graphical representations of historical and current storage usage across all Delphix Engines you are monitoring. These visualizations of storage and disk capacity enable you to analyze and mediate storage across Delphix Engines from multiple perspectives. |
Tagging | You can tag Delphix Engines in Mission Control with a set of arbitrary text strings. You can then filter reports to show only data from Delphix Engines with a certain tag. You can also use tags to restrict auditor users so that they can only view data from Delphix Engines with that tag. |
Masking Terms
Engine Types
Term | Explanation |
---|---|
Standalone Masking Engine | This Engine is deployed as an OVA (Open Virtualization Archive) in a compatible hypervisor and contains the Masking Engine GUI. From here you can create masking jobs, mask data, and administer your Masking Engine. This Engine type is suitable for Delphix installations below Delphix 5.0. |
Combined Delphix Engine and Masking Engine | This Engine is built into your Delphix 5.0 and above installation. It contains both the Delphix Engine GUI and Masking Engine GUI, and allows tighter integration between Delphix's Data as a Service and Masking features. |
What Goes into Masking
Term | Explanation |
---|---|
Application | The IT assets (programs, data, processes) that support a business function. For example, if a bank offers payroll services to its clients, there would be an application in its IT division to support that business. |
Connector | Where the Delphix Engine stores JDBC database connection information. Builds a connection between the source database and the masking interface. |
Domain (Masking) | The domain represents the correlation between various sensitive data categories and the masking algorithm which will be applied to them |
Masking environment | Defines the scope of work in the Masking Engine. A collection of masking constructs (connectors, rule sets / inventories, and jobs) that support masking for a given application environment. In order to mask databases and files within the Delphix Engine, you first need to create an environment in which the Delphix Engine will store the connection information and masking rules for those data stores. An environment can contain multiple database connections and multiple file connections. Environments are connected to applications for informational purposes. |
In-place masking | "Mask data in place" refers to updating a database with masked data. This includes reading data from the table defined in the rule set, masking the data in the Masking Engine, and updating the tables with the masked data. |
Inventory | The Delphix Engine automatically stores the masking rules for each sensitive column in the Delphix repository database in the environment's "inventory." When you select a table to mask, its columns will appear, and you can select them for masking. Afterwards, you can edit the columns with an appropriate algorithm required for masking. |
Masked VDB | A virtual database with masked data |
On-the-fly masking | With on-the-fly masking, you specify the source of the information to be masked, and where the masked data will be loaded. On-the-fly masking is an Extract Transform Load (ETL) process. |
Profile data | A way to identify the location of Non-Public Information (NPI) or sensitive data if you are unsure of what data needs to be masked in the first place. Profiling data is not necessary when you have already identified the sensitive data you need to mask. |
Rule set | Points to a collection of tables or flat files that the Masking Engine uses for masking data. The rule set allows you to identify, select, and configure which tables you need to mask. For those tables that do not have a primary key defined, you can define a logical key with a combination of columns (or ROWID for Oracle database). |
Selective Data Distribution (SDD) | Permits the distribution of masked data between Delphix Engines. The sources received on a target Delphix Engine do not include the original parent source, thereby making the original source inaccessible from the target. |
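
The contrast between in-place and on-the-fly masking can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the Masking Engine's implementation: the `customers` table, the helper names, and the trivial `mask_name` rule are all hypothetical.

```python
import sqlite3


def mask_name(name: str) -> str:
    """Placeholder masking rule (hypothetical): keep only the first letter."""
    return name[:1] + "*" * (len(name) - 1)


def new_db(rows):
    """Create an in-memory database with a small customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


def mask_in_place(conn: sqlite3.Connection) -> None:
    """Read each row, mask it, and update the same table."""
    rows = conn.execute("SELECT id, name FROM customers").fetchall()
    conn.executemany(
        "UPDATE customers SET name = ? WHERE id = ?",
        [(mask_name(name), row_id) for row_id, name in rows],
    )
    conn.commit()


def mask_on_the_fly(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """Extract from the source, transform (mask), and load into the target."""
    rows = source.execute("SELECT id, name FROM customers").fetchall()
    target.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(row_id, mask_name(name)) for row_id, name in rows],
    )
    target.commit()
```

In the on-the-fly case the source database is only read, so the unmasked rows remain there; in the in-place case the original values are overwritten.
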
Masking Algorithms
Algorithm | Description |
---|---|
Secure Lookup | The most commonly used type of algorithm. It is easy to generate and works with different languages. When this algorithm replaces real, sensitive data with fictional data, it is possible that it will create repeating data patterns, known as “collisions.” For example, the names “Tom” and “Peter” could both be masked as “Matt.” Because names and addresses naturally recur in real data, this mimics an actual data set. However, if you want the masking engine to mask all data into unique outputs, you should use segmented mapping, described below. |
Segmented Mapping | Produces no overlaps or repetitions in the masked data. You can mask up to a maximum of 36 values using segmented mapping. You might use this method if you need columns with unique values, such as Social Security Numbers, primary key columns, or foreign key columns. You can set the algorithm to produce alphanumeric results (letters and numbers) or only numbers. |
Mapping | Allows you to state what values will replace the original data. There will be no collisions in the masked data, because it always matches the same input to the same output. For example “David” will always become “Ragu,” and “Melissa” will always become “Jasmine.” The algorithm checks whether an input has already been mapped; if so, the algorithm changes the data to its designated output. You can use a mapping algorithm on any set of values, of any length, but you must know how many values you plan to mask. NOTE: When you use a mapping algorithm, you cannot mask more than one table at a time. You must mask tables serially. |
Binary Lookup | Replaces objects that appear in object columns. For example, if a bank has an object column that stores images of checks, you can use a binary lookup algorithm to mask those images. The Delphix Engine cannot change data within images themselves, such as the names on X-rays or driver’s licenses. However, you can replace all such images with a new, fictional image. This fictional image is provided by the owner of the original data. |
Tokenization | The only type of algorithm that allows you to reverse its masking. For example, you can use a tokenization algorithm to mask data before you send it to an external vendor for analysis. The vendor can then identify accounts that need attention without having any access to the original, sensitive data. Once you have the vendor’s feedback, you can reverse the masking and take action on the appropriate accounts. Like mapping, a tokenization algorithm creates a unique token for each input such as “David” or “Melissa.” The Delphix Engine stores both the token and the original so that you can reverse masking later. |
Min Max | Values that are extremely high or low in certain categories allow viewers to infer someone’s identity, even if their name has been masked. For example, a salary of $1 suggests a company’s CEO, and some age ranges suggest higher insurance risk. You can use a min max algorithm to move all values of this kind into the midrange. |
Data Cleansing | Does not perform any masking. Instead, it standardizes varied spellings, misspellings, and abbreviations for the same name. For example, “Ariz,” “Az,” and “Arizona” can all be cleansed to “AZ.” |
Free Text Redaction | Helps you remove sensitive data that appears in free-text columns such as “Notes.” This type of algorithm requires some expertise to use, because you must set it to recognize sensitive data within a block of text. One challenge is that individual words might not be sensitive on their own, but together they can be. The algorithm uses profiler sets to determine what information it needs to mask. You can decide which expressions the algorithm uses to search for material such as addresses. For example, you can set the algorithm to look for “St,” “Cir,” “Blvd,” and other words that suggest an address. You can also use pattern matching to identify potentially sensitive information. For example, a number that takes the form 123-45-6789 is likely to be a Social Security Number. You can use a free text redaction algorithm to show or hide information by displaying either a “black list” or a “white list.” |
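
Several of the algorithm types above can be illustrated with a minimal sketch. None of this is the Delphix implementation: the replacement list, the hashing scheme, the key, the `Tokenizer` class, and the redaction pattern are all assumptions chosen to show the behavior each row describes. In particular, `min_max` interprets "moving values into the midrange" as clamping to a chosen range, which is only one possible reading.

```python
import hashlib
import re
import secrets

# Hypothetical replacement list; a real lookup file would be much larger.
FAKE_NAMES = ["Matt", "Ana", "Ravi", "Chloe"]


def secure_lookup(value: str, key: bytes = b"demo-key") -> str:
    """Pick a replacement by hashing the input; different inputs may collide."""
    digest = hashlib.sha256(key + value.encode("utf-8")).digest()
    return FAKE_NAMES[int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)]


class Tokenizer:
    """Reversible masking: stores both the token and the original value."""

    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def mask(self, value: str) -> str:
        if value not in self._to_token:
            token = secrets.token_hex(8)  # random, so it reveals nothing about value
            self._to_token[value] = token
            self._to_value[token] = value
        return self._to_token[value]

    def unmask(self, token: str) -> str:
        return self._to_value[token]


def min_max(value: float, low: float, high: float) -> float:
    """Clamp outliers into the [low, high] range."""
    return max(low, min(high, value))


# Data cleansing: standardize spellings without masking anything.
STATE_SPELLINGS = {"ariz": "AZ", "az": "AZ", "arizona": "AZ"}


def cleanse(value: str) -> str:
    return STATE_SPELLINGS.get(value.strip().rstrip(".").lower(), value)


# Free text redaction: pattern matching for values shaped like 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssn(text: str) -> str:
    return SSN_PATTERN.sub("XXX-XX-XXXX", text)
```

Because `secure_lookup` maps an unbounded set of inputs onto a short list, repeated outputs are expected, which matches the "collisions" described for Secure Lookup. The `Tokenizer` avoids them by issuing a fresh token per distinct input and keeping the reverse mapping.
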