This page provides definitions of major concepts.

Compatible Platforms with which to Use Delphix

System | Explanation
Adaptive Server Enterprise (ASE)

Proprietary RDBMS from SAP. Sometimes known by its former name, Sybase ASE.

For more information on SAP ASE and Delphix, see SAP ASE Environments and Data Sources.

Customer Relationship Management (CRM) system

A database of customer data that is tied into applications which deliver information about the data.

DB2

Proprietary RDBMS from IBM.

For more information on DB2 and Delphix, see DB2 on Delphix: An Overview.

EBS

Oracle E-Business Suite.

For more information on EBS and Delphix, see Oracle E-Business Suite and Delphix Conceptual Overview.

MySQL

An open source RDBMS from Oracle. It runs on both Linux and Microsoft Windows operating systems.

Oracle Database Server

Proprietary RDBMS from Oracle that runs on various operating systems such as Linux and Microsoft Windows. There are several editions including Standard and Enterprise.

SQL Server

Proprietary RDBMS product from Microsoft that runs on Microsoft Windows operating systems.

Terms for Using the Delphix Engine

Ways to Access the Delphix Engine

Term | Explanation
Application Programming Interface (API)

A method by which you can access a Delphix Engine programmatically.

More generally, an API is a set of tools and protocols that enables software to access an application through programmatic calls.

Command Line Interface (CLI)

A method by which you can access a Delphix Engine using SSH, which supports input of text commands.

Graphical User Interface (GUI)

The browser-based method to direct the operations of a Delphix Engine.

Delphix Concepts

Term | Explanation
Automation Engine

A generic name for third party tools which call Delphix APIs based on external events. For example:

  • Generation of VDBs from configuration templates and scheduled refreshes
  • Time and labor savings along with independent data access
Blocks or data blocks

The mapped subsets of the entire data set, which can be individually addressed, refreshed, compressed, and accessed.

Data source

The system, typically an RDBMS, that feeds information to the Delphix Engine, and from which virtual objects are derived. A data source can be a database, an application, or a set of unstructured files; a VDB can also serve as a data source for other Delphix Engines.

Not to be confused with a dSource, which is a virtualized, compressed duplicate of this database. (See below.)

DataVisor

Orchestrates tasks such as synchronization, synthesis and recording of changes, data movement across copies, and replication.

One of the three tiers of the Delphix technology stack, which also includes the Delphix Filesystem (DxFS) and self-service management.

Delphix Connector

A service that runs on a Windows proxy host and enables communication between the Delphix Engine and the Windows target environment where it is installed.

Delphix Engine

A virtual machine containing a Delphix installation. Leverages existing SAN storage to store compressed copies of source data. Supplies data to remote servers over NFS and iSCSI.

dSource

The virtualized representation of a database that is created by the Delphix Engine. As a virtualized representation, it cannot be managed, manipulated, or examined by database tools. Because dSources are simply source data, you must leverage a VDB in order to distribute/clone/test the data being pulled in. VDBs can also later be refreshed from the dSource's data as it is pulled in.


An object on the Delphix Engine that outlines how data should be imported from a data source and managed on the Delphix Engine.
Filesystem (DxFS)

The filesystem used by the Delphix Engine. Stores and manages application data and is responsible for optimization of storage and performance.

One of the three tiers of the Delphix technology stack, along with DataVisor and self-service management.

Delphix OS (DxOS)

The underlying operating system running on a Delphix Engine.

Domain

Collective name for data objects, such as dSources, virtual databases (VDBs), users, groups, and related policies and resources.

Environment

An umbrella term for a host or a cluster.

In order to mask or provision databases and files within Delphix, you first need to create an Environment in which Delphix will store the connection information and masking and provisioning rules for those data stores. An environment can contain multiple database connections and multiple file connections.

Hooks

Delphix-initiated calls to external scripts used to automate tasks, primarily on VDBs and dSources.

Host

The physical or logical machine that contains database instances. A host can be distinguished from an environment because the host has a physical reference point in its IP address. For example, you can specify a host (by referring to its host name or IP address) where an environment is located.

HostChecker

A standalone program which validates that host machines are configured correctly before the Delphix Engine uses them as data sources and provision targets.

HostChecker is a Delphix script that you should run before adding any environment. It is available to download from the HostChecker subfolder at download.delphix.com.

LogSync

Delphix Engine feature which enables the ingestion and retention of a more granular (log-based) source TimeFlow, at the cost of additional storage. The granular TimeFlow allows you to provision or refresh a VDB from a point in time.

Object Groups

Arbitrary collections of dSources or VDBs used for organization. (User groups are not supported in the current version.)

Replicas

Copies of the source Delphix Engine information on the target Delphix Engine, which can include objects such as dSources, VDBs, and vFiles. Replicas preserve object relationships and naming conventions.

Replication

You can replicate data objects between Delphix Engines. Replication consists of a profile-replica pair. It is configured on the source Delphix Engine and copies a subset of dSources and VDBs to a target Delphix Engine. The source engine then sends incremental updates manually or according to a schedule. In addition, you can provision VDBs from replicated objects, allowing for geographical distribution of data and remote provisioning.

Replication profile

The replication configuration defined on the source Delphix Engine. Formerly called "Replication spec."

Snapshots

Snapshots represent the state of a VDB at a specific moment in time. They accumulate over time or are generated by your input. Snapshots appear as cards in the TimeFlow section of a VDB or vFiles, allowing you to choose a point in time from which to provision.

If you have LogSync enabled, you can provision from a point in time between the snapshot cards, provided that the archive logs are also available.

SnapSync

The standard process for importing data from a linked source into the Delphix Engine. An initial SnapSync is performed to create a copy of data on the Delphix Engine. Incremental SnapSyncs are performed to update the copy of data on the Delphix Engine.

Source database

The original (sometimes physical) database that is usually the production database at a site, although it could be any database that the user designates as a source. Delphix creates a dSource from the source database.

Source environment

An environment from which the Delphix Engine can capture data.

Staging environment

An environment suitable for facilitating resource-intensive portions of the linking process and SnapSync.

Target environment

A host (or cluster) on which the Delphix Engine will create VDBs.

TimeFlow

The collection of snapshots created by SnapSync policies or, in the case of Microsoft SQL Server, the pre-provisioning process. When you provision a VDB, you pick a point in the TimeFlow from which to provision.

Unstructured files

Data stored in a filesystem that is not usually accessed by a DBMS or similar software. Unstructured files can consist of anything from a simple directory to the root of a complex application like Oracle E-Business Suite. They are a dataset that is treated as simply a directory tree full of files. As with other data types, you can configure a dSource to sync periodically with a set of unstructured files external to the Delphix Engine. Virtualized unstructured files are called vFiles (see below).

Validated Sync

The process that runs on a staging database within a staging environment, and which executes either before a snapshot is taken (SQL Server) or after the snapshot is taken (Oracle).

vFiles

Virtual data files. A virtual copy of data files created and managed by Delphix. Virtual data files are fully functional read/write copies of the original data files source. They may be managed by an AppData toolkit. You can mount vFiles across one target environment or many.

vFiles (Empty)

Creating an empty vFiles places an initially-empty mount on target environments. You can then create data directly on Delphix. This is useful when you have no existing files to copy into the Delphix Engine, but you do have files which you will generate, track, and copy with vFiles. For more information, see Creating Empty vFiles from the Delphix Engine.

Virtual dataset

Comprehensive term that includes VDBs and vFiles.

V2P

Virtual to Physical. This refers to the process of moving a VDB to a physical database, for example in a disaster recovery situation.

VDB (virtual database)

A database provisioned from either a dSource or another VDB which is a full read/write copy of the source data. A VDB is created and managed by the Delphix Engine.

Data Operations

The terms below describe actions you can perform on a Delphix Engine.

Term | Explanation

Linking

The process of establishing a relationship between a data source and the Delphix Engine. After linking a data source, the Delphix Engine can import data periodically and manage it as it evolves over time. In the GUI, synonymous with "Add dSource."

Masking

Masking replaces sensitive data with fictitious data, which you can then move out of your production environment and into non-prod environments. It provides realistic data with which to work while reducing security risks. For more details about masking, see Masking Terms below.

Migrating a VDB

Moving a VDB to a new target environment.

Provision

Create a new physical or virtual database.

Refresh

Refreshing a VDB will re-provision it from the dSource. As with the normal provisioning process, you can choose to refresh the VDB from a snapshot or a specific point in time. Refreshing a VDB will delete any changes that have been made to it over time; you are essentially re-setting it to the state you select during the refresh process.

Even though the "lost" history of the VDB is not visible to the user in the UI, it is still stored in the Delphix Engine and is available via the CLI.

Refreshing is more expensive than rewinding because, during a refresh, the parent TimeFlow must be made available so that the user can choose a snapshot or a point in time (using a LogSync point).

Rewind

Rewinding a VDB rolls it back to a previous point in its TimeFlow and re-provisions the VDB. The VDB will no longer contain changes after the rewind point.

Although the VDB no longer contains changes after the rewind point, the rolled-over snapshots and TimeFlow remain in Delphix and are accessible through the Command Line Interface (CLI). For instructions on how to use these snapshots to refresh a VDB to one of its later states after it has been rewound, see the topic CLI Cookbook: Rolling Forward a VDB.

Users and Privileges

Object | User Privileges | Group Privileges

Reader

User privileges:

  • Can access statistics on the dSource, VDB, or snapshot such as usage, history, and space consumption

Group privileges:

  • Can access statistics on all dSources, VDBs, or snapshots in the group such as usage, history, and space consumption

Provisioner

User privileges:

  • Can access statistics on the dSource, VDB, or snapshot such as usage, history, and space consumption
  • Can provision VDBs from owned dSources and VDBs

Group privileges:

  • Can access statistics on all dSources, VDBs, or snapshots in the group such as usage, history, and space consumption
  • Can provision VDBs from all dSources and VDBs in the group

Owner

User privileges:

  • Can provision VDBs from owned dSources and VDBs
  • Can perform Virtual to Physical (V2P) from owned dSources
  • Can access the same statistics as a Reader
  • Can refresh or rollback VDBs
  • Can snapshot dSources and VDBs

Group privileges:

  • Can provision VDBs from all dSources and VDBs in the group
  • Can refresh or rollback all VDBs in the group
  • Can snapshot all dSources and VDBs in the group
  • Can perform Virtual to Physical (V2P) from owned dSources
  • Can apply Custom policies to dSources and VDBs
  • Can create Template policies for the group
  • Can assign Owner privileges for dSources and VDBs
  • Can access the same statistics as a Provisioner, Data Operator, or Reader

Data operator

User privileges:

  • Can access statistics on the dSource, VDB, or snapshot such as usage, history, and space consumption
  • Can refresh or rollback VDBs

Group privileges:

  • Can access statistics on all dSources, VDBs, or snapshots in the group such as usage, history, and space consumption
  • Can refresh or rollback all VDBs in the group
delphix_admin

Manages all data objects: dSources, virtual databases (VDBs), users, groups, and related policies and resources, all collectively referred to as the Delphix Engine "domain." The delphix_admin user manages the Delphix Engine domain using either the browser-based Delphix Admin application or the Command Line Interface (CLI).

sysadmin user

Can perform typical system administration duties such as: modifying NTP, SNMP, SMTP settings; managing storage; downloading support logs for the Delphix Engine; and performing upgrades and patches. The sysadmin user launches the initial Server Setup configuration application and has access to the Command Line Interface (CLI).



Types of Notification

Type | Notification

Event

Completion of some action in the Delphix Engine.

Alert

Caused by a single event on a Delphix Engine. Also known as a System Event, and viewable through the System Event Viewer.

Alert Levels:  Informational, Warning, Critical

Fault

A persistent event on a Delphix Engine that remains until the issue is resolved. The fault may be marked resolved automatically or require that it be resolved manually.

System faults describe states and configurations that may negatively impact the functionality of the Delphix Engine and which can only be resolved through active user intervention.

Examples:  Delphix Engine storage failure, Communication failures between the Delphix Engine and a source or target environment/host

Fault Levels:  Warning, Critical

 

Useful Metrics for Monitoring the Impact of your Delphix Engine

Metric | Explanation

Capacity metrics

Common metrics for a host include CPU, RAM, disk, and network utilization. In addition to the utilization of these resources, the response times (latency) are also critical, especially for disk and network.

Consolidation ratio

The amount of space that dSources and VDBs occupy compared to the amount that would be occupied by a traditional physical database.

Granularity

Retention ratio

RPO (Recovery Point Objective)

The acceptable amount of data that can be lost in the event of a failure. For example, if backups are taken once a day, then at most 24 hours of data will be lost if the system fails immediately before a regularly scheduled backup.

RTO (Recovery Time Objective)

The time required to restore the system to an operational state after a failure. For example, a recovery may require restoring data from a backup, followed by some number of manual steps to recreate the configuration in the new system. RTO is equivalent to the downtime experienced.
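
The RPO example above can be checked with a little date arithmetic. The following is a minimal sketch, not part of any Delphix tool: the function name, the backup schedule, and the failure time are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta

def data_loss_window(backup_times, failure_time):
    """Return how much data is lost: the time since the last backup before the failure."""
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        raise ValueError("no backup precedes the failure")
    return failure_time - max(earlier)

# Hypothetical schedule: one backup per day at 02:00.
backups = [datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 2, 2, 0)]
# The system fails one minute before the next scheduled backup.
failure = datetime(2024, 1, 3, 1, 59)

loss = data_loss_window(backups, failure)
meets_rpo = loss <= timedelta(hours=24)  # compare against a 24-hour RPO
```

With daily backups the loss window approaches, but never exceeds, 24 hours, which is why a 24-hour RPO is satisfied in this scenario.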


Jet Stream Terms

Term | Explanation
Jet Stream administrator

Has full access to all report data and can configure Jet Stream. Additionally, can use the Delphix data platform to:

  • add/delete Delphix Engines
  • add/delete reports
  • add/delete users
  • change tunable settings
  • add/delete tags
  • create and assign data templates and containers
Bookmark

A logical reference to a point in time on a branch. You can use it as a point from which to fork new branches. It can also be the target of policies – for example, you can arrange to keep this bookmark for two years.

Bookmarks are a way to mark and name a particular moment of data on a timeline. You can restore the active branch's timeline to the moment of data marked with a bookmark. You can also share bookmarks with other Jet Stream users, which allows them to restore their own active branches to the moment of data in your container. The data represented by a bookmark is protected and will not be deleted until the bookmark is deleted.

Branches

A time-ordered collection of timelines. They are task-specific groupings you can create within a data container. A branch is used to track a logical task, and contains a timeline of the historical data for that task. As you work within your data container, you can create more branches over time to run or complete separate tasks.

They represent a logical sequence of activity, separate from the underlying data lineage. This is the main concept introduced in the core platform and forms the basis of many higher-level primitives. Branches:

  • Can have only one timeline active at any time
  • Can be user-visible (e.g. exported to a user target) or implementation (e.g. just a staging source to run a series of transformations)

Branch group / target group

A collection of multiple branches or targets that are treated as a single entity. The system can determine compatibility automatically, or a template can be used to create more complex orchestration.

Branch timeline

A dynamic point-in-time interface for user actions within the branch. Common activities include re-setting data sources to run a test, refreshing the data container with the most current source data, and bookmarking data to share or track interesting moments of time along the branch timeline.

Data container

Consists of one or more data sources, such as databases, application binaries, or other application data. Allows users to:

  • Undo any changes to their application data in seconds or minutes

  • Have immediate access to any version of their data over the course of their project

  • Share their data with other people on their team, without needing to relinquish control of their own container

  • Refresh their data from production data without waiting for an overworked DBA
Data template

Created by the Delphix administrator, data templates consist of the data sources users need in order to manage their data playground and their testing and/or development environments. Data templates serve as the parent for a set of data containers that the administrator assigns to Jet Stream users. Additionally, data templates enforce the boundaries for how data is shared. Data can only be shared directly with other users whose containers were created from the same parent data template.

Jet Stream data user

Jet Stream data users have access to production data provided in a data container. The data container provides these users with a playground in which to work with data using the Self-Service Toolbar.

Mission Control Terms

Term | Explanation
Admin user

Admin users have full access to all report data and can configure the Mission Control appliance. For example, they can:

  • add/delete reports
  • add/delete users
  • change tunable settings
  • add/delete tags
Auditor user

Auditor users can only view report data. Admin users can also assign auditor users a set of tags (arbitrary text strings) to restrict which report data they can view. There is no default auditor account. The first Delphix Administrator will need to create the auditor users and will be responsible for creating their user IDs and passwords.

Reports

Reports present aggregated data across all connected Delphix Engines.

Interactive reports such as Storage Breakdown and History display interactive graphical representations of historical and current storage usage across all Delphix Engines you are monitoring. These visualizations of storage and disk capacity enable you to analyze and mediate storage across Delphix Engines from multiple perspectives.

Tagging

You can tag Delphix Engines in Mission Control with a set of arbitrary text strings. You can then filter reports to show only data from Delphix Engines with a certain tag. You can also use tags to restrict auditor users so that they can only view data from Delphix Engines with that tag.


Masking Terms

Engine Types

Term | Explanation

Standalone Masking Engine

This Engine is deployed as an OVA (Open Virtualization Archive) in a compatible hypervisor and contains the Masking Engine GUI. From here you can create masking jobs, mask data, and administer your Masking Engine. This Engine type is suitable for Delphix installations below Delphix 5.0.

Combined Delphix Engine and Masking Engine

This Engine is built into your Delphix 5.0 and above installation. It contains both the Delphix Engine GUI and Masking Engine GUI, and allows tighter integration between Delphix's Data as a Service and Masking features.

What Goes into Masking

Term | Explanation

Application

The IT assets (programs, data, processes) that support a business function. For example, if a bank offers payroll services to its clients, there would be an application in its IT division to support that business.

Connector

Where the Delphix Engine stores JDBC database connection information. Builds a connection between the source database and the masking interface.

Domain (Masking)

The domain represents the correlation between various sensitive data categories and the masking algorithm which will be applied to them.

Masking environment

Defines the scope of work in the Masking Engine. A collection of masking constructs (connectors, rule sets / inventories, and jobs) that support masking for a given application environment. In order to mask databases and files within the Delphix Engine, you first need to create an environment in which the Delphix Engine will store the connection information and masking rules for those data stores. An environment can contain multiple database connections and multiple file connections. Environments are connected to applications for informational purposes.

In-place masking

"Mask data in place" refers to updating a database with masked data. This includes reading data from the table defined in the rule set, masking the data in the Masking Engine, and updating the tables with the masked data.

Inventory

The Delphix Engine automatically stores the masking rules for each sensitive column in the Delphix repository database in the environment's "inventory." When you select a table to mask, its columns will appear, and you can select them for masking. Afterwards, you can edit the columns with an appropriate algorithm required for masking.

Masked VDB

A virtual database with masked data.

On-the-fly masking

With on-the-fly masking, you specify the source of the information to be masked, and where the masked data will be loaded. On-the-fly masking is an Extract Transform Load (ETL) process.

Profile data

A way to identify the location of Non-Public Information (NPI) or sensitive data if you are unsure of what data needs to be masked in the first place. Profiling data is not necessary when you have already identified the sensitive data you need to mask.

Rule set

Points to a collection of tables or flat files that the Masking Engine uses for masking data. The rule set allows you to identify, select, and configure which tables you need to mask. For those tables that do not have a primary key defined, you can define a logical key with a combination of columns (or ROWID for Oracle database).

Selective Data Distribution (SDD)

Permits the distribution of masked data between Delphix Engines. The sources received on a target Delphix Engine do not include the original parent source, thereby making the original source inaccessible from the target.

Masking Algorithms

Algorithm | Description

Secure Lookup

The most commonly used type of algorithm. It is easy to generate and works with different languages. When this algorithm replaces real, sensitive data with fictional data, it is possible that it will create repeating data patterns, known as “collisions.” For example, the names “Tom” and “Peter” could both be masked as “Matt.” Because names and addresses naturally recur in real data, this mimics an actual data set. However, if you want the masking engine to mask all data into unique outputs, you should use segmented mapping, described below.
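
To see why collisions arise, consider the following conceptual sketch. It is not the Masking Engine's implementation; it assumes a lookup-style scheme in which a keyed hash of the input selects one entry from a fixed list of replacement values. The list contents, the key, and the function name are hypothetical.

```python
import hashlib
import hmac

# Hypothetical replacement values; a real lookup list would be much larger.
LOOKUP_VALUES = ["Matt", "Jasmine", "Ragu"]

def secure_lookup(value: str, key: bytes = b"demo-key") -> str:
    """Deterministically pick a replacement from LOOKUP_VALUES using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(LOOKUP_VALUES)
    return LOOKUP_VALUES[index]

names = ["Tom", "Peter", "David", "Melissa", "Ana"]
masked = {name: secure_lookup(name) for name in names}
# Five inputs share three possible outputs, so at least two inputs must collide.
```

Because there are more inputs than replacement values, some inputs necessarily map to the same output, while each individual input always receives the same replacement.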
Segmented Mapping

Produces no overlaps or repetitions in the masked data. You can mask up to a maximum of 36 values using segmented mapping. You might use this method if you need columns with unique values, such as Social Security Numbers, primary key columns, or foreign key columns. You can set the algorithm to produce alphanumeric results (letters and numbers) or only numbers.

Mapping

Allows you to state what values will replace the original data. There will be no collisions in the masked data, because it always matches the same input to the same output. For example “David” will always become “Ragu,” and “Melissa” will always become “Jasmine.” The algorithm checks whether an input has already been mapped; if so, the algorithm changes the data to its designated output. You can use a mapping algorithm on any set of values, of any length, but you must know how many values you plan to mask.


NOTE: When you use a mapping algorithm, you cannot mask more than one table at a time. You must mask tables serially.
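
The behavior of a mapping algorithm can be sketched as a one-to-one assignment from inputs to a pre-supplied pool of replacements. This is an illustration only, under the assumption that replacements are handed out in order; the class name `MappingMasker` and its methods are hypothetical and do not correspond to any Delphix API.

```python
class MappingMasker:
    """One-to-one masking: each distinct input gets its own unused replacement."""

    def __init__(self, replacements):
        self._unused = list(replacements)  # pool of replacement values supplied up front
        self._mapping = {}                 # input -> assigned replacement

    def mask(self, value):
        # Reuse the designated output if this input has already been mapped.
        if value in self._mapping:
            return self._mapping[value]
        if not self._unused:
            raise ValueError("no unused replacement values left")
        self._mapping[value] = self._unused.pop(0)
        return self._mapping[value]

masker = MappingMasker(["Ragu", "Jasmine"])
first = masker.mask("David")
second = masker.mask("Melissa")
again = masker.mask("David")  # same input, same output as before
```

Running out of replacement values raises an error in this sketch, which mirrors the requirement that you know how many values you plan to mask.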
Binary Lookup

Replaces objects that appear in object columns. For example, if a bank has an object column that stores images of checks, you can use a binary lookup algorithm to mask those images. The Delphix Engine cannot change data within images themselves, such as the names on X-rays or driver’s licenses. However, you can replace all such images with a new, fictional image. This fictional image is provided by the owner of the original data.

Tokenization

The only type of algorithm that allows you to reverse its masking. For example, you can use a tokenization algorithm to mask data before you send it to an external vendor for analysis. The vendor can then identify accounts that need attention without having any access to the original, sensitive data. Once you have the vendor’s feedback, you can reverse the masking and take action on the appropriate accounts.

Like mapping, a tokenization algorithm creates a unique token for each input such as “David” or “Melissa.” The Delphix Engine stores both the token and the original so that you can reverse masking later.
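
The reversible nature of tokenization can be illustrated with a small token vault that stores both directions of the mapping. This is a conceptual sketch, not the Delphix implementation: the `TokenVault` class is hypothetical, and random hexadecimal tokens are an assumption made for the example.

```python
import secrets

class TokenVault:
    """Stores token <-> original pairs so that masking can be reversed later."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        if value not in self._token_by_value:
            token = secrets.token_hex(8)
            while token in self._value_by_token:  # keep tokens unique
                token = secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        return self._value_by_token[token]

vault = TokenVault()
david_token = vault.tokenize("David")
melissa_token = vault.tokenize("Melissa")
original = vault.detokenize(david_token)  # reverses the masking
```

A vendor who receives only the tokens cannot recover the originals without access to the vault, while the data owner can reverse the masking at any time.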

Min Max

Values that are extremely high or low in certain categories allow viewers to infer someone’s identity, even if their name has been masked. For example, a salary of $1 suggests a company’s CEO, and some age ranges suggest higher insurance risk. You can use a min max algorithm to move all values of this kind into the midrange.
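
One simple way to picture this is clamping: any value outside a configured range is pulled to the nearest bound. The sketch below assumes clamping as the strategy, which may differ from what the Masking Engine actually does; the bounds and salary figures are hypothetical.

```python
def min_max(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(value, high))

# Hypothetical salaries, including two outliers that could identify a person.
salaries = [1, 48_000, 95_000, 2_500_000]
masked_salaries = [min_max(s, 30_000, 200_000) for s in salaries]
```

Values already inside the range pass through unchanged, so only the identifying outliers are altered.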
Data Cleansing

Does not perform any masking. Instead, it standardizes varied spellings, misspellings, and abbreviations for the same name. For example, “Ariz,” “Az,” and “Arizona” can all be cleansed to “AZ.”
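
The Arizona example can be sketched as a lookup from known variants to one canonical value. This is a minimal illustration; the variant table and the `cleanse` function are hypothetical and not part of the Masking Engine.

```python
# Hypothetical table of known variants, keyed by lowercase spelling.
STATE_VARIANTS = {"ariz": "AZ", "az": "AZ", "arizona": "AZ"}

def cleanse(value):
    """Return the canonical form of a known variant, or the value unchanged."""
    key = value.strip().rstrip(".").lower()
    return STATE_VARIANTS.get(key, value)

cleansed = [cleanse(v) for v in ["Ariz", "Az", "Arizona", "Texas"]]
```

Values that are not in the variant table are left as they are, which matches the point that data cleansing standardizes rather than masks.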
Free Text Redaction

Helps you remove sensitive data that appears in free-text columns such as “Notes.” This type of algorithm requires some expertise to use, because you must set it to recognize sensitive data within a block of text.

One challenge is that individual words might not be sensitive on their own, but together they can be. The algorithm uses profiler sets to determine what information it needs to mask. You can decide which expressions the algorithm uses to search for material such as addresses. For example, you can set the algorithm to look for “St,” “Cir,” “Blvd,” and other words that suggest an address. You can also use pattern matching to identify potentially sensitive information. For example, a number that takes the form 123-45-6789 is likely to be a Social Security Number.

You can use a free text redaction algorithm to show or hide information by displaying either a “black list” or a “white list.”
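
The pattern-matching idea described above can be sketched with regular expressions. This is an illustration only, not the Masking Engine's profiler: the two patterns, the `redact` function, and the sample note are hypothetical, and the address pattern is deliberately simplistic. It follows the "black list" approach, hiding whatever matches a listed pattern.

```python
import re

# A number of the form 123-45-6789 is likely to be a Social Security Number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# A house number, one or more capitalized words, then a street suffix.
ADDRESS_PATTERN = re.compile(r"\b\d+\s+(?:[A-Z][a-z]+\s+)+(?:St|Cir|Blvd)\b\.?")

def redact(text, mask="XXXX"):
    """Replace every match of the listed patterns with a mask string."""
    text = SSN_PATTERN.sub(mask, text)
    return ADDRESS_PATTERN.sub(mask, text)

note = "Customer at 42 Elm St called about SSN 123-45-6789."
redacted = redact(note)
```

Note that "Elm" and "St" are harmless on their own; only the combination with a house number is treated as sensitive, which reflects the challenge described above.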