This topic describes how to design, build and install a data management toolkit.
Introduction
A toolkit is basically a plugin that allows Delphix to work with a particular data platform. Sometimes, a Delphix operation will require special coordination with the data platform. The toolkit's job is to handle this coordination.
A toolkit has three main tasks:
- To gather information about an installation of a data platform on a particular environment.
- To coordinate syncing of data from a data source.
- To coordinate provisioning of data to a target environment.
A toolkit is made up of a set of scripts, plus some additional data specifying various properties of the toolkit.
Throughout this documentation, we will refer to a made-up data platform called DelphixDB. This tutorial walks you through creating a data management toolkit for DelphixDB, including the design, implementation, and installation processes.
Data Management Design
Before writing a data management toolkit, you should fully understand the format of your data and the constraints on it. Typically, this process requires answering at least the following questions:
- On what types of environments does your data live?
- For example, Unix, Windows, or both?
- What is the "root point of capture" for your data on a source environment?
- For example, is the root point of capture the value of the ORACLE_HOME environment variable? Is it in a well-defined location like /usr/local/pgsql/data? Will you need to ask the user where the data lives?
- Does access to your data depend on other data or tools?
- For example, the Oracle binaries are required to capture data from an Oracle database. Therefore, the Delphix Engine assumes access to RMAN and sqlplus. These binaries are also required to provision Oracle database data to an environment because the Delphix Engine assumes it can automate recovery of data files and open the database.
- How can data be captured from the source environment?
- For example, can this data simply be captured using rsync or robocopy? Or is a more sophisticated approach necessary to pull in data?
- Is it necessary to quiesce data so that it can be captured?
- For example, do running processes need to be stopped? Do buffers need to be flushed to ensure data consistency?
- What monitoring will the data capture process need? What monitoring will provisioned copies of data need?
- For example, should we monitor the health of a native replication process used to capture data? What is the definition of a "running" virtual copy of data?
- In addition to supplying a "root point of capture" for data on a source environment, what parameters must users supply to customize the management of virtual data?
- For example, does the Delphix Engine need a password to access or configure the data?
- What is necessary to configure data when it is copied?
- For example, do you need to run any post-clone configuration to ensure the data is usable?
- Is it meaningful to create a dataset "from scratch" in Delphix?
- Or must any Delphix-managed dataset derive from a pre-existing external dataset?
Building the Toolkit
Once you have a firm understanding of your data, you can proceed with building the toolkit.
To get started, make a new directory to act as the root of all toolkit files you'll create throughout this tutorial. Then, follow along the links below.
Writing Toolkit Code
Introduction
A toolkit author must write three kinds of things: data descriptions, code that runs on the Delphix Engine, and code that runs outside of the Delphix Engine.
Data descriptions are all provided together in a single JSON file in the top-level directory of the toolkit.
Code that runs on the Delphix Engine is written in Lua, and is provided by a number of different "hook" files, as described in the links below.
All but the very simplest toolkits will need to run code on the source and target environments (that is, not on the Delphix Engine). This code is called as needed from the Lua hooks. The vast majority of this code will be written as Bash scripts for Unix platforms, and as PowerShell scripts for Windows platforms.
Details
Toolkit Metadata: Writing a main.json
The main.json file outlines a toolkit's type and name, the set of parameters users must fill in when capturing or provisioning data, and schemas for discovering data.
Build A Toolkit: Discovering Data Sources and Dependencies
Discovery is the process by which data sources and dependencies are identified on environments.
Build a Toolkit: Linking Data Sources
Linking is the process by which data is imported from discovered data sources into the Delphix Engine.
Build a Toolkit: Provisioning Virtual Data Sources
Provisioning is the process by which virtual copies of data are made available via target environments.
Build a Toolkit: Versioning and Upgrading
This page documents the rules for toolkit versioning, and how to provide upgrade logic.
Build a Toolkit: Other Topics
This page documents topics of interest to toolkit writers that are not specific to discovery, linking, or provisioning.
Building the Toolkit File
All of the toolkit contents mentioned above must be packaged into a single JSON file before they can be uploaded to a Delphix Engine. This is done using the build-toolkit.py script provided by the Toolkit DevKit. This script assumes you have a "base directory" containing a separate subdirectory for each toolkit. Here is an example of building a single file from a toolkit defined in a subdirectory named sample.
build-toolkit.py /path/to/my/base/dir sample sample-toolkit.json
Installing the Toolkit
Upload a data management toolkit using the upload-toolkit.py script included in the Toolkit DevKit.
upload-toolkit.py sample-toolkit.json my-delphix-engine.delphix.com
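The build and upload steps can be chained from a small wrapper. The following is a minimal sketch, not part of the Toolkit DevKit: it assumes both scripts are executable and on the PATH, and it takes the argument order from the two example commands above. The function names build_command, upload_command, and build_and_upload are hypothetical.

```python
import subprocess
from typing import Callable, List


def build_command(base_dir: str, toolkit_subdir: str, output_file: str) -> List[str]:
    """Return the argv for build-toolkit.py, in the order used in the example above."""
    return ["build-toolkit.py", base_dir, toolkit_subdir, output_file]


def upload_command(toolkit_file: str, engine_host: str) -> List[str]:
    """Return the argv for upload-toolkit.py, in the order used in the example above."""
    return ["upload-toolkit.py", toolkit_file, engine_host]


def build_and_upload(
    base_dir: str,
    toolkit_subdir: str,
    output_file: str,
    engine_host: str,
    run: Callable = subprocess.run,
) -> None:
    """Build the toolkit file, then upload it.

    Assumption: both scripts exit with a non-zero status on failure, so
    check=True stops the sequence before uploading a file that failed to build.
    The `run` parameter exists only so the sequence can be exercised without
    the DevKit installed.
    """
    run(build_command(base_dir, toolkit_subdir, output_file), check=True)
    run(upload_command(output_file, engine_host), check=True)
```

For example, build_and_upload("/path/to/my/base/dir", "sample", "sample-toolkit.json", "my-delphix-engine.delphix.com") would run the same two commands shown above, one after the other.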
Updating an Installed Toolkit
The Delphix Engine uses a toolkit's name to uniquely identify it. Uploading a toolkit with the same name as a toolkit that is already installed will overwrite the toolkit on the Delphix Engine.
To update a toolkit, upload a new toolkit with a name that matches a previously installed toolkit. Updates to a toolkit will take effect immediately. You do not need to restart the Delphix Engine.
If the set of parameters in the toolkit's main.json has changed, the following rules are used to update parameter values:
- Any ToolkitParameters with values that are already filled in will keep their values after the update.
- Any ToolkitParameters introduced in the update will take on their default values until you set them explicitly.
- Any ToolkitParameters whose name properties have been changed will be treated as having been deleted and added again.
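The three rules above amount to a merge keyed on parameter name. The sketch below illustrates that reading and is not the Delphix Engine's implementation. It assumes each parameter definition is a dictionary with a "name" key and a "default" key; the "default" key and the function name update_parameter_values are hypothetical.

```python
from typing import Any, Dict, List


def update_parameter_values(
    old_values: Dict[str, Any],
    new_parameters: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Compute parameter values after a toolkit update.

    old_values maps parameter names to the values filled in before the update.
    new_parameters is the parameter list from the updated main.json; the
    "default" key is an assumed field name used only for illustration.
    """
    updated: Dict[str, Any] = {}
    for param in new_parameters:
        name = param["name"]
        if name in old_values:
            # Rule 1: a value that was already filled in is kept.
            updated[name] = old_values[name]
        else:
            # Rule 2: a newly introduced parameter takes its default.
            updated[name] = param.get("default")
    # Rule 3 falls out of matching by name: a renamed parameter's old value
    # is dropped (deleted) and the new name is treated as newly added.
    return updated
```

With old values {"port": 5432, "user": "admin"} and an updated parameter list that keeps port, renames user to username (default "delphix"), and adds timeout (default 30), the result is {"port": 5432, "username": "delphix", "timeout": 30}.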
If the discovery definition's sourceConfigSchema or repositorySchema has changed, you will not be allowed to update the toolkit. Instead, you must delete the old version of the toolkit from the Delphix Engine and reinstall the toolkit as new.
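Before uploading, it can help to check locally whether either schema has changed. The sketch below assumes the discovery definitions of the installed and new toolkit are available as Python dictionaries (for example, loaded from each main.json), and that a change means any structural difference. How the Delphix Engine itself compares schemas is not stated here, so treat this as an approximation; the function names are hypothetical.

```python
from typing import Any, Dict, List

# The two schemas named in the update rule above.
SCHEMA_KEYS = ("sourceConfigSchema", "repositorySchema")


def changed_schemas(
    installed_discovery: Dict[str, Any],
    new_discovery: Dict[str, Any],
) -> List[str]:
    """Return the names of the schemas that differ between the two definitions.

    Assumption: dictionaries are compared structurally, so key order is
    ignored but any added, removed, or altered field counts as a change.
    """
    return [
        key
        for key in SCHEMA_KEYS
        if installed_discovery.get(key) != new_discovery.get(key)
    ]


def can_update_in_place(
    installed_discovery: Dict[str, Any],
    new_discovery: Dict[str, Any],
) -> bool:
    """True when neither schema changed, so an in-place update should be allowed."""
    return not changed_schemas(installed_discovery, new_discovery)
```

If changed_schemas returns a non-empty list, the toolkit would need to be deleted and reinstalled rather than updated.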
Deleting an Installed Toolkit
You can delete a toolkit through the command line interface (CLI).
- Log in to the CLI.
- cd toolkit
- select <toolkit name>, where <toolkit name> is the name of the toolkit to be deleted.
- delete
- commit