This feature enables self-monitoring and diagnosability of Delphix Engines through native integration with Splunk Enterprise. Once you provide details about your Splunk instance, each Delphix Engine automatically sends structured JSON logs that capture its activity to Splunk. These logs include Delphix events (Actions, Job Events, Faults, and Alerts), performance metrics (CPU, disk, network, TCP, dataset, NFS, iSCSI), and capacity metrics. With this data in Splunk, you can search and visualize actionable information in a centralized, comprehensive view of Delphix activity, cross-reference information from multiple Delphix Engines, and build your own operational intelligence for your Delphix installation.

Prerequisites

Before you configure the Delphix Engine, configure and make a note of the following in Splunk. Refer to the Splunk documentation for detailed steps on how to configure these values.


  1. In the Splunk web UI, enable SSL in your global HTTP Event Collector (HEC) settings. This is optional but is a security best practice.

  2. The Splunk hostname or IP address.

  3. The HEC Port number for your Splunk instance (default 8088).

  4. Enable the HTTP Event Collector on Splunk, and create a new HEC Token with a new Splunk index set as an allowed index for the token. Make sure Enable Indexer Acknowledgement is unchecked for the token.


    If you wish, you can use a separate Splunk index for performance and capacity metrics (otherwise, the same index will be used for both events and metrics). If you are using Splunk 7.0+, it is recommended that you create this second index as a special “Metrics” type index that is optimized for indexing and searching metrics data.


    Note the HEC Token Value and the Allowed Indexes for the token.
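
Before entering these values in Delphix, it can help to confirm that the token and index accept events. The following is a minimal Python sketch, not part of the Delphix product, that builds a single-event request for the standard Splunk HEC endpoint (/services/collector/event). The host name, token, and index name shown are placeholders, and the function name build_hec_request is hypothetical.

```python
import json
import urllib.request


def build_hec_request(host, token, index, port=8088, use_ssl=True):
    """Build (but do not send) a single-event request for the Splunk HEC."""
    # The scheme must match the SSL setting in the global HEC settings.
    scheme = "https" if use_ssl else "http"
    url = f"{scheme}://{host}:{port}/services/collector/event"
    payload = {
        "event": {"message": "HEC connectivity check"},  # arbitrary test payload
        "index": index,  # must be an allowed index for the token
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",  # HEC token authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values (assumptions); substitute your own before sending.
    req = build_hec_request(
        "splunk.example.com",
        "00000000-0000-0000-0000-000000000000",
        "delphix_events",
    )
    print(req.get_method(), req.full_url)
    # To actually send the event: urllib.request.urlopen(req, timeout=10)
```

The sketch only constructs the request so that it can be inspected safely. Sending it against a Splunk instance that uses a self-signed certificate would additionally require an appropriate SSL context.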

Configuring Delphix for Splunk

  1. Log in to the Delphix Server Setup UI as the sysadmin.

  2. From the Preferences menu select Splunk Configuration.

  3. In the Splunk Configuration window, enter your Splunk values.
    To reduce the volume of data that will be sent to Splunk, you can optionally uncheck Enable Metrics.


    Host: The Splunk hostname or IP address.

    HEC Port: The TCP port number for the Splunk HTTP Event Collector (HEC).

    HEC Token: The token for the Splunk HTTP Event Collector (HEC).

    Main Index: The Splunk index that events are sent to. Must be set as an allowed index for the HEC token.

    Events Push Frequency: The frequency, in seconds, at which events are pushed to Splunk.

    Enable SSL: Whether to use HTTPS to connect to Splunk. Must match your HTTP Event Collector settings in Splunk.

    Metrics Index: The Splunk index that metrics are sent to. If none is specified, the Main Index is used for metrics as well. Must be set as an allowed index for the HEC token.

    Metrics Push Frequency: The frequency, in seconds, at which performance metrics are pushed to Splunk.

    Performance Data Granularity: The resolution of the performance metrics data sent to Splunk. This controls how frequently snapshots of system performance data are taken.

  4. Click Send Test Data to verify your provided values.
    This will send a test event to the provided token and indexes.

  5. Click Save to enable the Splunk configuration and begin sending all new Actions, Job Events, Faults, Alerts, and Metrics to your Splunk instance.
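
The relationship between the Main Index, the Metrics Index, and the HEC token's allowed indexes can be sketched as follows. This is an illustrative Python model only: the class and field names are hypothetical and do not correspond to a Delphix API, and the default values are assumptions. It encodes the fallback rule described above (metrics go to the Main Index when no Metrics Index is set) and assumes that no metrics index is needed when Enable Metrics is unchecked.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SplunkSettings:
    """Hypothetical container mirroring the Splunk Configuration fields."""
    host: str
    hec_token: str
    main_index: str
    hec_port: int = 8088                  # Splunk's default HEC port
    metrics_index: Optional[str] = None   # optional separate index for metrics
    enable_ssl: bool = True               # assumed default, for illustration
    enable_metrics: bool = True           # assumed default, for illustration

    def effective_metrics_index(self) -> Optional[str]:
        """Index that metrics land in, or None when metrics are disabled."""
        if not self.enable_metrics:
            return None
        # Fall back to the main index when no metrics index is specified.
        return self.metrics_index or self.main_index

    def required_allowed_indexes(self) -> Set[str]:
        """Indexes that must be set as allowed indexes for the HEC token."""
        indexes = {self.main_index}
        metrics = self.effective_metrics_index()
        if metrics:
            indexes.add(metrics)
        return indexes
```

For example, a configuration with main_index="delphix_events" and metrics_index="delphix_metrics" requires both indexes to be allowed for the token, while leaving metrics_index unset requires only the main index.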

Using Search

Use Splunk search to analyze your data and to enumerate the items in a metrics index. For more information about searching a metrics index, refer to the Splunk documentation.

Search Examples - Metrics

The following examples show how to view metrics on Splunk 7.x.

To get a list of all Metrics:

| mcatalog values(metric_name)

To get a list of all dimensions of a given metric, for example CPU utilization percentage:


| mcatalog values(_dims) where metric_name="system.cpu.util.pct"

To view the average values of overall CPU utilization percentage across all hosts with a span of 30 seconds:


| mstats avg(_value) WHERE index=delphix_metrics AND metric_name=system.cpu.util.pct span=30s
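
If you generate searches like the one above for several metrics or time spans, a small helper can compose the search string. The following Python sketch is illustrative only; the function name build_mstats_query and its parameters are hypothetical, and it produces nothing more than the text of the search.

```python
def build_mstats_query(metric_name, index, span="30s", agg="avg"):
    """Compose an mstats search string in the same shape as the example above."""
    return (
        f"| mstats {agg}(_value) "
        f'WHERE index={index} AND metric_name="{metric_name}" '
        f"span={span}"
    )


if __name__ == "__main__":
    # delphix_metrics is the index name used in the example above.
    print(build_mstats_query("system.cpu.util.pct", "delphix_metrics"))
```

The resulting string can be pasted into the Splunk search bar.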

You can also display the results in a chart by using a wildcard that matches all CPU metrics:


| mstats perc85(_value) AS val85 avg(_value) AS val where metric_name="system.cpu.*" span=1s by data.kernel, data.user, data.idle
| eval total='data.kernel' + 'data.user' + 'data.idle'
| eval sys_pct=(('data.kernel'/total) * 100) 
| eval usr_pct=(('data.user'/total) * 100) 
| eval idle_pct=(('data.idle'/total) * 100) 
| timechart span=10m avg(val) as "cpu.overall", avg(val85) as "cpu.overall 85th Percentile", avg(sys_pct) as "cpu.system", avg(usr_pct) as "cpu.user", avg(idle_pct) as "cpu.idle"


This type of search can be used to stack the individual CPU metrics (system, user, and idle), which together add up to 100%. Here is a sample screenshot of the above stacked CPU metrics chart from the Delphix Engines.
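
The percentage arithmetic performed by the eval steps in the search above can be sketched outside Splunk as follows. This is a minimal Python illustration with made-up input values; the function name cpu_shares is hypothetical, and the output keys reuse the series names from the timechart command.

```python
def cpu_shares(kernel, user, idle):
    """Convert raw kernel/user/idle values into percentages of their total."""
    total = kernel + user + idle
    if total == 0:
        raise ValueError("total CPU time is zero; percentages are undefined")
    return {
        "cpu.system": (kernel / total) * 100,
        "cpu.user": (user / total) * 100,
        "cpu.idle": (idle / total) * 100,
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    shares = cpu_shares(kernel=1, user=1, idle=2)
    print(shares)
    print(sum(shares.values()))
```

Because each value is divided by the same total, the three percentages always sum to 100 (up to floating-point rounding), which is what makes a stacked chart meaningful.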


Related Links