Overview
This topic describes the purpose and function of system faults.
System faults describe states and configurations that may negatively impact the functionality of the Delphix Engine and that can only be resolved through active user intervention. When you log in to the Delphix Admin application as a delphix_admin, the number of outstanding system faults appears on the right-hand side of the navigation bar at the top of the screen. Faults serve as a record of all issues impacting the Delphix Engine and can never be deleted. However, ignored and resolved faults are not displayed in the faults list.
System Faults indicator in the navigation bar
Delphix Object Based Environment Monitor Faults
The environment monitor previously created faults only for hosts and sources. Several faults apply more logically to other Delphix objects, such as repositories, which are the database installation files. Posting those faults against sources results in fault duplication. The environment monitor now posts faults against the correct objects and re-associates the offending faults with them. Consequently, users see fewer errors, and those errors are easier to diagnose.
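The duplication described above can be illustrated with a small sketch. It is not Delphix code: the source and repository names, the fault title, and both functions are hypothetical, and the sketch assumes that several sources share one repository.

```python
# Hypothetical data: three sources that all use the same repository
# (database installation). None of these names come from the Delphix Engine.
sources = {"src-a": "repo-1", "src-b": "repo-1", "src-c": "repo-1"}


def post_against_sources(sources, broken_repo, title):
    """Earlier behavior as described: one fault per source on the broken repository."""
    return [(src, title) for src, repo in sources.items() if repo == broken_repo]


def post_against_repository(sources, broken_repo, title):
    """Current behavior as described: a single fault on the repository itself."""
    affected = any(repo == broken_repo for repo in sources.values())
    return [(broken_repo, title)] if affected else []


old_faults = post_against_sources(sources, "repo-1", "Installation problem")
new_faults = post_against_repository(sources, "repo-1", "Installation problem")
print(len(old_faults), len(new_faults))
```

Under these assumptions, one installation problem produces one fault per source in the first case and a single fault on the repository in the second.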
Viewing Faults
- In the top navigation bar, click Faults.
- Click any fault in the list to expand it and see its details.
Each fault comprises six parts:
- Severity – How much of an impact the fault will have on the system. A fault can have a severity of either Warning or Critical.
  - A Warning Fault implies that the system can continue despite the fault but may not perform optimally in all scenarios.
  - A Critical Fault describes an issue that breaks certain functionality and must be resolved before some or all functions of the Delphix Engine can be performed.
- Date – The date that the Delphix Engine diagnosed the fault.
- Target Object – The object against which the fault was posted. Faults are posted against hosts for incorrect environment configurations, against sources for problems with the database, and against repositories for issues with the installation.
- Title – A short descriptive summary of the fault.
- Details – A detailed summary of the cause of the fault.
- User Action – The action you can take to resolve the fault.
Parts of each system fault
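For readers who process fault information in scripts, the six parts listed above can be modeled as a simple record. This is a minimal sketch, not the Delphix API: the class name, field names, and example values are all assumptions chosen to mirror the list.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from enum import Enum


class Severity(Enum):
    WARNING = "Warning"    # system continues, possibly not optimally
    CRITICAL = "Critical"  # some functionality is broken until resolved


@dataclass(frozen=True)
class Fault:
    """Hypothetical record mirroring the six parts of a fault."""
    severity: Severity
    date: datetime       # when the fault was diagnosed
    target_object: str   # host, source, or repository
    title: str
    details: str
    user_action: str


# Illustrative values only; they are not taken from a real Delphix Engine.
example = Fault(
    severity=Severity.CRITICAL,
    date=datetime(2024, 1, 15, 9, 30),
    target_object="repository: example-install",
    title="Example fault title",
    details="Example description of the cause.",
    user_action="Example corrective action.",
)

print([f.name for f in fields(Fault)])
```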
Addressing Faults
You can choose to ignore a fault if it meets the following criteria:
- The fault is caused by a well-understood issue that cannot be changed.
- Its impact on the Delphix Engine is well understood and acceptable.
In this case, the fault will not be re-diagnosed even if the fault condition persists, and you will receive no further notifications.
To address a fault, follow the steps below.
- In the top menu bar, click Faults.
- In the list of faults, click a fault date/name to view the fault details.
- If the fault condition has been resolved, click Mark Resolved.
Note that if the fault condition persists it will be detected in the future and re-diagnosed.
- If the fault condition describes a configuration with well-understood impact to the Delphix Engine that cannot be changed, you can ignore the fault by clicking Ignore.
Note that an ignored fault will not be diagnosed again even if the underlying condition persists.
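The difference between Mark Resolved and Ignore can be sketched as a small state model. This is an illustration of the behavior described above, not the Delphix Engine's implementation: the class, method names, and the "low-space" condition are hypothetical, and the sketch additionally assumes that a condition with an outstanding fault is not posted a second time.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    RESOLVED = "resolved"
    IGNORED = "ignored"


class FaultLog:
    """Hypothetical fault record; entries are never deleted."""

    def __init__(self):
        self.records = []  # each entry is [condition, Status]

    def diagnose(self, condition):
        """Called when the condition is detected; returns True if a fault is posted."""
        for cond, status in self.records:
            # Ignored faults are never re-diagnosed. Skipping conditions that
            # are already active is an assumption of this sketch.
            if cond == condition and status in (Status.ACTIVE, Status.IGNORED):
                return False
        self.records.append([condition, Status.ACTIVE])
        return True

    def mark(self, condition, status):
        """Mark the active fault for a condition as resolved or ignored."""
        for record in self.records:
            if record[0] == condition and record[1] is Status.ACTIVE:
                record[1] = status

    def visible(self):
        """Only outstanding faults appear in the faults list."""
        return [cond for cond, status in self.records if status is Status.ACTIVE]


log = FaultLog()
first = log.diagnose("low-space")             # fault posted
log.mark("low-space", Status.RESOLVED)
rediagnosed = log.diagnose("low-space")       # condition persists: posted again
log.mark("low-space", Status.IGNORED)
after_ignore = log.diagnose("low-space")      # ignored: not posted again
print(first, rediagnosed, after_ignore, log.visible(), len(log.records))
```

In this model, the resolved fault and the ignored fault both remain in the record, but neither appears in the visible list.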
By default, when a critical or warning fault occurs, the Delphix Engine immediately sends an email to the delphix_admin. Make sure you have configured an SMTP server and defined an appropriate email address for delphix_admin. See Setting Up the Delphix Engine for more information.
By default, emails will also be sent for critical or warning alerts (also known as events). You can modify the default behavior by changing the alert profile with the CLI. See Creating Alert Profiles in the CLI Cookbook for more information.
Fault Lifecycle Example