Version: Next

Health check

Nethermind provides a health check service for monitoring the liveness and readiness of your node. By default, it is served at the /health endpoint of the JSON-RPC server.

Basic configuration

Important

The health check service requires the JSON-RPC API to be enabled.

The health check service is disabled by default. To enable it, set the HealthChecks.Enabled configuration option as follows:

nethermind \
  -c mainnet \
  --data-dir path/to/data/dir \
  --healthchecks-enabled

Once Nethermind is up and running, the health check service can be accessed at the /health endpoint:

curl localhost:8545/health

A healthy node returns a response similar to the following:

HTTP 200 OK
{
  "status": "Healthy",
  "totalDuration": "00:00:00.0006931",
  "entries": {
    "node-health": {
      "data": {
        "IsSyncing": false,
        "Errors": []
      },
      "description": "The node is now fully synced with a network. Peers: 89.",
      "duration": "00:00:00.0003797",
      "status": "Healthy",
      "tags": []
    }
  }
}

An unhealthy node returns a response similar to the following:

HTTP 503 Service Unavailable
{
  "status": "Unhealthy",
  "totalDuration": "00:00:00.0009477",
  "entries": {
    "node-health": {
      "data": {
        "IsSyncing": false,
        "Errors": [ "NoPeers" ]
      },
      "description": "The node is now fully synced with a network. Node is not connected to any peers.",
      "duration": "00:00:00.0001356",
      "status": "Unhealthy",
      "tags": []
    }
  }
}
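For automated monitoring, the status code is enough to tell the two cases apart, while the body carries the reasons. The following is a minimal Python sketch rather than an official client. It assumes the endpoint is reachable at localhost:8545 and that the response has the shape shown above; the function names summarize_health and fetch_health are illustrative.

```python
import json
import urllib.error
import urllib.request


def summarize_health(status_code, body):
    """Return (healthy, errors) from an HTTP status code and a JSON body string.

    Assumes the response shape shown above: an "entries" object whose
    "node-health" entry carries an "Errors" list under "data".
    """
    report = json.loads(body)
    entry = report.get("entries", {}).get("node-health", {})
    errors = entry.get("data", {}).get("Errors", [])
    healthy = status_code == 200 and report.get("status") == "Healthy"
    return healthy, list(errors)


def fetch_health(url="http://localhost:8545/health", timeout=5.0):
    """Query the health endpoint and summarize the result.

    Connection failures (urllib.error.URLError) are not caught here,
    so an unreachable node surfaces as an exception to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return summarize_health(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # urllib raises on HTTP 503, but the JSON body is still readable.
        return summarize_health(error.code, error.read().decode("utf-8"))
```

Calling fetch_health() against a running node returns a pair such as (False, ["NoPeers"]) for the unhealthy response above.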

It is also possible to replace the default /health endpoint with a custom one using the HealthChecks.Slug configuration option. For example:

--healthchecks-slug /my/custom/endpoint

Configuring a webhook

The health check service can be configured to send notifications to a webhook on node failure or recovery. This is achieved with the HealthChecks.UIEnabled, HealthChecks.WebhooksEnabled, and HealthChecks.WebhooksUri configuration options. Optionally, the webhook payload data can be customized with the HealthChecks.WebhooksPayload and HealthChecks.WebhooksRestorePayload configuration options for failure and recovery events respectively.

The following example demonstrates how to configure a basic Slack webhook:

nethermind \
  -c mainnet \
  --data-dir path/to/data/dir \
  --healthchecks-enabled \
  --healthchecks-uienabled \
  --healthchecks-webhooksenabled \
  --healthchecks-webhooksuri https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX \
  --healthchecks-webhookspayload '{"text": "Node is unhealthy"}' \
  --healthchecks-webhooksrestorepayload '{"text": "Node is healthy"}'
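When the node is launched from a script, quoting JSON payloads by hand is easy to get wrong. The sketch below builds the same arguments as the example above in Python and serializes the payloads with json.dumps. The helper name webhook_args is hypothetical, and the {"text": ...} payload shape is simply the one used in the Slack example.

```python
import json


def webhook_args(uri, failure_text, recovery_text):
    """Build the webhook-related command-line arguments as a list.

    The flag names are the ones used in the example above; the payloads
    are JSON objects with a single "text" field, as in that example.
    """
    return [
        "--healthchecks-enabled",
        "--healthchecks-uienabled",
        "--healthchecks-webhooksenabled",
        "--healthchecks-webhooksuri", uri,
        "--healthchecks-webhookspayload", json.dumps({"text": failure_text}),
        "--healthchecks-webhooksrestorepayload", json.dumps({"text": recovery_text}),
    ]
```

Passing the resulting list to subprocess.run, for example as ["nethermind", "-c", "mainnet", *webhook_args(...)], bypasses the shell, so quotes inside the messages need no extra escaping.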

Monitoring storage space

Monitoring the available storage space is a crucial aspect of running a node. Nethermind provides a feature to track the free storage space and take action when the available space falls below a certain threshold. The following options are available:

HealthChecks.LowStorageCheckAwaitOnStartup to check for low disk space on startup and suspend Nethermind until enough space is available
HealthChecks.LowStorageSpaceShutdownThreshold to shut down Nethermind when the percentage of available disk space falls below the specified threshold
HealthChecks.LowStorageSpaceWarningThreshold to issue a warning when the percentage of available disk space falls below the specified threshold
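The relationship between the two thresholds can be illustrated with a short Python sketch. It mirrors the behavior described above and is not Nethermind's implementation: the function names are hypothetical, and the threshold values are whatever percentages you pass in.

```python
import shutil


def classify_free_space(free_bytes, total_bytes, warning_percent, shutdown_percent):
    """Return "shutdown", "warning", or "ok" for the given free space.

    Both thresholds are percentages of total disk space, matching the
    description of the options above.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_percent = 100.0 * free_bytes / total_bytes
    if free_percent < shutdown_percent:
        return "shutdown"
    if free_percent < warning_percent:
        return "warning"
    return "ok"


def classify_path(path, warning_percent, shutdown_percent):
    """Classify the free space of the disk that holds the given path."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free, usage.total, warning_percent, shutdown_percent)
```

With a warning threshold of 5 and a shutdown threshold of 1, a disk with 3% free space is classified as "warning" and one with 0.5% free space as "shutdown".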

Monitoring blocks

Another critical aspect of running a node is monitoring the production and processing of blocks. For that, Nethermind provides the following options:

HealthChecks.MaxIntervalWithoutProcessedBlock to specify the max interval without processing a block before the node is considered unhealthy
HealthChecks.MaxIntervalWithoutProducedBlock to specify the max interval without producing a block before the node is considered unhealthy
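Both options describe the same kind of check: compare the time since the last block with a maximum interval. A minimal Python sketch of that comparison follows. It assumes the interval is expressed in seconds and treats None as "check disabled"; both are choices of this sketch, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def block_interval_exceeded(last_block_at, max_interval_seconds, now=None):
    """Return True if more than max_interval_seconds passed since last_block_at.

    last_block_at and now are timezone-aware datetimes. Passing None as
    max_interval_seconds disables the check (an assumption of this sketch).
    """
    if max_interval_seconds is None:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_block_at > timedelta(seconds=max_interval_seconds)
```

For example, with a maximum interval of 60 seconds, a node whose last block was processed 90 seconds ago would be reported as exceeding the interval.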

Monitoring consensus client

The health check service can also monitor the communication between Nethermind and the consensus client. Use the HealthChecks.MaxIntervalClRequestTime configuration option to set the maximum interval between consensus client requests before the node is considered unhealthy.
