Checking the health of the node

Hi, we would like to put our node behind a load balancer, but for this we need to be able to check the health of the node via an HTTP API endpoint.
Is there a way to invoke methods of the gRPC protocol using a URL? In particular, we would like to invoke the Check method.
Additionally, the documentation has no details about what the NodeHealthRequest should look like (Protocol Documentation)…

Hi,

I’m not sure that it’s possible to use the gRPC protocol directly as you describe, but depending on the load balancer you are using, you may be able to use gRPC health checks. (For instance, NGINX and GCLB support this.)

In the API you link to, the NodeHealthRequest is just an empty message. The node also supports the standard gRPC health checking protocol, which is what most load balancers will likely expect. (Either leave the service empty or specify “concordium.v2.Queries”; both are equivalent.)
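To make "just an empty message" concrete, here is a minimal Python sketch of what those requests look like on the wire. It assumes only the standard gRPC framing (a 1-byte compression flag plus a 4-byte big-endian length) and the standard grpc.health.v1 HealthCheckRequest, whose single field is `string service = 1`. The function names are made up for illustration and are not part of any Concordium SDK.

```python
import struct


def encode_health_check_request(service: str = "") -> bytes:
    """Protobuf-encode a grpc.health.v1 HealthCheckRequest (string service = 1)."""
    data = service.encode("utf-8")
    if not data:
        # proto3 omits default values, so an empty service encodes to zero bytes,
        # exactly like the empty NodeHealthRequest.
        return b""
    if len(data) > 127:
        raise ValueError("this sketch only handles single-byte length prefixes")
    # 0x0A = field number 1 with wire type 2 (length-delimited)
    return b"\x0a" + bytes([len(data)]) + data


def grpc_frame(message: bytes) -> bytes:
    """Add the gRPC prefix: compression flag 0, then 4-byte big-endian length."""
    return struct.pack(">BI", 0, len(message)) + message
```

So an empty request body is just the five-byte frame `00 00 00 00 00`, and naming the service only adds the tag, the length, and the service string.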

Is there a specific load balancer that you are using?

Hi,

We checked and yes, our load balancer allows us to make gRPC calls.

We also checked the health of the node using Concordium’s SDK and we were able to do it for the testnet RPC (node.testnet.concordium.com/), but it failed for Concordium’s mainnet RPC (grpc.mainnet.concordium.software) and for our mainnet RPC.

Is there any configuration we need to set to enable the health check? We used the suggested standard configuration.

Regards

Hello.

The health check is an integral part of the node and is always present.

So a few things you can look into:

Does the health check on the node actually return a valid response code? Try something like: grpcurl -plaintext -d '{}' localhost:20000 concordium.health.Health/Check. On a successful response you will see an empty object and a zero exit code.

If not, the issue should be reported in the logs. If you run the node under systemd, you could do something like journalctl -u <SERVICE_NAME>.service

If the health check does return a valid response code, then the issue lies in the configuration of the load balancer. I can probably help you there too, but I would need to know your cloud provider and the load balancer type.
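If the load balancer can only probe plain HTTP, one possible workaround is a tiny bridge that runs the grpcurl command above and maps its exit code to an HTTP status. This is only a sketch under assumptions: grpcurl is installed next to the node, the node listens on localhost:20000 without TLS, and the port 8080 and all function names are hypothetical choices of mine.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Command copied from the post above; adjust host and port for your node.
GRPCURL_CMD = ["grpcurl", "-plaintext", "-d", "{}",
               "localhost:20000", "concordium.health.Health/Check"]


def node_is_healthy(cmd=GRPCURL_CMD, timeout: float = 5.0) -> bool:
    """Return True only if the command exits with code 0 within the timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, permission problem, or a hung call: treat as unhealthy.
        return False
    return result.returncode == 0


def http_status(healthy: bool) -> int:
    """Map health to the status code a load balancer probe expects."""
    return 200 if healthy else 503


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = http_status(node_is_healthy())
        body = b"ok\n" if status == 200 else b"unhealthy\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve health probes until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling serve() on the node host and pointing the load balancer's HTTP probe at that port would then return 200 or 503. A native gRPC health check on the load balancer is still the simpler option where it is supported.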

One difference between node.testnet.concordium.com and grpc.mainnet.concordium.software is that the latter has TLS enabled, so you will need to configure the load balancer to use TLS. (E.g. in NGINX, specify grpcs://grpc.mainnet.concordium.software.)
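A quick way to tell which kind of endpoint you are dealing with is to attempt a TLS handshake against it. The sketch below uses only the Python standard library. The helper names are hypothetical, and note that it also returns False when the certificate fails verification, not only when the endpoint is plaintext.

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a verified TLS handshake with host:port succeeds."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2"])  # gRPC runs over HTTP/2
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers refused or reset connections, timeouts, and ssl.SSLError.
        return False


def nginx_scheme(tls: bool) -> str:
    """Pick the scheme for an NGINX grpc_pass target."""
    return "grpcs://" if tls else "grpc://"
```

For example, nginx_scheme(speaks_tls(host, port)) gives the prefix to put in front of the upstream address. The port to probe depends on your deployment and is not something this sketch can know.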

We are doing a PoC before actually configuring this on the LB.
We are testing with the SDK, so we are not really sure what the actual response is.
The error we are getting from the SDK is the following:

RpcError: read ECONNRESET
      at Object.callback (node_modules/@protobuf-ts/grpc-transport/build/commonjs/grpc-transport.js:37:27)
      at Object.onReceiveStatus (node_modules/@grpc/grpc-js/src/client.ts:360:26)
      at Object.onReceiveStatus (node_modules/@grpc/grpc-js/src/client-interceptors.ts:458:34)
      at Object.onReceiveStatus (node_modules/@grpc/grpc-js/src/client-interceptors.ts:419:48)
      at /Users/dzariusz/projects/umbrella/pegasus/node_modules/@grpc/grpc-js/src/resolving-call.ts:132:24
      at processTicksAndRejections (node:internal/process/task_queues:77:11)

Could you please provide a bit more context on what exactly you are doing that leads to this error?

I will complete this thread on behalf of @ed-umb

Please correct me if I am wrong:

As it turns out, copy-pasting the download URL into the console replaced _ with -, so the URL was malformed.
Everything should be fine now. The indexer is running smoothly, with an up-to-date version 6.3.0 node doing the heavy lifting of registering data on Concordium.