Node Troubleshooting 101
Introduction
Hi all,
Time and time again I am asked the same questions, so I thought it was about time I posted some basic troubleshooting tips that you will need to run to get to the bottom of the issue your node is experiencing.
It may not be apparent in the early stages of your node operating days, but eventually you will be able to recognise what you need to do to help us in the Discord channel or at the Official HoriZEN support website so that we can support you better.
"Cert valid: false not found"
What it means
This indicates that you are not looking at the details page for your node on its home server or the server it is currently connected to. For instance, if your node reports that it is connected to ts1.eu but you are looking at the ts4.na tracking server, as in the following image, the page will show "Cert valid: false not found" for your node.
However, if you change the URL to the tracking server your node is connected to (either its home server or its current server), the details page will say "Not Checked", as can be seen in the following image.
What causes it
The tracking server validates certs in batches, and "Not Checked" just means that it has not yet reached the batch your node is in. You may have had a brief transient disconnection, or the node was rebooted, and the tracking server needs to re-validate your cert.
What do you need to do?
If there is no open certificate exception then there is nothing you need to do, but if there is an open certificate exception then you will need to investigate the cause.
Continuous sys downtime exceptions
What it means
"sys" exceptions mean that for a period of time the tracking server was not communicating with the tracker on your node.
What causes it
There are a number of possible reasons for this exception: your node may be using a nodejs version that is unstable with the tracker, your VPS instance or local server may be having network connectivity issues, or your tracker may have stalled.
What do you need to do?
Verify that your nodejs version is at the last known working version. Please refer to this blog post to check and resolve your nodejs version.
If your nodejs version is < v10.12 then we need to look into the tracker logs. A good place to start is to query the log from a minute or two before the reported downtime. Let's say the start of the downtime is listed on the tracker website as "2018-11-07 01:17:50"; this timestamp is in UTC. You could then run the following command on your node:
sudo journalctl -u zentracker --since "2018-11-07 01:15:50"
which starts the log 2 minutes before the start of the downtime. If you don't see log entries that correspond to the UTC time, you may need to convert the UTC time to your node's timezone. To get your node's timezone, run:
date
If your tracker has stalled, you can run:
sudo systemctl restart zentracker && sudo journalctl -fu zentracker
to restart your tracker and immediately follow the log so you can see what is happening.
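The timestamp arithmetic for the journalctl query above can be done by hand, but a small helper avoids off-by-a-timezone mistakes. The following is a minimal sketch, assuming GNU date (the usual case on a Linux VPS). The function names since_stamp and utc_to_local are hypothetical and are not part of the tracker tooling.

```shell
# since_stamp (hypothetical helper): print the UTC timestamp N minutes
# before a UTC downtime stamp. N defaults to 2. Assumes GNU date.
since_stamp() {
  minutes=${2:-2}
  stamp_epoch=$(date -u -d "$1 UTC" +%s) || return 1
  date -u -d "@$((stamp_epoch - minutes * 60))" '+%Y-%m-%d %H:%M:%S'
}

# utc_to_local (hypothetical helper): convert a UTC stamp to the node's
# local timezone, which is what journalctl --since expects by default.
utc_to_local() {
  date -d "$1 UTC" '+%Y-%m-%d %H:%M:%S'
}

# Example usage on the node (not executed here):
#   start=$(since_stamp "2018-11-07 01:17:50" 2)
#   sudo journalctl -u zentracker --since "$(utc_to_local "$start")"
```

If your node's clock is already set to UTC, utc_to_local returns the stamp unchanged, so the same two steps should work either way.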
Challenge Exception: "18: bad-txns-joinsplit-requirements-not-met"
What it means
This error means that the previous challenge did not complete successfully due to reason "18: bad-txns-joinsplit-requirements-not-met".
What causes it
This primarily occurs when your last challenge (shielded transaction) was included in an orphaned block.
What do you need to do?
Unfortunately there is no way to prevent this from occurring, but if you see it you can resolve it by running:
sudo systemctl stop zend zentracker && sleep 4 && zend -rescan
This will rescan the blockchain for all transactions related to addresses in your wallet.
You can run:
watch -n 5 zen-cli getinfo
which will poll the zend daemon every 5 seconds. You will see output like the following:
error code: -28
error message:
Rescanning...
Once you see a change from that output to something like the following:
{
"version": 2001550,
"protocolversion": 170002,
"walletversion": 60000,
"balance": 0.00000000,
"blocks": 411372,
"timeoffset": 0,
"connections": 10,
"proxy": "127.0.0.1:9050",
"difficulty": 3424935.245685966,
"testnet": false,
"keypoololdest": 1507873934,
"keypoolsize": 101,
"paytxfee": 0.00000000,
"relayfee": 0.00000100,
"errors": ""
}
Press Ctrl+C and run the following:
sudo systemctl start zend zentracker && sudo journalctl -fu zentracker
This command will start the zend and tracker services and then immediately follow the log so you can see what is happening. More than likely you will see a challenge start once the tracker reconnects to the tracking server, as it tries to resolve your open challenge exception.
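If you would rather not sit watching the output, the "wait until the rescan finishes" step above could be scripted. This is only a sketch: rescan_finished and wait_for_rescan are hypothetical helper names, and the check simply assumes that normal getinfo output contains a "blocks" field (as in the sample above) while the rescanning error does not.

```shell
# rescan_finished (hypothetical helper): succeeds when the text passed in
# looks like normal getinfo JSON rather than the "Rescanning..." error.
rescan_finished() {
  case "$1" in
    *'"blocks"'*) return 0 ;;
    *)            return 1 ;;
  esac
}

# wait_for_rescan (hypothetical helper): poll zen-cli every 5 seconds,
# the same interval as the watch command, until the rescan is done.
wait_for_rescan() {
  while true; do
    getinfo_out=$(zen-cli getinfo 2>&1)
    if rescan_finished "$getinfo_out"; then
      printf '%s\n' "$getinfo_out"
      return 0
    fi
    sleep 5
  done
}

# Example usage on the node (not executed here):
#   wait_for_rescan && echo "Rescan complete"
```

Once wait_for_rescan returns, you can carry on with the service start command as described above.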
Challenge Exception: "allowed run time exceeded"
What it means
This means that the current system resource configuration for your node is not able to perform the challenge in the allowed time.
What causes it
If your node was previously passing challenges and now is not, there are a number of possible causes, including (but not limited to) over-provisioning of the VPS host hardware or insufficient resource allocation to your instance.
What do you need to do?
Increasing your physical RAM should, in most cases, improve the challenge time.
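Before asking your provider for more RAM, it helps to know what the instance actually has. The sketch below reads the standard Linux /proc/meminfo format and prints total RAM and swap in MiB. mem_summary is a hypothetical helper name, and the output says nothing about what amount is sufficient for a challenge.

```shell
# mem_summary (hypothetical helper): read meminfo-style text on stdin and
# print total RAM and total swap in MiB (values in /proc/meminfo are in kB).
mem_summary() {
  awk '/^MemTotal:/  { printf "RAM: %d MiB\n",  $2 / 1024 }
       /^SwapTotal:/ { printf "Swap: %d MiB\n", $2 / 1024 }'
}

# Example usage on the node (not executed here):
#   mem_summary < /proc/meminfo
```

Quoting these two numbers when you ask for help should make it easier for others to judge whether memory is the likely bottleneck.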
This is a basic list of some common node problems. Should you have any others that you would like investigated, post a comment and I will endeavour to update this blog post.