Troubleshooting Guide

This guide helps you diagnose and resolve common issues with ChainLaunch.

Common Issues

Node Won't Start

Symptoms:

  • Node status remains "Stopped" after clicking start
  • Error message in logs

Diagnosis:

  1. Check node logs: Open the nodes list, select the failing node, and review the logs at the bottom of the node page.

  2. Check system resources:

    • Verify that enough CPU and memory are available
    • Check disk space: df -h
  3. Check port availability:

    # Linux/macOS
    lsof -i :30303 # P2P port
    lsof -i :8545 # JSON-RPC port
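
The disk-space check from step 2 is easy to script. The sketch below is a hypothetical helper, not part of ChainLaunch: `flag_full_filesystems` reads `df -P` output on standard input and prints every mount point at or above a usage threshold. The 90% value in the usage line is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: list mount points whose usage is at or above a threshold.
# Reads `df -P` output on stdin; $1 is the threshold in percent.
# Assumes mount points contain no spaces.
flag_full_filesystems() {
  awk -v limit="$1" '
    NR > 1 {                      # skip the df header line
      pct = $5
      sub(/%/, "", pct)           # "95%" -> "95"
      if (pct + 0 >= limit + 0) print $6, pct "%"
    }'
}

# Usage: df -P | flag_full_filesystems 90
```

An empty result means no filesystem has reached the threshold, so disk space is unlikely to be what keeps the node from starting.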

Solutions:

| Error Message | Solution |
|---|---|
| Port already in use | Change node port in configuration or kill process using port |
| Insufficient disk space | Free up disk space or mount larger volume |
| Permission denied | Check file permissions on node data directory |
| Connection refused | Check firewall rules and network connectivity |
| Out of memory | Increase memory allocation or reduce node count |

Nodes Not Discovering Each Other

Symptoms:

  • Block height not increasing
  • Peer count = 0
  • Consensus not starting (Fabric/Besu)

Diagnosis:

  1. Check peer count:

    # For Besu via RPC
    curl -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d '{
    "jsonrpc": "2.0",
    "method": "net_peerCount",
    "params": [],
    "id": 1
    }'
  2. Check network connectivity between nodes:

    # Test if node A can reach node B
    ping node-b-ip
    telnet node-b-ip 30303
  3. Check enode URLs (Besu):

    # Get node's enode
    curl -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d '{
    "jsonrpc": "2.0",
    "method": "admin_nodeInfo",
    "params": [],
    "id": 1
    }'
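
Once `admin_nodeInfo` returns an enode URL, it helps to check which host and port the node actually advertises. Note that the call only works when the ADMIN API is enabled on the RPC endpoint. The sketch below uses hypothetical helper names and assumes the usual `enode://<node-id>@<host>:<port>` layout, optionally followed by `?discport=<port>`.

```shell
#!/bin/sh
# Hypothetical helper: print "<host> <port>" for an enode URL of the form
# enode://<node-id>@<host>:<port>[?discport=<port>]
parse_enode() {
  rest=${1#enode://}        # drop the scheme
  rest=${rest#*@}           # drop the node ID
  rest=${rest%%\?*}         # drop an optional ?discport=... suffix
  printf '%s %s\n' "${rest%:*}" "${rest##*:}"
}

# Warn when the advertised host is one that remote peers cannot dial.
check_enode_host() {
  set -- $(parse_enode "$1")
  case $1 in
    127.*|0.0.0.0|localhost) echo "unreachable: enode advertises $1" ;;
    *) echo "ok: $1:$2" ;;
  esac
}
```

If the host turns out to be a loopback or wildcard address, peers on other machines cannot connect to it even when the firewall is open. The reported host and port are also the ones to use in the connectivity test from step 2.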

Solutions:

| Issue | Solution |
|---|---|
| Firewall blocking ports | Open P2P port (30303) and RPC port (8545) in firewall |
| Nodes on different networks | Verify genesis block hash matches across nodes |
| Bootnode not running | Start bootnode and configure its address in other nodes |
| DNS not resolving | Use IP addresses instead of hostnames |
| Network policy restricting traffic | Review Kubernetes network policies or security groups |

Backup Issues

Symptoms:

  • Backup job fails with "connection refused" or "access denied"
  • Scheduled backups not running
  • Restore operation fails

Diagnosis:

  1. Check backup target configuration:

    curl http://localhost:8100/api/v1/backups/targets
  2. Verify S3 connectivity:

    # Test S3 access with AWS CLI
    aws s3 ls s3://your-backup-bucket/
  3. Check backup status:

    curl http://localhost:8100/api/v1/backups

Solutions:

| Issue | Solution |
|---|---|
| Access Denied on S3 | Verify IAM credentials and bucket policy allow read/write access |
| NoSuchBucket | Ensure the S3 bucket exists and the region is correct |
| RequestTimeTooSkewed | Sync the system clock (ntpdate or timedatectl) |
| Scheduled backup not triggered | Verify the backup schedule is active and the cron expression is valid |
| Restore fails with checksum error | The backup may be corrupted; try restoring from a different snapshot |
| credential not found | Re-configure the backup target with valid credentials (static, instance role, or named profile) |
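
S3 returns RequestTimeTooSkewed when the request timestamp is more than 15 minutes away from the server's time. To confirm that clock skew is the cause before resyncing, compare the local clock with a remote timestamp. The helper below is a hypothetical sketch; the function name and the use of the HTTP `Date` header as a reference are assumptions, not ChainLaunch features.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two Unix timestamps differ by more than
# a limit in seconds (default 900, the 15-minute window S3 enforces).
clock_skew_exceeds() {
  skew=$(( $1 - $2 ))
  if [ "$skew" -lt 0 ]; then
    skew=$(( -skew ))
  fi
  [ "$skew" -gt "${3:-900}" ]
}

# Usage (GNU date assumed for parsing the HTTP Date header):
#   remote=$(date -d "$(curl -sI https://s3.amazonaws.com | sed -n 's/^[Dd]ate: //p' | tr -d '\r')" +%s)
#   clock_skew_exceeds "$(date +%s)" "$remote" && echo "clock skew too large"
```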

Monitoring Issues

Symptoms:

  • Prometheus metrics endpoint not responding
  • Grafana dashboards show "No data"
  • Node metrics not updating

Diagnosis:

  1. Check if monitoring is enabled:

    curl http://localhost:8100/api/v1/settings
  2. Test the Prometheus metrics endpoint directly:

    curl http://localhost:9090/metrics
  3. Verify node metrics are exposed:

    # For Besu nodes
    curl http://localhost:9545/metrics

    # For Fabric peers
    curl http://localhost:9443/metrics
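
A metrics endpoint can respond and still expose nothing useful. The following sketch inspects the Prometheus text format returned by the `curl` commands above. Both function names are hypothetical, and `metric_value` assumes that samples carry no trailing timestamp.

```shell
#!/bin/sh
# Hypothetical helpers for Prometheus text-format output read from stdin.

# Count sample lines (everything except comments and blank lines).
count_samples() {
  grep -Ev '^(#|$)' | wc -l | tr -d ' '
}

# Print the value of the first sample whose name (without labels) matches $1.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/[{].*/, "", metric)     # strip a {label="..."} suffix
      if (metric == name) { print $NF; exit }
    }'
}
```

For example, `curl -s http://localhost:9545/metrics | count_samples` should print a non-zero number on a healthy Besu node. Running `metric_value` twice a few seconds apart on a counter shows whether metrics are still updating; the metric name to pass depends on what the node exports.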

Solutions:

| Issue | Solution |
|---|---|
| Prometheus not starting | Check that port 9090 is not in use; verify Prometheus configuration file is valid |
| Metrics endpoint returns 404 | Ensure monitoring was enabled when creating the node |
| Dashboards show "No data" | Verify Prometheus is scraping the correct targets; check the time range in Grafana |
| High cardinality warnings | Reduce the number of custom labels or increase Prometheus memory |
| Metrics stop updating | Restart the node; check if the process is still running |

Authentication Issues

Symptoms:

  • Login returns 401 Unauthorized
  • Session expires unexpectedly
  • API key not accepted

Diagnosis:

  1. Test authentication:

    # Basic auth
    curl -u admin:password http://localhost:8100/api/v1/nodes

    # API key
    curl -H "X-API-Key: clpro_..." http://localhost:8100/api/v1/nodes
  2. Check server logs for auth errors: Look for "authentication failed" or "token expired" messages in the ChainLaunch server logs.
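
When testing authentication from the command line, the HTTP status code alone usually tells you which row of the solutions table applies. The sketch below is a hypothetical helper that maps a status code to a hint. It relies only on curl's `-w '%{http_code}'` option, which prints `000` when no HTTP response was received.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code to the matching hint from this guide.
explain_auth_status() {
  case $1 in
    2??) echo "authenticated" ;;
    401) echo "credentials rejected: check the auth method, password, or API key" ;;
    403) echo "authenticated but not authorized: check the user's RBAC role" ;;
    000) echo "no HTTP response: server unreachable" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage: print only the status code, then explain it.
#   code=$(curl -s -o /dev/null -w '%{http_code}' -u admin:password http://localhost:8100/api/v1/nodes)
#   explain_auth_status "$code"
```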

Solutions:

| Issue | Solution |
|---|---|
| 401 Unauthorized with correct credentials | Ensure you are using the correct authentication method (Basic Auth vs API Key) |
| Session expires too quickly | Check session timeout configuration in settings |
| API key rejected | Verify the key has not been revoked; regenerate if necessary |
| OIDC login fails | Verify the OIDC provider configuration (issuer URL, client ID, client secret) |
| 403 Forbidden | The authenticated user lacks the required RBAC role (ADMIN, OPERATOR, or VIEWER) for the requested resource |

Besu Node Issues

Symptoms:

  • Besu validator not producing blocks
  • Consensus stalled across the network
  • Peers not connecting to the network

Diagnosis:

  1. Check peer count:

    curl -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d '{
    "jsonrpc": "2.0",
    "method": "net_peerCount",
    "params": [],
    "id": 1
    }'
  2. Check sync status:

    curl -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d '{
    "jsonrpc": "2.0",
    "method": "eth_syncing",
    "params": [],
    "id": 1
    }'
  3. Check latest block number:

    curl -X POST http://localhost:8545 \
    -H "Content-Type: application/json" \
    -d '{
    "jsonrpc": "2.0",
    "method": "eth_blockNumber",
    "params": [],
    "id": 1
    }'
  4. Check node status via ChainLaunch API:

    curl http://localhost:8100/api/v1/nodes
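
The JSON-RPC calls above return quantities as hex strings (for example `"0x1a"`), and `eth_syncing` returns `false` once the node is in sync. A minimal sketch for turning the peer count or block number into a decimal value without `jq` follows; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for reading JSON-RPC responses without jq.

# Extract a quoted string "result" from a JSON-RPC response on stdin.
# Handles only string results (e.g. "0x1a"), not booleans or objects.
rpc_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Convert a 0x-prefixed hex quantity to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}
```

Piping the `eth_blockNumber` response through `rpc_result` and then `hex_to_dec` twice, a few block periods apart, shows whether the chain is advancing. An unchanged number combined with a non-zero peer count points at consensus rather than connectivity.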

Solutions:

| Issue | Solution |
|---|---|
| Validator not producing blocks | Ensure the validator key is correctly configured and the node is part of the validator set |
| Consensus stalled | Check that at least two-thirds of the validators are online; IBFT 2.0 and QBFT stop producing blocks below that threshold |
| Genesis block mismatch | All nodes must use the same genesis file; re-initialize nodes with the correct genesis |
| Peer discovery not working | Verify the bootnode enode URL is correct and reachable; check that the P2P port (30303) is open |
| Invalid block errors | Check that all validators are running the same Besu version |
| High memory usage | Increase the JVM heap size (-Xmx) or enable pruning to reduce state storage |
| RPC endpoint not responding | Verify RPC is enabled (--rpc-http-enabled) and the host/port settings are correct |
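
For a stalled IBFT 2.0 or QBFT network, it is worth working out how many validators must be online for a given validator-set size. The sketch below assumes the quorum is ceil(2N/3) for N validators, which matches the two-thirds rule; treat it as a quick estimate rather than a statement of Besu's exact internals.

```shell
#!/bin/sh
# Sketch: validators needed for consensus, assuming quorum = ceil(2N/3).
bft_quorum() {
  echo $(( (2 * $1 + 2) / 3 ))
}

# Validators that may be offline (crashed, not malicious) while blocks
# are still produced.
bft_max_offline() {
  echo $(( $1 - $(bft_quorum "$1") ))
}
```

With 4 validators, `bft_quorum 4` prints 3 and `bft_max_offline 4` prints 1, so losing a second validator halts block production.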

Fabric-X Issues

Fabric-X has a few platform-specific failure modes that don't apply to classic Fabric. The Fabric-X Quickstart troubleshooting table covers the most common ones; here's the summary:

| Symptom | Likely cause | Fix |
|---|---|---|
| Quickstart phase 5 (join) times out on the first node | Cold Docker Desktop bind-mount cache | Retry the failing node individually; subsequent ones will be fast once the cache is warm. The default per-node timeout is 240s. |
| dial ... context deadline exceeded on namespace create | Fabric-X local-dev mode not enabled (macOS / Windows Docker Desktop) | Set CHAINLAUNCH_FABRICX_LOCAL_DEV=true on the server process and recreate the network. Addresses get rewritten to host.docker.internal so containers can dial the host. |
| invalid mount config ... bind source path does not exist | Cold Docker Desktop bind-mount cache | Same as the join timeout: retry. |
| Port already in use on join | Another Fabric-X network or service on the same host | Run with a different --base-port band, or --clean to wipe the prior bundle. |
| Network status stuck at genesis_block_created | Normal. The network row's status field doesn't transition to ACTIVE; container readiness is reported on each node row instead. | No action needed. Check Nodes filtered by platform FABRICX for per-node status. |
| Stale TLS certs after re-running --clean | Bind-mount data outlives the container | Re-run with --data-path /path/to/server/data so --clean also purges fabricx-orderers/ and fabricx-committers/ directories. |
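
Two of the fixes above amount to "retry once the bind-mount cache is warm". A generic retry wrapper makes that repeatable. This is a hypothetical sketch: the wrapper is not part of ChainLaunch, and the command that joins the failing node is left as a placeholder because it depends on how you run the quickstart.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Succeeds as soon as the command succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage: retry 3 10 <command that joins the failing node>
```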

For deeper Fabric-X diagnostics see Fabric-X Architecture (port layout and component data flow) and Fabric-X Monitoring (Prometheus /metrics endpoints per role).

See Also