Core Channel Node Troubleshooting
This guide covers common issues you might encounter when setting up or running an Aleph Cloud Core Channel Node (CCN), along with solutions to resolve them.
Installation Issues
Docker Installation Problems
If you encounter issues installing Docker or Docker Compose:
# Check if Docker is running
systemctl status docker
# Install Docker from official repository if the package manager version fails
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
Permission Issues
If you encounter permission errors when running Docker commands:
# Add your user to the Docker group
sudo usermod -a -G docker $(whoami)
# Apply the new group membership (requires logging out and back in)
# Alternatively, you can use this command to apply changes in the current session
newgrp docker
Configuration Errors
If your node fails to start due to configuration issues:
Check for syntax errors in your config.yml file:
yamllint config.yml
Verify the Ethereum API URL is correct and accessible:
curl -s <your-ethereum-api-url>
Ensure your node keys were generated correctly:
ls -la keys/ # Should show: node-pub.key and node-secret.pkcs8.der
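The same check can be scripted for use in a health check. This is a sketch: the filenames are the ones listed above, and the `keys/` path is assumed to be relative to your node directory.

```shell
# Sketch: report whether both CCN key files are present in a given directory.
check_keys() {
  dir=$1
  if [ -f "$dir/node-pub.key" ] && [ -f "$dir/node-secret.pkcs8.der" ]; then
    echo "keys present"
  else
    echo "keys missing"
  fi
}

# Usage, from your node directory:
#   check_keys keys
```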
Docker Resource Management
Checking Disk Usage
Docker containers and images can consume significant disk space over time:
# Check Docker disk usage
docker system df
# Detailed breakdown of space used by containers
docker system df -v
Cleaning Unused Docker Resources
Free up disk space by removing unused Docker resources:
# Remove stopped containers
docker container prune -f
# Remove unused images
docker image prune -a -f
# Remove unused volumes
# Make sure ALL of your node's containers are running first, so volumes still in use are not removed
docker volume prune -f
# Remove unused networks
docker network prune -f
# Full system prune (containers, images, and networks; volumes are only included if you add --volumes)
docker system prune -a -f
Setting Up Log Rotation
Prevent Docker logs from consuming too much disk space:
Create or edit /etc/docker/daemon.json:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Restart Docker to apply changes:
sudo systemctl restart docker
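A malformed daemon.json can prevent the Docker daemon from starting, so it is worth validating the file before the restart. A minimal sketch, assuming jq is installed:

```shell
# Sketch: check that a daemon.json file parses as valid JSON before
# restarting Docker.
validate_daemon_json() {
  if jq -e . "$1" >/dev/null 2>&1; then
    echo "valid"
  else
    echo "invalid"
  fi
}

# Usage:
#   validate_daemon_json /etc/docker/daemon.json
```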
IPFS Container Resource Management
The IPFS container can sometimes consume excessive system resources, which may impact the overall performance of your node or server. Resource limiting is typically only necessary if you observe high CPU or memory usage that affects system stability.
When to Consider Resource Limiting
You should consider implementing resource constraints for the IPFS container if:
- System monitoring shows the IPFS container regularly using more than 70-80% of available CPU
- The server becomes unresponsive or significantly slows down during peak times
- Other critical services on the same host are starved of resources
To check resource usage:
# Monitor container resource usage in real-time
docker stats
# Or check with top and filter for ipfs processes
top -c | grep ipfs
Limiting Memory and CPU Usage
If you determine that resource limiting is necessary, modify your docker-compose.yml
file to add resource constraints for the IPFS container, using the older Docker Compose syntax where cpus and mem_limit sit directly on the service:
services:
  ipfs:
    # ... other configurations ...
    environment:
      - IPFS_PROFILE=server
      - GOMAXPROCS=4        # ~50% of the host's CPU cores
      - GOMEMLIMIT=23500MiB # ~25% of total RAM, minus 500 MiB of headroom
    # ... other configurations ...
    command: [... command specification ...]
    cpus: 4.0          # ~50% of the host's CPU cores
    mem_limit: 24G     # ~25% of total RAM
    memswap_limit: 24G # Same value as mem_limit (allows no additional swap)
The above settings:
- Limits CPU usage to 4 cores; adjust this to roughly 50% of the host's cores.
- Caps memory usage at 24 GB; adjust this to roughly 25% of total host memory.
- Caps combined memory plus swap at 24 GB; this should match mem_limit so the container cannot fall back to swap.
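The guideline above (about 50% of cores, about 25% of RAM, with 500 MiB of headroom for GOMEMLIMIT) can be derived from the host's resources. This is a sketch; the integer rounding is an assumption:

```shell
# Sketch: suggest IPFS container limits from a core count and total RAM in
# MiB, following the ~50% CPU / ~25% RAM rule of thumb above.
suggest_limits() {
  cores=$1
  mem_mib=$2
  echo "GOMAXPROCS=$(( cores / 2 ))"
  echo "GOMEMLIMIT=$(( mem_mib / 4 - 500 ))MiB"
  echo "mem_limit=$(( mem_mib / 4 / 1024 ))G"
}

# Usage on a Linux host:
#   suggest_limits "$(nproc)" "$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1024 ))"
```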
Manual IPFS Garbage Collection
If your IPFS container is consuming excessive disk space, or you need to manually trigger garbage collection outside of the scheduled period, you can run garbage collection manually using Docker commands.
When to Run Manual Garbage Collection
Consider running manual garbage collection when:
- Disk space is critically low
- IPFS data directory is growing unexpectedly large
- You notice performance degradation
- After bulk operations that created many temporary objects
Running Garbage Collection
To find your IPFS container name:
# List running containers to find the IPFS container
docker ps | grep ipfs
# Alternatively, if using docker-compose
docker-compose ps
To manually trigger IPFS garbage collection:
# Check current IPFS repo size before cleanup
docker exec -it <ipfs-container-name> ipfs repo stat
# Run garbage collection manually
docker exec -it <ipfs-container-name> ipfs repo gc
# Check repo size after cleanup to see space freed
docker exec -it <ipfs-container-name> ipfs repo stat
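To see at a glance how much the collection freed, you can diff the two RepoSize readings. Extracting the field with awk assumes the human-readable `ipfs repo stat` output shown by the commands above:

```shell
# Sketch: bytes freed between two RepoSize readings.
freed_bytes() {
  echo $(( $1 - $2 ))
}

# Usage:
#   before=$(docker exec <ipfs-container-name> ipfs repo stat | awk '/RepoSize/ {print $2}')
#   docker exec <ipfs-container-name> ipfs repo gc
#   after=$(docker exec <ipfs-container-name> ipfs repo stat | awk '/RepoSize/ {print $2}')
#   echo "Freed $(freed_bytes "$before" "$after") bytes"
```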
Aggressive Garbage Collection
For more visibility during cleanup, you can run garbage collection with additional flags:
# Stream individual errors as they occur instead of aggregating them at the end
docker exec -it <ipfs-container-name> ipfs repo gc --stream-errors
# Verify integrity after aggressive cleanup
docker exec -it <ipfs-container-name> ipfs repo verify
Monitoring Garbage Collection
To monitor the garbage collection process:
# Check IPFS logs during garbage collection
docker logs -f <ipfs-container-name>
# Check system resources during the process
docker stats <ipfs-container-name>
Note: Garbage collection can be resource-intensive and may temporarily impact node performance. Consider running it during low-traffic periods.
Node Synchronization Issues
Checking Sync Status
Check if your node is properly synchronized:
# Check the node's sync status
curl -s http://localhost:4024/api/v0/info/public.json | jq '.sync_status'
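If you only want the status value for scripting (for example, in a cron health check), you can extract the field with jq; the sync_status field name is taken from the command above:

```shell
# Sketch: pull the sync_status field out of the node's public info JSON.
get_sync_status() {
  jq -r '.sync_status'
}

# Usage against a running node:
#   curl -s http://localhost:4024/api/v0/info/public.json | get_sync_status
```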
Resyncing a Node
If your node is out of sync or having persistent issues, you can perform a complete resynchronization:
Stop the node services:
docker-compose down
Remove the database, IPFS, and PyAleph data volumes (with the containers stopped, prune removes the now-unused volumes):
docker volume prune -f
Restart the node:
docker-compose up -d
Monitor the synchronization process:
docker-compose logs -f
Network Connectivity Issues
Checking Port Accessibility
Verify that your node's ports are accessible from the internet:
# From another machine, check if the ports are open
nc -zv YOUR_NODE_IP 4001
nc -zv YOUR_NODE_IP 4024
nc -zv YOUR_NODE_IP 4025
Firewall Configuration
If ports are not accessible, check your firewall settings:
# For UFW (Ubuntu/Debian)
sudo ufw status
sudo ufw allow 4001/tcp
sudo ufw allow 4001/udp
sudo ufw allow 4024/tcp
sudo ufw allow 4025/tcp
Performance Monitoring
Container Resource Usage
Monitor your containers' resource usage:
# View resource usage statistics
docker stats
# Install and use ctop for a better interface
docker run --rm -ti --name=ctop --volume /var/run/docker.sock:/var/run/docker.sock quay.io/vektorlab/ctop
Checking Node Metrics
Access the node's metrics endpoint to check performance:
# Get general metrics
curl -s http://localhost:4024/metrics
# Filter for specific metrics
curl -s http://localhost:4024/metrics | grep messages
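To read a single value out of the Prometheus-style text output, match on the metric name. In the usage line below, pyaleph_messages_total is an illustrative name, not a guaranteed metric:

```shell
# Sketch: print the value of one metric from Prometheus text-format input.
metric_value() {
  awk -v m="$1" '$1 == m {print $2}'
}

# Usage:
#   curl -s http://localhost:4024/metrics | metric_value pyaleph_messages_total
```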
Common Error Messages
"Cannot connect to the Docker daemon"
This usually means Docker is not running or you don't have permission to access it.
Solution:
# Start Docker service
sudo systemctl start docker
# Check if your user is in the docker group
groups $(whoami)
# Add user to docker group if needed
sudo usermod -a -G docker $(whoami)
"No space left on device"
This indicates your disk is full, often due to Docker using too much space.
Solution:
# Check available disk space
df -h
# Clean up Docker resources
docker system prune -a -f
"Could not connect to Ethereum node"
Your node can't connect to the configured Ethereum API endpoint.
Solution:
- Check your internet connection
- Verify the API URL in your config.yml
- Try an alternative Ethereum API provider
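Beyond plain reachability, you can verify the endpoint actually speaks JSON-RPC with the standard eth_blockNumber call; a healthy endpoint answers with a hex-encoded block number. A sketch:

```shell
# Sketch: build the standard eth_blockNumber JSON-RPC request body.
eth_blocknumber_payload() {
  printf '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
}

# Usage (replace the URL with your configured Ethereum API endpoint):
#   curl -s -X POST -H 'Content-Type: application/json' \
#     --data "$(eth_blocknumber_payload)" <your-ethereum-api-url>
```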
Seeking Further Help
If you're still experiencing issues after trying these troubleshooting steps:
- Visit the Aleph.im Community on Discord
- Review the Node Monitoring documentation for additional diagnostic tools