Troubleshooting

This guide covers common issues encountered when running the eTeamups Platform and provides step-by-step resolutions for each scenario.

Service Startup Failures

When one or more services fail to start, begin by checking container status and logs.

Diagnose

Check the status of all containers:

docker compose ps

View logs for the failing service:

docker compose logs <service-name>

Common Causes

Cause | Symptom | Resolution
Missing environment variables | Service exits immediately with a configuration error | Ensure docker.env is complete and contains all required variables. Refer to the Environment Variables documentation.
Port conflicts | Bind error in logs (address already in use) | Identify the conflicting process with lsof -i:<port> and either stop the process or change the port mapping in docker-compose.yml.
Dependency not ready | Connection refused errors to MongoDB or Redis | Ensure MongoDB and Redis containers are healthy before starting application services. Run docker compose ps and verify both show a healthy status.
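
The missing-variable check in the first row can be done mechanically. The following is a minimal sketch, not part of the platform scripts: the function name missing_env_vars is hypothetical, it assumes docker.env uses plain NAME=value lines, and the variable names in the example are only those mentioned elsewhere in this guide, not the full required list.

```shell
# Print every variable name that is absent from, or empty in, an env file.
# Assumption: the file uses plain NAME=value lines (no "export" prefix).
missing_env_vars() {
  local env_file="$1"
  shift
  local name
  for name in "$@"; do
    # A variable counts as set only if NAME= is followed by at least one character.
    if ! grep -Eq "^${name}=.+" "$env_file" 2>/dev/null; then
      echo "$name"
    fi
  done
}

# Example call (see the Environment Variables documentation for the full list):
# missing_env_vars docker.env MONGO_INITDB_ROOT_USERNAME MONGO_INITDB_ROOT_PASSWORD REDIS_PASSWORD
```

An empty result means every listed variable has a value; a missing file reports all names as missing.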

Resolution Steps

  1. Verify that docker.env exists and contains all required values.

  2. Check for port conflicts:

    lsof -i:27018   # MongoDB
    lsof -i:6379    # Redis
    lsof -i:9000    # Auth Service
    lsof -i:9100    # Profile Service
    lsof -i:9107    # Organisation Service
    lsof -i:9102    # Media Service
    
  3. Ensure infrastructure services are healthy before starting application services:

    docker compose up -d mongodb redis
    docker compose ps   # Wait until both show "healthy"
    docker compose up -d
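
Steps 2 and 3 above can be scripted. This is a sketch under stated assumptions: the function names check_ports and wait_healthy are hypothetical, the container names eteamups-mongodb and eteamups-redis are the ones used elsewhere in this guide, and both containers are assumed to define a Docker health check.

```shell
# Report whether each given port currently has a process or connection on it.
# lsof may need sudo to see processes owned by other users.
check_ports() {
  local port
  for port in "$@"; do
    if lsof -i:"$port" >/dev/null 2>&1; then
      echo "port $port: in use"
    else
      echo "port $port: free"
    fi
  done
}

# Poll a container's health status until it is "healthy" or the timeout expires.
wait_healthy() {
  local container="$1"
  local timeout="${2:-60}"   # seconds
  local elapsed=0
  local health
  while [ "$elapsed" -lt "$timeout" ]; do
    health="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$health" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "$container did not become healthy within ${timeout}s" >&2
  return 1
}

# Example usage:
# check_ports 27018 6379 9000 9100 9107 9102
# docker compose up -d mongodb redis
# wait_healthy eteamups-mongodb && wait_healthy eteamups-redis && docker compose up -d
```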
    

Database Connection Issues

Testing MongoDB Connectivity

Connect to MongoDB directly from the host, substituting the root credentials from docker.env for admin:password:

mongosh "mongodb://admin:password@localhost:27018/eteamups?authSource=admin"

Check the MongoDB container logs for errors:

docker compose logs mongodb

Verify MongoDB is responding to commands:

docker exec eteamups-mongodb mongosh --eval "db.adminCommand('ping')"
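
For use in scripts, the ping above can be turned into a pass/fail check. A minimal sketch: the function name mongo_ping is hypothetical, and it assumes mongosh prints the bare value 1 for the ok field when run with --quiet.

```shell
# Succeed only if the MongoDB container answers the ping command with ok: 1.
mongo_ping() {
  local container="${1:-eteamups-mongodb}"
  local reply
  reply="$(docker exec "$container" mongosh --quiet --eval "db.adminCommand('ping').ok" 2>/dev/null)"
  [ "$reply" = "1" ]
}

# Example usage:
# mongo_ping && echo "MongoDB is responding" || echo "MongoDB is not responding"
```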

Common MongoDB Issues

Issue | Symptom | Resolution
Authentication failure | MongoServerError: Authentication failed | Verify that MONGO_INITDB_ROOT_USERNAME and MONGO_INITDB_ROOT_PASSWORD in docker.env match the credentials used in connection strings. If you changed credentials after initial setup, you may need to remove the MongoDB volume and reinitialize: docker compose down -v (warning: this deletes the data in every volume declared by the Compose project, not only MongoDB's).
Connection refused | ECONNREFUSED 127.0.0.1:27018 | Confirm the MongoDB container is running with docker compose ps. Verify that port 27018 is exposed in docker-compose.yml. Check that no firewall rules are blocking the port.
Database not initialized | Collections missing or empty | Run the application seed scripts, or check docker compose logs mongodb to verify that the init scripts in the MongoDB container executed successfully.

Redis Connection Issues

Testing Redis Connectivity

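Test the Redis connection from within the container. The host shell expands $REDIS_PASSWORD before the command runs, so set that variable in your shell first (using the value from docker.env):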

docker exec eteamups-redis redis-cli -a "$REDIS_PASSWORD" ping

A successful connection returns PONG.

Check Redis memory usage:

docker exec eteamups-redis redis-cli -a "$REDIS_PASSWORD" info memory
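
The connectivity test above can be wrapped so that the password is read from docker.env instead of the host environment. This is a sketch only: env_value and redis_ping are hypothetical names, and it assumes docker.env holds an unquoted REDIS_PASSWORD=value line. The --no-auth-warning option suppresses the warning redis-cli prints when -a is used.

```shell
# Read the last NAME=value assignment for a variable from an env file.
env_value() {
  grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-
}

# Succeed only if Redis replies PONG to an authenticated ping.
redis_ping() {
  local env_file="${1:-docker.env}"
  local password reply
  password="$(env_value REDIS_PASSWORD "$env_file")"
  reply="$(docker exec eteamups-redis redis-cli --no-auth-warning -a "$password" ping 2>/dev/null)"
  [ "$reply" = "PONG" ]
}

# Example usage:
# redis_ping docker.env && echo "Redis is reachable" || echo "Redis check failed"
```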

Common Redis Issues

Issue | Symptom | Resolution
Connection refused | ECONNREFUSED in application logs | Redis is not running or is bound to the wrong port. Check with docker compose ps and verify the Redis container is healthy.
Authentication error | NOAUTH Authentication required or ERR invalid password | The password configured in docker.env does not match the password the Redis container was initialized with. Verify REDIS_PASSWORD is consistent across all service configurations.
Message queue not processing | Jobs stuck in queue, workers idle | Check the message-queue worker logs with docker compose logs message-queue. Verify the Redis connection string used by the worker matches the running Redis instance. Restart the worker if needed: docker compose restart message-queue.

Docker and Container Issues

Image Pull Failures

If Docker cannot pull images from GitHub Container Registry (GHCR), re-authenticate:

echo "$GITHUB_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin

Replace USERNAME with your GitHub username, and ensure the GITHUB_TOKEN has the read:packages scope.

Out of Memory

Check current resource consumption:

docker stats

If the host is running low on memory, add swap space:

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

To make the swap permanent, add an entry to /etc/fstab:

/swapfile swap swap defaults 0 0

Container Restart Loop

When a container continuously restarts:

  1. Check the logs for the root cause:

    docker compose logs <service-name> --tail 50
    
  2. Verify all required environment variables are set in docker.env.

  3. Check resource limits – the container may be getting killed by the OOM killer. Review docker stats output.

  4. Inspect the container’s exit code:

    docker inspect <container-name> --format='{{.State.ExitCode}}'
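
The exit code from step 4 usually narrows down the cause. The helper below is an illustrative sketch (explain_exit_code is a hypothetical name); it relies on the shell convention that codes above 128 mean the process was ended by a signal numbered code minus 128.

```shell
# Map a container exit code to a likely cause.
explain_exit_code() {
  local code="$1"
  case "$code" in
    0)   echo "exited normally" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "killed by SIGKILL (often the OOM killer or docker kill)" ;;
    143) echo "terminated by SIGTERM (for example docker stop)" ;;
    *)
      if [ "$code" -gt 128 ]; then
        echo "terminated by signal $((code - 128))"
      else
        echo "application error (check the service logs)"
      fi
      ;;
  esac
}

# Example usage (<container-name> is a placeholder):
# explain_exit_code "$(docker inspect <container-name> --format='{{.State.ExitCode}}')"
# docker inspect <container-name> --format='{{.State.OOMKilled}}'   # true if the OOM killer ended it
```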
    

Volume Permission Issues

If a service reports permission denied errors when accessing mounted volumes:

  1. Check file ownership inside the container:

    docker exec <container-name> ls -la /path/to/volume
    
  2. Adjust ownership on the host if needed, matching the UID/GID used inside the container.
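
Comparing host ownership with the container user can be sketched as follows. The function name owner_matches is hypothetical, the paths are placeholders, and the stat -c syntax assumes GNU coreutils as found on most Linux hosts.

```shell
# Succeed if a host path is owned by the given numeric UID and GID.
owner_matches() {
  local target="$1"
  local want_uid="$2"
  local want_gid="$3"
  local actual
  actual="$(stat -c '%u:%g' "$target")" || return 2
  if [ "$actual" = "${want_uid}:${want_gid}" ]; then
    echo "$target: ownership $actual matches"
  else
    echo "$target: ownership is $actual, container expects ${want_uid}:${want_gid}"
    return 1
  fi
}

# Example usage (placeholders as in the steps above):
# uid="$(docker exec <container-name> id -u)"
# gid="$(docker exec <container-name> id -g)"
# owner_matches /path/on/host "$uid" "$gid" || sudo chown -R "$uid:$gid" /path/on/host
```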

Network Issues

If services cannot communicate with each other:

  1. List Docker networks:

    docker network ls
    
  2. Verify all services are attached to the same network:

    docker network inspect <network-name>
    
  3. Test connectivity between containers:

    docker exec <container-name> wget -qO- http://<target-service>:<port>/health
    

SSL/TLS Certificate Issues

Verifying Certificate Files

Check that the required certificate files exist:

ls -la nginx/ssl/

The directory must contain:

  • fullchain.pem – The full certificate chain (server certificate plus intermediate certificates).
  • privkey.pem – The private key.

Generating Self-Signed Certificates for Testing

For local development and testing environments:

./scripts/generate-ssl.sh

Common SSL Issues

Issue | Symptom | Resolution
Missing certificate files | Nginx fails to start with cannot load certificate | Ensure fullchain.pem and privkey.pem exist in nginx/ssl/. Generate self-signed certificates for testing if needed.
Permission denied | Nginx cannot read certificate files | Certificates must be readable by the Nginx user. Check permissions with ls -la nginx/ssl/ and adjust: chmod 644 nginx/ssl/fullchain.pem and chmod 600 nginx/ssl/privkey.pem.
Certificate expired | Browser shows NET::ERR_CERT_DATE_INVALID | Replace the expired certificate files with renewed ones and restart Nginx: docker compose restart nginx.
Mixed content warnings | Browser console shows mixed content errors | Ensure BASE_URL and CORS_ORIGIN in docker.env use https:// rather than http://. All API endpoints and frontend URLs must use HTTPS in production.
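
Expiry and key mismatches can be detected before Nginx is restarted. The following sketch uses the openssl command-line tool, which is assumed to be installed on the host; the function names are hypothetical.

```shell
# Succeed if the certificate stays valid for at least the given number of days.
cert_valid_for_days() {
  local cert="$1"
  local days="${2:-30}"
  openssl x509 -in "$cert" -noout -checkend $((days * 86400)) >/dev/null
}

# Succeed if the private key belongs to the certificate (identical public keys).
key_matches_cert() {
  local cert="$1"
  local key="$2"
  local cert_pub key_pub
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey)" || return 2
  key_pub="$(openssl pkey -in "$key" -pubout)" || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Example usage:
# openssl x509 -in nginx/ssl/fullchain.pem -noout -enddate   # print the expiry date
# cert_valid_for_days nginx/ssl/fullchain.pem 14 || echo "certificate expires within 14 days"
# key_matches_cert nginx/ssl/fullchain.pem nginx/ssl/privkey.pem || echo "key does not match certificate"
```

When given fullchain.pem, openssl x509 reads only the first certificate in the file, which is normally the server certificate.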

Performance Issues

High Memory Usage

Monitor memory consumption across all containers:

docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

Each application service is limited to 512 MB by default (see resource limits in Monitoring). If a service consistently approaches its limit, investigate for memory leaks in the application logs.
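
To spot services that are close to their limit without reading the whole table, the docker stats output can be filtered. A minimal sketch: the function name containers_over_mem and the 80 percent threshold are illustrative choices, not platform defaults.

```shell
# Read "name percent%" lines on stdin and print containers at or above a threshold.
containers_over_mem() {
  local limit="${1:-80}"   # percent of the container's memory limit
  awk -v limit="$limit" '{
    pct = $2
    sub(/%$/, "", pct)                       # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1, $2
  }'
}

# Example usage:
# docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | containers_over_mem 80
```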

Slow API Responses

Query the apilogs collection in MongoDB to identify slow endpoints:

// Find requests with processing time greater than 1 second
db.apilogs.find({ processingTime: { $gt: 1000 } }).sort({ startTime: -1 }).limit(20).pretty()

// Average processing time by endpoint
db.apilogs.aggregate([
  { $group: { _id: "$url", avgTime: { $avg: "$processingTime" }, count: { $sum: 1 } } },
  { $sort: { avgTime: -1 } },
  { $limit: 20 }
])

MongoDB Slow Queries

Enable the MongoDB profiler to capture slow queries:

// Enable profiling for queries slower than 100ms
db.setProfilingLevel(1, { slowms: 100 })

// Review slow queries
db.system.profile.find().sort({ ts: -1 }).limit(10).pretty()

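Check that indexes are defined for frequently queried fields. Running a slow query with .explain("executionStats") shows whether it uses an index (IXSCAN) or scans the whole collection (COLLSCAN). Profiling adds overhead, so disable it with db.setProfilingLevel(0) when you are finished.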

Redis Memory Full

Check current Redis memory usage:

docker exec eteamups-redis redis-cli -a "$REDIS_PASSWORD" INFO memory

If Redis memory is approaching the maxmemory limit:

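  1. Review the eviction policy (expected to be allkeys-lru; confirm with CONFIG GET maxmemory-policy, because Redis itself falls back to noeviction when no policy is configured).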

  2. Increase maxmemory in the Redis configuration if the host has available resources.

  3. Identify large keys consuming excessive memory:

    docker exec eteamups-redis redis-cli -a "$REDIS_PASSWORD" --bigkeys
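
How close Redis is to its limit can be computed from the INFO memory output. The sketch below reads that output on stdin and relies on the used_memory and maxmemory fields that INFO memory reports; the function name redis_memory_percent is hypothetical.

```shell
# Print used_memory as a percentage of maxmemory, given INFO memory output on stdin.
redis_memory_percent() {
  # INFO output uses CRLF line endings, so strip carriage returns first.
  tr -d '\r' | awk -F: '
    $1 == "used_memory" { used = $2 }
    $1 == "maxmemory"   { max = $2 }
    END {
      if (max == 0) { print "maxmemory is not set (0 = unlimited)"; exit }
      printf "%.1f%% of maxmemory used\n", used * 100 / max
    }'
}

# Example usage:
# docker exec eteamups-redis redis-cli --no-auth-warning -a "$REDIS_PASSWORD" INFO memory | redis_memory_percent
```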
    

Too Many Connections / Rate Limiting

The platform enforces rate limits at two levels:

Layer | Limit | Scope
Application | 100 requests per 15 minutes | Per IP, applied by Express middleware
Nginx | 10 requests per second | Per IP, applied at the reverse proxy level

If legitimate traffic is being rate-limited, review and adjust the limits in the application rate limiter configuration and the Nginx limit_req_zone directives.

Common Error Codes

Status Code | Meaning | Typical Cause | Resolution
401 Unauthorized | Missing or invalid authentication | The request does not include a valid Bearer token in the Authorization header. | Ensure the client sends a valid access token. Obtain a new token via the /auth/login endpoint.
403 Forbidden | Token expired or insufficient permissions | The access token has expired, or the refresh token is invalid. | Use the /auth/refresh-token endpoint to obtain a new access token. If the refresh token is also expired, the user must log in again.
404 Not Found | Resource does not exist | The requested account, profile, or resource was not found, or the route is invalid. | Verify the request URL is correct. Check that the referenced resource ID exists in the database.
429 Too Many Requests | Rate limit exceeded | The client has sent too many requests in the allowed time window. | Wait for the rate limit window to reset before retrying. Implement exponential backoff in the client.
500 Internal Server Error | Unhandled server error | An unexpected error occurred in the application. | Check the service logs for the full stack trace: docker compose logs <service-name> --tail 100.
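
The exponential backoff recommended for 429 responses can be sketched in shell for scripted clients. This is an illustration, not the platform's client code: retry_with_backoff and the BACKOFF_BASE variable are hypothetical names, and the URL in the example is a placeholder.

```shell
# Run a command until it succeeds, doubling the wait after each failure.
# BACKOFF_BASE is a hypothetical knob for the first delay in seconds (default 1).
retry_with_backoff() {
  local max_attempts="$1"
  shift
  local delay="${BACKOFF_BASE:-1}"
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example usage: curl --fail exits non-zero on HTTP 4xx/5xx responses, including 429.
# retry_with_backoff 5 curl --fail --silent --show-error https://<host>/<endpoint>
```

Delays of a few seconds help mainly with the per-second Nginx limit; the application limit resets on a 15-minute window, so a client that has exhausted it needs a correspondingly longer wait.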

Log Analysis for Debugging

Application Logs

Show the last 100 log lines for a specific service and keep following new output (press Ctrl+C to stop):

docker compose logs <service-name> --tail 100 -f

Nginx Logs

Use the built-in log analysis scripts:

# View logs interactively
./scripts/view-logs.sh

# Monitor logs in real time
./scripts/log-monitor.sh -r

# Analyze errors
./scripts/log-monitor.sh -e

# Analyze performance
./scripts/log-monitor.sh -p

API Request Logs in MongoDB

Query the apilogs collection for detailed request-level debugging:

docker exec eteamups-mongodb mongosh eteamups --eval "db.apilogs.find().sort({startTime:-1}).limit(10).pretty()"

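If MongoDB authentication is enabled, the command above also needs credentials, passed to mongosh with -u, -p, and --authenticationDatabase admin. For more targeted queries, connect to MongoDB directly with mongosh: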

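// Find all requests in the last hour, excluding health checks.
// To restrict this to failed requests (5xx), add a filter on the
// status-code field used by your apilogs schema.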
db.apilogs.find({
  startTime: { $gte: new Date(Date.now() - 3600000) },
  url: { $not: /health/ }
}).sort({ startTime: -1 })

// Find requests for a specific endpoint
db.apilogs.find({ url: { $regex: /\/auth\/login/ } }).sort({ startTime: -1 }).limit(10)

Useful Diagnostic Commands

A quick reference of commands for diagnosing platform issues:

# Run the full platform health check
./scripts/health-check.sh

# Check Docker resource usage (one-time snapshot)
docker stats --no-stream

# Check disk space on the host
df -h

# Check running containers and their status
docker compose ps

# Kill processes occupying specific ports
./scripts/kill-ports.sh

# View recent API logs from MongoDB
docker exec eteamups-mongodb mongosh eteamups --eval "db.apilogs.find().sort({startTime:-1}).limit(10).pretty()"

# Inspect a specific container's configuration
docker inspect <container-name>

# Check Docker disk usage
docker system df

# View Docker network configuration
docker network ls