Observability

Overview

The platform implements a multi-layered observability strategy:

  • OpenTelemetry for distributed tracing and structured log export.
  • Winston for application-level logging with configurable transports.
  • MongoDB for persistent API request logging.
  • SigNoz as the backend for trace visualization, log aggregation, and dashboards.

Instrumentation

The platform provides two instrumentation setups, each tailored to a different runtime context.

Root Instrumentation (instrumentation.ts)

This is the primary instrumentation configuration, used by the message-queue worker and other standalone processes.

SDK: Uses the OpenTelemetry NodeSDK, the high-level all-in-one setup that handles provider registration, context propagation, and shutdown coordination.

Trace Export:

  • Protocol: OTLP over HTTP.
  • Endpoint: OTEL_EXPORTER_OTLP_ENDPOINT environment variable, with /v1/traces appended.

Log Export:

  • Protocol: OTLP over HTTP.
  • Endpoint: OTEL_EXPORTER_OTLP_LOGS_ENDPOINT environment variable.
  • Processor: BatchLogRecordProcessor, which buffers log records and exports them in batches for efficiency.
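
The batching idea can be pictured with a small, dependency-free sketch. This is not the BatchLogRecordProcessor implementation: SimpleBatchBuffer, maxBatchSize, and the exportBatch callback are hypothetical names used only to show records being buffered and exported in groups. The real processor also flushes on a timer and bounds its queue, which this sketch leaves out.

```typescript
// Conceptual stand-in for a batching log processor (all names are hypothetical).
type LogRecordLike = { severity: string; body: string };

class SimpleBatchBuffer {
  private pending: LogRecordLike[] = [];
  private readonly maxBatchSize: number;
  private readonly exportBatch: (batch: LogRecordLike[]) => void;

  constructor(maxBatchSize: number, exportBatch: (batch: LogRecordLike[]) => void) {
    this.maxBatchSize = maxBatchSize;
    this.exportBatch = exportBatch;
  }

  // Buffer a record and export once the batch is full.
  add(record: LogRecordLike): void {
    this.pending.push(record);
    if (this.pending.length >= this.maxBatchSize) {
      this.flush();
    }
  }

  // Export whatever is currently buffered, for example during shutdown.
  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.exportBatch(batch);
  }
}
```

Exporting in groups trades a small delivery delay for far fewer network round trips than sending each record individually.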

Resource:

  • Service name: OTEL_SERVICE_NAME environment variable, defaulting to zeswa-platform.
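
The way these settings are derived from the environment could look roughly like the sketch below. The helper name resolveTelemetrySettings and its return shape are hypothetical, and stripping a trailing slash before appending /v1/traces is an assumption of this sketch rather than documented behaviour. Fallback handling for an unset endpoint is intentionally left out.

```typescript
// Hypothetical helper: derives exporter settings from environment variables.
type EnvLike = Record<string, string | undefined>;

interface TelemetrySettings {
  serviceName: string;
  traceUrl: string | undefined;
  logsUrl: string | undefined;
}

function resolveTelemetrySettings(env: EnvLike): TelemetrySettings {
  const base = env["OTEL_EXPORTER_OTLP_ENDPOINT"];
  return {
    // Falls back to the documented default service name.
    serviceName: env["OTEL_SERVICE_NAME"] ?? "zeswa-platform",
    // The trace URL is the base endpoint with /v1/traces appended.
    // Removing trailing slashes first is an assumption of this sketch.
    traceUrl: base === undefined ? undefined : `${base.replace(/\/+$/, "")}/v1/traces`,
    // The logs endpoint is used exactly as configured.
    logsUrl: env["OTEL_EXPORTER_OTLP_LOGS_ENDPOINT"],
  };
}
```

In a Node process this would typically be called as resolveTelemetrySettings(process.env).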

Auto-Instrumentations:

  • All available instrumentations are enabled by default.
  • The fs (file system) instrumentation is explicitly disabled to avoid excessive span generation from routine file operations.

Winston Integration:

  • Winston instrumentation is enabled to inject resource.service.name into all log records. This ensures that logs exported via OTLP carry the correct service identity for correlation in the backend.
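
The two overrides described above can be expressed as a small configuration map. This is a sketch under the assumption that the setup passes per-package options to getNodeAutoInstrumentations from @opentelemetry/auto-instrumentations-node, which accepts a map of this shape; the source does not show the actual call, and the constant name autoInstrumentationOverrides is hypothetical.

```typescript
// Per-instrumentation overrides, keyed by instrumentation package name.
// Instrumentations not listed here keep their default settings.
const autoInstrumentationOverrides = {
  // Disabled: routine file operations would otherwise generate excessive spans.
  "@opentelemetry/instrumentation-fs": { enabled: false },
  // Enabled so that exported log records carry the service identity.
  "@opentelemetry/instrumentation-winston": { enabled: true },
};
```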

Shutdown:

  • Registers a SIGTERM handler that calls sdk.shutdown() for graceful cleanup of exporters and processors.
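
A graceful-shutdown hook of this kind might be wired up as sketched below. The interfaces ShutdownCapable and SignalSource and the function registerGracefulShutdown are hypothetical stand-ins so the example runs without the SDK; in the real setup the two arguments would presumably be the NodeSDK instance and Node's global process object.

```typescript
// Minimal shapes so the sketch stands alone (hypothetical names).
interface ShutdownCapable {
  shutdown(): Promise<void>;
}
interface SignalSource {
  on(signal: string, handler: () => void): unknown;
}

function registerGracefulShutdown(sdk: ShutdownCapable, signals: SignalSource): void {
  signals.on("SIGTERM", () => {
    // Flush exporters and processors before the process exits.
    sdk
      .shutdown()
      .then(() => console.log("Telemetry shut down cleanly"))
      .catch((err: unknown) => console.error("Telemetry shutdown failed", err));
  });
}
```

Handling the rejected promise matters here: an exporter that cannot reach the backend during shutdown should be reported rather than left as an unhandled rejection.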

Per-Service Instrumentation (libs/server/src/tracer.ts)

This instrumentation is used by individual API services. It is enabled only when the OPEN_TELEMENTRY environment variable is set to true.
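
The enable check can be sketched as a single predicate. The function name isTracingEnabled is hypothetical, and treating only the exact lowercase string "true" as enabled is an assumption; the variable name is spelled as it appears in this document.

```typescript
// Tracing is on only when the flag is exactly the string "true" (assumed comparison).
function isTracingEnabled(env: Record<string, string | undefined>): boolean {
  return env["OPEN_TELEMENTRY"] === "true";
}
```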

SDK: Uses the lower-level NodeTracerProvider for fine-grained control over tracer configuration.

Trace Export:

  • Protocol: OTLP with protobuf-encoded payloads.
  • Endpoint: Configured via standard OTLP environment variables.

Instrumentations:

  • Only the HTTP and Express instrumentations are registered, keeping overhead minimal for API services.

Sampling:

  • A custom filterSampler drops spans for the /health endpoint, so that health check polling does not add noise to the trace data.
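
A filtering sampler of this kind can be sketched without the SDK, as below. The types SpanAttributes, SampleDecision, and SamplerLike are simplified local stand-ins for OpenTelemetry's sampling types (the real Sampler interface takes more arguments), and reading the request path from the "http.target" attribute is an assumption; the actual filterSampler may inspect a different attribute or the span name.

```typescript
// Simplified stand-ins for OpenTelemetry's sampling types (not the real API).
type SpanAttributes = Record<string, unknown>;
type SampleDecision = "NOT_RECORD" | "RECORD_AND_SAMPLED";
interface SamplerLike {
  shouldSample(spanName: string, attributes: SpanAttributes): SampleDecision;
}

// Wraps a parent sampler and drops every span the predicate rejects.
function filterSampler(
  keep: (spanName: string, attributes: SpanAttributes) => boolean,
  parent: SamplerLike,
): SamplerLike {
  return {
    shouldSample(spanName, attributes) {
      if (!keep(spanName, attributes)) return "NOT_RECORD";
      return parent.shouldSample(spanName, attributes);
    },
  };
}

// Assumption: the request path is exposed under the "http.target" attribute.
function isNotHealthCheck(_spanName: string, attributes: SpanAttributes): boolean {
  return attributes["http.target"] !== "/health";
}

const alwaysOnSampler: SamplerLike = { shouldSample: () => "RECORD_AND_SAMPLED" };
const healthFilteringSampler = filterSampler(isNotHealthCheck, alwaysOnSampler);
```

Delegating to a parent sampler keeps the filter composable: the health check rule only removes spans, and every other sampling decision stays with the wrapped sampler.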

Return Value:

  • The setup function returns a named tracer instance that services use to create custom spans.
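
A service might wrap its custom spans in a small helper such as the one sketched here. SpanLike and TracerLike are minimal local interfaces covering only the startSpan, recordException, and end calls used below, and withSpan is a hypothetical helper, not part of the platform's tracer module.

```typescript
// Minimal shapes covering only the tracer calls this sketch needs.
interface SpanLike {
  recordException(error: Error): void;
  end(): void;
}
interface TracerLike {
  startSpan(name: string): SpanLike;
}

// Runs `work` inside a span and guarantees the span is ended afterwards.
async function withSpan<T>(
  tracer: TracerLike,
  spanName: string,
  work: () => Promise<T>,
): Promise<T> {
  const span = tracer.startSpan(spanName);
  try {
    return await work();
  } catch (err) {
    // Attach the failure to the span, then let the caller handle it.
    if (err instanceof Error) span.recordException(err);
    throw err;
  } finally {
    span.end();
  }
}
```

Ending the span in a finally block avoids leaking open spans when the wrapped operation throws.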

Application Logging

Winston (libs/utility/src/Logger.ts)

The application logger is built on Winston and supports multiple transport targets:

Transport   Description
---------   -----------
Console     Writes formatted log output to stdout/stderr. Always active.
File        Writes log output to rotating log files on disk.
OTLP        Exports log records to the OpenTelemetry backend when instrumentation is enabled.

Log Levels

The logger supports the standard severity levels:

Level   Usage
-----   -----
debug   Detailed diagnostic information for development and troubleshooting.
info    General operational events such as service startup, request handling, and job completion.
warn    Unexpected conditions that do not prevent operation but may indicate a problem.
error   Failures that require attention, such as unhandled exceptions or external service errors.

The active log level threshold is controlled by the LOG_LEVEL environment variable. Only messages at or above the configured level are emitted.
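
The threshold rule can be illustrated with a short sketch. LEVEL_ORDER and shouldEmit are hypothetical names; Winston applies an equivalent comparison internally using its own numeric priorities, so this is only a model of the behaviour, not the logger's code.

```typescript
// Severity order from least to most severe, for the four levels listed above.
const LEVEL_ORDER = ["debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LEVEL_ORDER)[number];

// A message is emitted when its level is at or above the configured threshold.
function shouldEmit(messageLevel: LogLevel, threshold: LogLevel): boolean {
  return LEVEL_ORDER.indexOf(messageLevel) >= LEVEL_ORDER.indexOf(threshold);
}
```

With LOG_LEVEL set to warn, for example, warn and error messages pass while debug and info messages are suppressed.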

API Request Logging

BasicMiddleware (services/common/src/BasicMiddleware.ts)

Every inbound API request is logged to MongoDB for auditing and debugging. The BasicMiddleware class captures the following fields:

Field            Description
-----            -----------
rawHeaders       The raw HTTP headers as received.
url              The request URL path.
method           The HTTP method (GET, POST, PUT, DELETE, etc.).
httpVersion      The HTTP protocol version.
remoteAddress    The client’s IP address.
remoteFamily     The IP address family (IPv4 or IPv6).
body             The parsed request body.
params           URL path parameters.
query            URL query string parameters.
startTime        Timestamp when the request was received.
endTime          Timestamp when the response was sent.
processingTime   The elapsed time between startTime and endTime.

Request logs are persisted via the APILogModel MongoDB model. All service middleware classes extend BasicMiddleware, so API request logging is automatically applied to every route across all services.
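
The record assembled for each request might be built along the lines of the sketch below. RequestSnapshot, ApiLogEntry, and buildApiLogEntry are hypothetical names, the field types are guesses, and expressing processingTime in milliseconds is an assumption; the real APILogModel schema may differ.

```typescript
// Hypothetical shapes; the real request object and APILogModel schema may differ.
interface RequestSnapshot {
  rawHeaders: string[];
  url: string;
  method: string;
  httpVersion: string;
  remoteAddress?: string;
  remoteFamily?: string;
  body: unknown;
  params: Record<string, string>;
  query: Record<string, unknown>;
}

interface ApiLogEntry extends RequestSnapshot {
  startTime: Date;
  endTime: Date;
  processingTime: number; // milliseconds (unit is an assumption)
}

function buildApiLogEntry(req: RequestSnapshot, startTime: Date, endTime: Date): ApiLogEntry {
  return {
    rawHeaders: req.rawHeaders,
    url: req.url,
    method: req.method,
    httpVersion: req.httpVersion,
    remoteAddress: req.remoteAddress,
    remoteFamily: req.remoteFamily,
    body: req.body,
    params: req.params,
    query: req.query,
    startTime,
    endTime,
    // Elapsed time between receiving the request and sending the response.
    processingTime: endTime.getTime() - startTime.getTime(),
  };
}
```

Copying fields explicitly, rather than spreading the whole request object, keeps the persisted document limited to the fields listed in the table.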

SigNoz Backend

SigNoz serves as the unified observability backend, receiving both traces and logs via OTLP.

Default endpoint: http://145.223.18.69:4318

SigNoz provides the following capabilities:

  • Trace visualization: End-to-end request tracing across services, with flame graphs and span detail views.
  • Log aggregation: Centralized log search and filtering, with correlation to traces via shared context.
  • Dashboards: Customizable dashboards for monitoring service health, latency percentiles, error rates, and throughput.

Environment Variables

Variable                           Purpose                                                 Default
--------                           -------                                                 -------
OPEN_TELEMENTRY                    Enables per-service instrumentation when set to true.
OTEL_SERVICE_NAME                  Service name attached to all telemetry data.            zeswa-platform
OTEL_EXPORTER_OTLP_ENDPOINT        Base URL for the OTLP trace exporter.
OTEL_EXPORTER_OTLP_LOGS_ENDPOINT   URL for the OTLP log exporter.

Code Locations

Component                Path
---------                ----
Root instrumentation     instrumentation.ts
Per-service tracer       libs/server/src/tracer.ts
Application logger       libs/utility/src/Logger.ts
API request middleware   services/common/src/BasicMiddleware.ts