Gino Eising
Nerd by Nature
Sep 11, 2025 · 5 min read

Zero to observability in a Java app: OpenTelemetry agent, Prometheus, and Grafana Tempo


September 2025 — adding real observability to a Java service without touching a line of application code

Logs tell you something happened. Metrics tell you how often. Traces tell you exactly what happened, in what order, for how long, across which services.

Most Java applications in the wild have the first. Some have the second. Almost none have all three wired together properly. This post is about setting up the full stack from scratch — OTel Java agent, Prometheus, Grafana Tempo — using a reference Spring Boot application as the subject.

The interesting part: you don’t have to modify the application code at all.


What we’re instrumenting

The demo application is a Spring Boot REST API — a deliberately simple “task manager” (subtitled “USELESS task manager” in the UI, in the spirit of honest naming). It has a load generator that hits the endpoints continuously, and a PostgreSQL backend.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  Load Generator │────▶│  Spring Boot    │────▶│    PostgreSQL   │
│                 │     │  Application    │     │                 │
└────────┬────────┘     └────────┬────────┘     └─────────────────┘
         │                       │
         └──────────────────────▶│
                                 ↓
                    ┌─────────────────────┐
                    │ OpenTelemetry       │
                    │ Collector           │
                    └───┬───────┬─────────┘
                        │       │
                        ↓       ↓
               Prometheus    Grafana Tempo
                    │              │
                    └──────┬───────┘
                           ↓
                        Grafana

The OTel Collector is the central hub: it receives spans and metrics from the instrumented application, processes them, and exports to the right backends.


The agent: zero-code-change instrumentation

The OpenTelemetry Java agent is a JAR that attaches to the JVM at startup via -javaagent. It instruments popular frameworks (Spring Boot, JDBC, HTTP clients) automatically, emitting spans for every incoming request, outgoing database query, and HTTP call.

FROM eclipse-temurin:17-jre-alpine

COPY app.jar /app/app.jar
COPY opentelemetry-javaagent.jar /app/otel-agent.jar

ENV JAVA_OPTS="-javaagent:/app/otel-agent.jar"
ENV OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4317"
ENV OTEL_SERVICE_NAME="task-manager"
ENV OTEL_METRICS_EXPORTER="otlp"
ENV OTEL_TRACES_EXPORTER="otlp"

CMD java $JAVA_OPTS -jar /app/app.jar

A handful of environment variables and a -javaagent flag. No changes to pom.xml, no Spring dependencies, no @Trace annotations. The agent does the rest.

What you get automatically:

  • A span for every HTTP request (GET /api/tasks, POST /api/tasks)
  • Child spans for every JDBC query (SELECT * FROM tasks WHERE id=?)
  • HTTP client spans if the app makes outbound calls
  • JVM metrics: heap usage, GC pause time, thread count
  • Spring actuator metrics if the endpoint is exposed

The Collector configuration

The OTel Collector is what keeps the concerns separated. The application sends everything to one place (the Collector); the Collector decides what goes where.

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
  spanmetrics:
    metrics_exporter: prometheus

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, spanmetrics]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]

The spanmetrics processor is the interesting piece. It derives Prometheus metrics directly from span data: traces_spanmetrics_latency_bucket, traces_spanmetrics_calls_total. This gives you latency histograms (p50, p95, p99) for every endpoint without any additional instrumentation.

In Grafana, that means you can build a latency heatmap over all your endpoints and alert on it, without writing a single custom metric.
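
To make that concrete, here is roughly what a Prometheus rule file built on those span metrics could look like. Treat it as a sketch: the file name, the 250ms threshold, and the operation / service_name labels are assumptions (the labels depend on which dimensions your spanmetrics setup emits).

# prometheus-rules.yaml (hypothetical; adjust labels to your spanmetrics dimensions)
groups:
  - name: span-latency
    rules:
      # p95 latency per operation, derived entirely from span metrics
      - record: operation:traces_spanmetrics_latency:p95
        expr: |
          histogram_quantile(0.95,
            sum by (le, operation) (
              rate(traces_spanmetrics_latency_bucket{service_name="task-manager"}[5m])
            )
          )
      # alert when any endpoint's p95 exceeds 250ms (buckets are in milliseconds by default)
      - alert: EndpointP95LatencyHigh
        expr: operation:traces_spanmetrics_latency:p95 > 250
        for: 5m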


Tempo instead of Jaeger

I started with Jaeger. It works fine. But Grafana Tempo integrates more cleanly with Grafana Cloud and self-hosted Grafana: there is no separate Jaeger UI to manage, traces show up inline next to the metrics dashboards, and TraceQL (Tempo’s query language) is more expressive than Jaeger’s tag-based search for finding specific traces.

The switch is a tiny config change in the Collector: the otlp/jaeger exporter becomes otlp/tempo with the new endpoint, and the traces pipeline is updated to match. Tempo speaks the same OTLP protocol; nothing about the application or agent changes.
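
For comparison, the Jaeger version of that exporter block looked roughly like this (the jaeger hostname and port are assumptions, not the repo’s original values):

# before: exporting to Jaeger over the same OTLP protocol
exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true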

# Tempo minimal config
server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        grpc:

storage:
  trace:
    backend: local
    local:
      path: /tmp/tempo/blocks

For production, swap local for S3/GCS/Azure. For a dev environment or learning setup, local storage is fine.
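
For example, the S3 variant is roughly this shape (bucket, region, and endpoint are placeholders, and Tempo’s storage block has more options for auth and tuning than shown here):

# storage block for S3 instead of local (sketch; values are placeholders)
storage:
  trace:
    backend: s3
    s3:
      bucket: tempo-traces
      region: eu-west-1
      endpoint: s3.eu-west-1.amazonaws.com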


The Grafana dashboard setup

With metrics in Prometheus and traces in Tempo, the Grafana setup is:

  1. Add Prometheus datasource: http://prometheus:9090
  2. Add Tempo datasource: http://tempo:3200
  3. In the Tempo datasource config, link to Prometheus as the metrics source; this enables the “Go to metrics” button from a trace view (a provisioning sketch follows this list)
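
Provisioned as code, that linking looks roughly like the following. This is a sketch rather than the repo’s exact file: the path, datasource names, and UIDs are placeholders, and in practice you would also add queries under tracesToMetrics for the specific links you want.

# grafana/provisioning/datasources/datasources.yaml (hypothetical path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    uid: prometheus
    url: http://prometheus:9090
  - name: Tempo
    type: tempo
    access: proxy
    uid: tempo
    url: http://tempo:3200
    jsonData:
      tracesToMetrics:
        datasourceUid: prometheus   # the Prometheus uid above
      serviceMap:
        datasourceUid: prometheus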

Once linked, the workflow is:

  • See a latency spike in the Prometheus metric → click the spike → Grafana finds traces from that time window → click a trace → see the exact DB query that was slow

That chain — metric anomaly → trace → root cause — is what proper observability means. Logs can tell you the query was slow after the fact. Traces show you which specific request was slow, what called it, and what it called downstream.


What I found during the demo

Running the load generator against the instrumented app produced an immediately visible pattern in the span metrics: POST /api/tasks was consistently at p99 > 200ms while GET /api/tasks was under 20ms. Obvious in hindsight — writes go through an ORM, reads are a simple SELECT — but without the trace data you’d be guessing.

The trace view showed the write path:

  1. Spring MVC dispatch: 2ms
  2. Hibernate session open: 1ms
  3. JDBC INSERT INTO tasks: 190ms
  4. Hibernate session close: 1ms

The 190ms is the actual database write. The rest is framework overhead under 5ms. If that were a production performance problem, you’d know exactly where to look: the write path, and specifically the DB insert. Not the ORM. Not the HTTP layer. Not something to rewrite in a faster language.


Reproducing this yourself

The full stack is in docker-compose.yml in the repository. One command:

docker compose up -d

Services:

  • app — Spring Boot task manager (port 8080)
  • load-generator — hits the app every 100ms
  • otel-collector — receives, processes, exports (ports 4317/4318/8889)
  • prometheus — scrapes Collector (port 9090)
  • tempo — receives traces (port 3200)
  • grafana — dashboards (port 3000)

Grafana at http://localhost:3000, default credentials admin/admin. The latency dashboard and trace explorer are pre-provisioned.
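
If you want to see the shape of the compose file before cloning, it is roughly this (a trimmed sketch: postgres, prometheus, tempo, and the load generator are omitted, and the image tags are up to you):

# docker-compose.yml (trimmed sketch, not the repository's actual file)
services:
  app:
    build: .                      # the Dockerfile above, agent baked in
    ports: ["8080:8080"]

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports: ["4317:4317", "4318:4318", "8889:8889"]

  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning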


The honest caveat

Collector configuration has a learning curve. The YAML is verbose and the error messages when you misconfigure a pipeline are not always clear. Start with the minimal config above and add complexity incrementally.

The agent auto-instrumentation covers the most common cases but not everything. If you have custom business logic that spans multiple services — a pricing calculation that calls three microservices, for example — you’ll need to add manual spans for those. The OTel Java SDK makes this straightforward, but it does require code changes.

For a single service, the agent is sufficient. For a distributed system, plan for a mix of auto-instrumentation and manual spans at the boundaries that matter most.