Avoid false errors from expected 4xx responses

If your service shows a high error rate in Nais APM but most "errors" are actually HTTP 404 or 400 responses that are normal for your application, this guide explains why it happens and how to fix it.

The problem

Many web frameworks use exceptions for flow control: you throw a NotFoundException, and a handler such as Ktor's StatusPages or Spring Boot's @ExceptionHandler converts it to an HTTP 404.

The OTel auto-instrumentation agent hooks into your framework and sees that exception propagate before it gets caught. It marks the span as STATUS_CODE_ERROR and records the exception. By the time your framework returns a 404, the error is already recorded.

The result: APM dashboards count these as server errors even though the HTTP response is a normal 4xx.

The OTel spec is clear about what should happen. The HTTP semantic conventions say:

For HTTP status codes in the 4xx range, span status MUST be left unset in case of SpanKind.SERVER.

The agent follows this rule for status codes, but thrown exceptions override it. This is a known limitation in the spec.
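The interaction can be illustrated with a small toy model in plain Kotlin. This is only a sketch: `ToySpan`, `instrumented`, and `handleRequest` are hypothetical stand-ins for the agent and the framework's exception mapping, not the OpenTelemetry or framework APIs.

```kotlin
// Toy model only: these types are hypothetical and are not the OpenTelemetry API.
enum class SpanStatus { UNSET, ERROR }

class ToySpan(
    var status: SpanStatus = SpanStatus.UNSET,
    var recordedException: Throwable? = null,
)

class NotFoundException(message: String) : RuntimeException(message)

// Stands in for the instrumentation wrapped around a route handler.
// It sees any exception that leaves the handler before the framework's
// exception-to-response mapping runs.
fun <T> instrumented(span: ToySpan, handler: () -> T): T =
    try {
        handler()
    } catch (e: Throwable) {
        span.status = SpanStatus.ERROR
        span.recordedException = e
        throw e
    }

// Stands in for StatusPages / @ExceptionHandler: maps the exception to a 404.
fun handleRequest(span: ToySpan, handler: () -> Int): Int {
    val httpStatus = try {
        instrumented(span, handler)
    } catch (e: NotFoundException) {
        404
    }
    // Status-code rule: on a server span, only 5xx sets ERROR; 4xx leaves it unset.
    if (httpStatus >= 500) span.status = SpanStatus.ERROR
    return httpStatus
}

fun main() {
    val throwing = ToySpan()
    val thrownCode = handleRequest(throwing) { throw NotFoundException("person not found") }
    println("$thrownCode ${throwing.status}")

    val direct = ToySpan()
    val directCode = handleRequest(direct) { 404 }
    println("$directCode ${direct.status}")
}
```

Running `main` prints `404 ERROR` for the throwing handler and `404 UNSET` for the direct return: the HTTP response is identical, but only the first span is counted as an error.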

When to fix it

This is worth fixing on routes where:

  • The 4xx response is a normal business outcome (e.g., "person not found" returns 404)
  • The route gets a lot of traffic. A high-volume route with 404s will dominate your error rate
  • You want reliable error-rate alerting from APM

If the route is low-volume or the 4xx is rare, the noise probably doesn't matter.

How to fix it

Return the HTTP response directly instead of throwing an exception. The span status stays UNSET (correct for 4xx), and APM counts only real 5xx errors.

Ktor

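A minimal sketch of a Ktor route that responds directly instead of throwing, assuming Ktor 2.x or later. The `findPerson` lookup, the `personRoutes` function, and the `/person/{id}` path are hypothetical placeholders, not part of any real service.

```kotlin
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

// Hypothetical lookup used only for illustration.
fun findPerson(id: String): String? = mapOf("1" to "Ada")[id]

fun Application.personRoutes() {
    routing {
        get("/person/{id}") {
            val person = call.parameters["id"]?.let(::findPerson)
            if (person == null) {
                // Respond directly: no exception propagates through the
                // instrumentation, so the span status stays unset.
                call.respond(HttpStatusCode.NotFound, "Person not found")
                return@get
            }
            call.respond(person)
        }
    }
}
```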

Your StatusPages configuration can stay. It still handles unexpected exceptions. You're only changing the routes where 4xx is an expected outcome.

Spring Boot (Kotlin)

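In Spring Boot, the same idea could look roughly like the following, assuming Spring Web MVC. `Person`, `PersonRepository`, and `PersonController` are hypothetical names chosen for illustration.

```kotlin
import org.springframework.http.ResponseEntity
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.PathVariable
import org.springframework.web.bind.annotation.RestController

// Hypothetical types used only for illustration.
data class Person(val id: String, val name: String)

interface PersonRepository {
    fun findById(id: String): Person?
}

@RestController
class PersonController(private val repository: PersonRepository) {

    @GetMapping("/person/{id}")
    fun getPerson(@PathVariable id: String): ResponseEntity<Person> {
        // Build the 404 directly instead of throwing a NotFoundException,
        // so no exception reaches the instrumentation.
        val person = repository.findById(id)
            ?: return ResponseEntity.notFound().build()
        return ResponseEntity.ok(person)
    }
}
```

As with Ktor, any existing @ExceptionHandler can stay in place for unexpected exceptions.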

Verify the fix

After deploying, the route should still return the same HTTP status code, but the STATUS_CODE_ERROR count should drop. Check in Nais APM or query span metrics directly:

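A possible query, assuming span metrics are exposed using the Grafana Tempo metrics-generator naming (`traces_spanmetrics_calls_total` with `service`, `span_name`, `span_kind`, and `status_code` labels). The metric and label names in your environment may differ, and `my-app` and the span name are placeholders.

```promql
# Rate of server spans marked as errors for one route (placeholder names).
sum by (span_name) (
  rate(traces_spanmetrics_calls_total{
    service="my-app",
    span_name="GET /person/{id}",
    span_kind="SPAN_KIND_SERVER",
    status_code="STATUS_CODE_ERROR"
  }[5m])
)
```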

This should return 0 (or close to it) for the fixed routes.

References