Logs that don't lie: wide events and one event per request

TL;DR - One rich event per request (wide event) with business context instead of N lines; tail sampling to keep cost under control. Reference: Logging Sucks.


I’ve lost time grepping logs to find why a checkout failed or why p99 spiked at night. The problem isn’t just volume: logs were designed to be written, not queried. A request passes through several services, a database, a cache, and a queue; what’s left is dozens of lines per request, each holding a slice of the context. When something goes wrong, the unified context is exactly what’s missing.

The shift: “what happened” instead of “what the code did”

The site Logging Sucks (Boris Tane) sums up the mindset change well. Instead of logging “what my code did”, log “what happened with this request”. Instead of many lines per hop, a single wide event per service with everything that might matter for debugging:

  • request_id, trace_id, method, path, status, duration
  • user (id, plan, LTV), cart, payment
  • error (type, code, retriable), feature flags

One event per request as the canonical record of what happened.

flowchart TB
  subgraph antigo[" Old model "]
    R[Request] --> L1[log 1]
    R --> L2[log 2]
    R --> L3[log N...]
  end
  subgraph wide[" Wide event "]
    R2[Request] --> E[1 rich event]
  end

Searching by user_id or request_id returns a single record with enough dimensions to understand the case. It’s no longer text search; it’s querying structured data. High cardinality (user_id, request_id) and high dimensionality (many fields) are what make the log useful. OpenTelemetry helps with transport and tracing but doesn’t decide what to log or inject business context; that’s still an instrumentation decision.
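Assuming events are stored as structured JSON, one object per line, a minimal sketch of what that kind of query looks like. The store, field names, and events here are hypothetical:

```python
import json

# Hypothetical store: one wide event per request, newline-delimited JSON.
events = [
    '{"request_id": "r-1", "user": {"id": "u-42", "plan": "pro"}, "status": 500, '
    '"path": "/checkout", "duration_ms": 812, "error": {"type": "PaymentDeclined"}}',
    '{"request_id": "r-2", "user": {"id": "u-7", "plan": "free"}, "status": 200, '
    '"path": "/cart", "duration_ms": 35}',
]

def find_by_user(lines, user_id):
    """High-cardinality lookup: returns whole events, not text fragments."""
    return [e for e in (json.loads(l) for l in lines)
            if e.get("user", {}).get("id") == user_id]

# One record comes back carrying every dimension needed to understand the case.
hits = find_by_user(events, "u-42")
print(hits[0]["error"]["type"])
```

The same shape of query works whatever the backend is, as long as the event stays one structured record instead of scattered lines.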

In practice: building the event and controlling cost

I build the event along the request: the middleware starts with request_id, method, path, service, etc.; the handler enriches with user, cart, payment, error. At the end I emit once.
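A minimal sketch of that flow in Python; the WideEvent class and the field names are illustrative, not a real framework API:

```python
import json
import time

class WideEvent:
    """Accumulates context across the request; emitted exactly once at the end.
    Field names (user, cart, error) are illustrative, not a fixed schema."""
    def __init__(self, **base):
        self._start = time.monotonic()
        self._fields = dict(base)

    def add(self, **fields):
        # The handler enriches the event as the request progresses.
        self._fields.update(fields)

    def emit(self):
        self._fields["duration_ms"] = round((time.monotonic() - self._start) * 1000)
        print(json.dumps(self._fields))  # stand-in for the real log sink

# Middleware starts the event...
ev = WideEvent(request_id="req-123", method="POST", path="/checkout", service="api")
# ...the handler enriches it...
ev.add(user={"id": "u-42", "plan": "pro"}, cart={"items": 3}, status=402,
       error={"type": "PaymentDeclined", "retriable": False})
# ...and it is emitted once, at the end.
ev.emit()
```

In a real service the instance would live on the request context (e.g. a contextvar) so any layer can enrich it without passing it around explicitly.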

To avoid blowing up cost I use tail sampling: keep 100% of errors and slow requests (above p99) and sample the rest (e.g. 5%). What matters is stored and the bill stays predictable.
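A sketch of that decision, assuming a hand-set p99 threshold and sample rate (in practice the threshold would come from metrics). The function and field names are hypothetical:

```python
import random

P99_MS = 800        # assumed latency threshold; derive from metrics in practice
SAMPLE_RATE = 0.05  # keep 5% of healthy traffic

def should_keep(event):
    """Tail sampling: decide after the request, once outcome and latency are known."""
    if event.get("error") or event.get("status", 200) >= 500:
        return True                       # keep 100% of errors
    if event.get("duration_ms", 0) > P99_MS:
        return True                       # keep 100% of slow requests
    return random.random() < SAMPLE_RATE  # sample the healthy rest
```

The key point is that the decision happens at the tail, after the event is complete; head sampling would have to decide before knowing whether the request failed or was slow.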

Real case: ETL in Python

I applied this on a Python ETL pipeline that scales to 20 instances with multiple threads. Before, each run (e.g. for table CADIVEND) produced several lines:

[2025-11-18T14:32:01.123Z] run_id=etl_8f7a2b3c table=CADIVEND thread_id=worker-3 status=running stage=start
[2025-11-18T14:32:03.123Z] run_id=etl_8f7a2b3c table=CADIVEND thread_id=worker-3 status=running stage=session_open duration_stage_ms=2000
[2025-11-18T14:32:04.123Z] run_id=etl_8f7a2b3c table=CADIVEND thread_id=worker-3 status=running stage=read_parquet parquet_count=25 duration_stage_ms=1000
[2025-11-18T14:32:42.123Z] run_id=etl_8f7a2b3c table=CADIVEND thread_id=worker-3 status=running stage=write_iceberg duration_stage_ms=38000
[2025-11-18T14:32:43.123Z] run_id=etl_8f7a2b3c table=CADIVEND thread_id=worker-3 status=running stage=session_close duration_stage_ms=1000
[2025-11-18T14:32:46.123Z] run_id=etl_8f7a2b3c table=CADIVEND status=success stage=finished duration_ms=45000

If it failed in the middle, you’d see the steps up to that point. With 20 instances and several threads, volume exploded, and filtering success vs. error, or finding the table or job that broke, was a pain.

I switched to one log per run: a single event with table, instance, thread_id, total duration, per-stage durations (session, parquet read, Iceberg write, close), number of parquet files read, status (success/failed) and, on error, the stage that failed plus the message. One line per run. It cut log volume a lot, made monitoring errors and successes trivial (a single filter by status and table does it), and debugging got one record with the full context of that run.
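A sketch of how such a run event can be accumulated, with a context manager per stage. The RunEvent class is hypothetical; stage and field names mirror the example events that follow:

```python
import json
import time
from contextlib import contextmanager

class RunEvent:
    """One event per ETL run: per-stage timings accumulate; one line is emitted.
    print() stands in for the real log sink."""
    def __init__(self, **base):
        self._t0 = time.monotonic()
        self.fields = dict(base, status="running", stages_ms={})

    @contextmanager
    def stage(self, name):
        start = time.monotonic()
        try:
            yield
        except Exception as exc:
            # Record which stage failed and why, then let the caller handle it.
            self.fields.update(status="failed", failed_stage=name,
                               error={"type": type(exc).__name__, "message": str(exc)})
            raise
        finally:
            self.fields["stages_ms"][name] = round((time.monotonic() - start) * 1000)

    def emit(self):
        if self.fields["status"] == "running":
            self.fields["status"] = "success"
        self.fields["duration_ms"] = round((time.monotonic() - self._t0) * 1000)
        print(json.dumps(self.fields))

run = RunEvent(run_id="etl_8f7a2b3c", table="CADIVEND", thread_id="worker-3")
with run.stage("read_parquet"):
    run.fields["parquet_count"] = 25  # illustrative work
run.emit()
```

On failure, the surrounding code catches the exception and still calls emit(), so the single line carries the failed stage, the timings up to it, and the error.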

Example: success event

{
  "timestamp": "2025-11-18T14:32:01.123Z",
  "run_id": "etl_8f7a2b3c",
  "table": "CADIVEND",
  "instance_id": "i-0a1b2c3d",
  "thread_id": "worker-3",
  "status": "success",
  "duration_ms": 45000,
  "stages_ms": {
    "session_open": 2000,
    "read_parquet": 1000,
    "write_iceberg": 38000,
    "session_close": 1000
  },
  "parquet_count": 25
}

Example: failure event (e.g. Iceberg write)

{
  "timestamp": "2025-11-18T14:35:22.456Z",
  "run_id": "etl_9x8w7v6u",
  "table": "CADIVEND",
  "instance_id": "i-0a1b2c3d",
  "thread_id": "worker-7",
  "status": "failed",
  "failed_stage": "write_iceberg",
  "duration_ms": 23000,
  "stages_ms": {
    "session_open": 2000,
    "read_parquet": 1000,
    "write_iceberg": 20000
  },
  "parquet_count": 25,
  "error": {
    "type": "CommitFailedException",
    "message": "Conflict with concurrent commit",
    "retriable": true
  }
}

Takeaways

  • One rich event per request (or per unit of work) instead of N fragmented lines.
  • Business context in the event: user, cart, payment, ETL table, stage. Not just request_id and status.
  • Tail sampling to keep cost under control: keep 100% of errors and slow requests; sample the rest.
  • Structured query: search by user_id, request_id, table, status and get one record with what matters.

A log that doesn’t lie is a log you can actually use.