Debugging Code You've Never Seen
Lesson 4.5
How to read logs and identify patterns in production errors
structured log format, log levels, timestamp reading, correlation IDs, identifying error frequency, log aggregation tools overview, reading JSON logs
Production Logs Are the Ground Truth
In production, you usually can't attach a debugger, so logs are often your only window into what happened. Reading them well is a distinct skill from debugging locally.
Reading Structured Logs
// Typical structured log entry (JSON format)
// requestId correlates all logs for one request
{
  "timestamp": "2024-01-15T14:23:45.123Z",
  "level": "error",
  "message": "Payment processing failed",
  "requestId": "req_9f3a2c",
  "userId": "usr_44821",
  "orderId": "ord_99102",
  "error": "stripe_card_declined",
  "code": "card_declined",
  "service": "payment-service",
  "duration_ms": 1842
}
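To make the fields above concrete, here is a minimal Python sketch of reading one such entry. It assumes one JSON object per line and a UTC ISO 8601 timestamp ending in `Z`. The `LEVEL_RANK` table and the helper names `parse_entry`, `entry_time`, and `at_least` are hypothetical, chosen for illustration; real loggers may name or number their levels differently.

```python
import json
from datetime import datetime

# Hypothetical severity ranking; real loggers may use other names or numeric levels.
LEVEL_RANK = {"debug": 10, "info": 20, "warn": 30, "error": 40, "fatal": 50}


def parse_entry(line):
    """Parse one JSON log line; return None if it is blank or not a JSON object."""
    line = line.strip()
    if not line:
        return None
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    return entry if isinstance(entry, dict) else None


def entry_time(entry):
    """Convert the ISO 8601 timestamp (UTC, trailing 'Z') to an aware datetime."""
    # Replacing 'Z' with an explicit offset keeps this working before Python 3.11.
    return datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))


def at_least(entry, threshold):
    """True if the entry's level is at or above the threshold level."""
    return LEVEL_RANK.get(entry.get("level", ""), 0) >= LEVEL_RANK[threshold]


line = (
    '{"timestamp": "2024-01-15T14:23:45.123Z", "level": "error", '
    '"message": "Payment processing failed", "requestId": "req_9f3a2c", '
    '"duration_ms": 1842}'
)
entry = parse_entry(line)
when = entry_time(entry)
print(when.isoformat(), entry["level"], entry["message"])
```

Parsing the timestamp into a real datetime, rather than comparing strings, makes it straightforward to ask questions such as "what happened in the five minutes before this error?".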
# Find all logs for one request (trace a complete failure)
# Matching on the ID alone works whether or not the logger puts a space after the colon
grep -F 'req_9f3a2c' app.log | jq .
Identifying Patterns
# Count errors by type across the whole file
# (to limit this to the last hour, filter on .timestamp first)
jq 'select(.level == "error") | .error' app.log | \
  sort | uniq -c | sort -rn
# Output:
# 47 "stripe_card_declined"
# 12 "database_timeout"
# 3 "invalid_token"
Correlation IDs (requestId, traceId) are essential: they let you follow one user's request across multiple log lines and multiple services. If a codebase doesn't add correlation IDs, that's a gap worth noting and potentially fixing.
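As a rough illustration of following one request across services, the sketch below groups log lines by their correlation ID and orders each group by time. The sample lines, the `api-gateway` service name, and the `group_by_request` helper are hypothetical; only the field names come from the entry format shown earlier. It also assumes every service writes UTC timestamps in the same ISO 8601 shape, which is what lets them sort correctly as plain strings.

```python
import json
from collections import defaultdict

# Hypothetical sample lines from two services, deliberately out of order.
SAMPLE_LINES = [
    '{"timestamp": "2024-01-15T14:23:45.123Z", "level": "error", '
    '"service": "payment-service", "requestId": "req_9f3a2c", '
    '"message": "Payment processing failed"}',
    '{"timestamp": "2024-01-15T14:23:43.281Z", "level": "info", '
    '"service": "api-gateway", "requestId": "req_9f3a2c", '
    '"message": "POST /checkout received"}',
    '{"timestamp": "2024-01-15T14:23:44.002Z", "level": "info", '
    '"service": "api-gateway", "requestId": "req_0000aa", '
    '"message": "GET /health"}',
    "plain text line that is not JSON",
]


def group_by_request(lines, id_field="requestId"):
    """Return {correlation ID: [entries sorted by timestamp]}."""
    traces = defaultdict(list)
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(entry, dict) and id_field in entry:
            traces[entry[id_field]].append(entry)
    for entries in traces.values():
        # Same-shape UTC ISO 8601 timestamps sort chronologically as strings.
        entries.sort(key=lambda e: e.get("timestamp", ""))
    return dict(traces)


traces = group_by_request(SAMPLE_LINES)
for entry in traces["req_9f3a2c"]:
    print(entry["timestamp"], entry["service"], entry["level"], entry["message"])
```

Reading the timeline top to bottom shows which service handled the request first and where it failed. Lines without the ID field are silently dropped here, which is also a quick way to see how much of a log is untraceable.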
