Chapter 19: Observability and Operations — Identity You Can Actually Run

This is Part 19 of a chapter-by-chapter walkthrough of my book OpenID: Modern Identity for Developers and Architects. In the previous chapter we covered claims and privacy. Chapter 19 closes Part VI with the work that keeps identity systems actually working in production.

19.1 — Authentication Logs

Authentication is the critical path: when it breaks, users can't get in, support queues fill, and revenue stops. You need logs that let you answer the usual on-call questions within seconds: who authenticated, from where, against which IdP, whether it succeeded, and if not, why.

Structured logging is the baseline. Instead of free-form strings, emit JSON with consistent fields: event_type, user_id, client_id, issuer, source_ip, user_agent, and a correlation ID that threads through every stage of the flow. The correlation ID is how you trace one user's failed login across your RP, your IdP, and any downstream services, without playing log-line tag.
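As a minimal sketch of the idea above, the snippet below emits one authentication event as a single JSON line using only the Python standard library. The helper names (`new_correlation_id`, `auth_event`), the `reason` field, and all sample values are illustrative assumptions, not a schema taken from the book.

```python
import json
import uuid
from datetime import datetime, timezone


def new_correlation_id() -> str:
    """Generate an ID once at the start of a login; pass it to every later stage."""
    return uuid.uuid4().hex


def auth_event(event_type: str, correlation_id: str, **fields: str) -> str:
    """Serialize one authentication event as a single JSON log line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "correlation_id": correlation_id,
        **fields,
    }
    # sort_keys keeps field order stable, which makes lines easier to diff and grep.
    return json.dumps(record, sort_keys=True)


# Hypothetical sample values, for illustration only.
cid = new_correlation_id()
line = auth_event(
    "login_failed",
    cid,
    user_id="u-123",
    client_id="web-portal",
    issuer="https://idp.example.com",
    source_ip="203.0.113.7",
    user_agent="Mozilla/5.0",
    reason="invalid_state",
)
print(line)
```

In a real deployment the same correlation ID would typically be forwarded in a request header or log context so that the RP, the IdP, and downstream services all record it.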

Protect these logs like security data, because they are: encrypt at rest, restrict access by role, keep audit trails immutable. They are also sensitive in their own right: don't log full tokens, don't log submitted passwords, and don't log anything that would embarrass you if the log store leaked.
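One way to enforce the "don't log full tokens" rule is to scrub events before they reach the logger. The sketch below is an assumption about how that might look: the key names in `DROP_KEYS` and `FINGERPRINT_KEYS` are hypothetical, and replacing a token with a truncated SHA-256 fingerprint is just one possible design that lets you correlate log lines without storing a replayable value.

```python
import hashlib

# Hypothetical field names; adjust to whatever your events actually contain.
DROP_KEYS = {"password", "client_secret"}
FINGERPRINT_KEYS = {"access_token", "id_token", "refresh_token", "code"}


def redact(event: dict) -> dict:
    """Return a copy of the event that is safe to write to the log store."""
    safe = {}
    for key, value in event.items():
        if key in DROP_KEYS:
            continue  # never record secrets, not even in hashed form
        if key in FINGERPRINT_KEYS:
            digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            # Keep a short fingerprint so two log lines about the same token can be matched.
            safe[key + "_sha256_prefix"] = digest[:12]
        else:
            safe[key] = value
    return safe


print(redact({"user_id": "u-123", "password": "hunter2", "id_token": "aaa.bbb.ccc"}))
```

Running redaction in one central place, rather than at each call site, makes it harder for a new log statement to leak a token by accident.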

19.2 — Tracing Login Flows

Modern login is distributed: browser → your RP → IdP → callback to your RP → backend → userinfo endpoint → session creation. When it's slow, or intermittently fails, or works for everyone except one user, you need distributed tracing to find out where.

Each hop emits a span, tied by trace ID, with timing and metadata. Tools like OpenTelemetry, Jaeger, and cloud-native tracers aggregate spans into a single view. With this, "login is slow" becomes "the userinfo call to the IdP is taking 800ms on p95 — their issue or ours?"
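To make the span model concrete, here is a toy, standard-library-only sketch of spans that share a trace ID and record timing. It is deliberately not the OpenTelemetry API; the `Trace` and `Span` classes, the span names, and the sleep calls standing in for real work are all assumptions for illustration.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    trace_id: str
    name: str
    duration_ms: float = 0.0
    attributes: dict = field(default_factory=dict)


class Trace:
    """Collects the spans of one login flow under a single trace ID."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.spans: List[Span] = []

    @contextmanager
    def span(self, name: str, **attributes):
        record = Span(self.trace_id, name, attributes=dict(attributes))
        start = time.perf_counter()
        try:
            yield record
        finally:
            # Record the duration even if the wrapped step raised an exception.
            record.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(record)

    def slowest(self) -> Span:
        return max(self.spans, key=lambda s: s.duration_ms)


trace = Trace()
with trace.span("authorize_redirect"):
    time.sleep(0.001)  # stand-in for real work
with trace.span("token_exchange", issuer="https://idp.example.com"):
    time.sleep(0.001)
with trace.span("userinfo_call", issuer="https://idp.example.com"):
    time.sleep(0.05)

worst = trace.slowest()
print(worst.name, round(worst.duration_ms, 1), "ms")
```

A production tracer adds what this sketch omits: parent and child span relationships, propagation of the trace ID across process boundaries, and export to a backend that can compute percentiles such as p95.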

Key idea: Observability isn't a dashboard; it's the ability to ask arbitrary questions of production behavior. For identity, that means correlation IDs that cross organizational boundaries, structured logs that aggregate well, and traces that span from browser to IdP to database.

19.3 — Audit Trails

Audit logs are different from operational logs. They're for regulators, auditors, and forensics. They must be immutable: append-only storage, cryptographic signing of entries, retention policies that align with HIPAA / SOX / PCI / GDPR as applicable.
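A rough sketch of how append-only storage and cryptographic signing can fit together: each entry carries an HMAC computed over its payload and the previous entry's MAC, so editing or removing an earlier entry breaks verification of everything after it. The `AuditLog` class, the in-memory list, and the hard-coded key are assumptions for illustration; a real system would persist entries to write-once storage and keep the key in a KMS or HSM.

```python
import hashlib
import hmac
import json


class AuditLog:
    """In-memory append-only log where each entry's MAC covers the previous MAC."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._entries = []

    def _sign(self, prev_mac: str, payload: str) -> str:
        message = (prev_mac + "\n" + payload).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, event: dict) -> dict:
        prev_mac = self._entries[-1]["mac"] if self._entries else ""
        payload = json.dumps(event, sort_keys=True)
        entry = {"payload": payload, "prev_mac": prev_mac, "mac": self._sign(prev_mac, payload)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain from the start; any edited entry makes this fail."""
        prev_mac = ""
        for entry in self._entries:
            expected = self._sign(prev_mac, entry["payload"])
            if entry["prev_mac"] != prev_mac or not hmac.compare_digest(expected, entry["mac"]):
                return False
            prev_mac = entry["mac"]
        return True


log = AuditLog(key=b"demo-key-only")  # hypothetical key, never hard-code one in production
log.append({"actor": "admin-7", "action": "rotate_signing_key", "outcome": "success"})
log.append({"actor": "admin-7", "action": "change_role", "outcome": "success"})
print(log.verify())
```

Note that a chain like this cannot detect removal of the most recent entries on its own; periodically anchoring the latest MAC somewhere outside the log is one common way to cover that gap.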

Events to audit: admin actions (creating users, changing roles, rotating keys), security events (failed logins above a rate threshold, suspicious geolocations, MFA bypass attempts), and compliance-relevant data access. For each event, record who, what, when, before-state, after-state, and outcome. When an auditor asks "did anyone touch this customer's data in the last 90 days?", the answer should take a query, not a week.
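The who/what/when/before/after/outcome record, and the auditor's 90-day question, might be sketched as follows. The `AuditEvent` fields, the `target` attribute, and the `touched_within` helper are hypothetical names chosen for this example, and the in-memory list stands in for whatever store you actually query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEvent:
    actor: str              # who
    action: str             # what
    target: str             # which resource was touched
    occurred_at: datetime   # when (UTC)
    before: Optional[dict]  # before-state, None for reads
    after: Optional[dict]   # after-state, None for reads
    outcome: str            # e.g. "success" or "denied"


def touched_within(events: List[AuditEvent], target: str, days: int, now: datetime) -> List[AuditEvent]:
    """Return every event on `target` in the last `days` days, relative to `now`."""
    cutoff = now - timedelta(days=days)
    return [e for e in events if e.target == target and e.occurred_at >= cutoff]


now = datetime(2026, 3, 25, tzinfo=timezone.utc)
events = [
    AuditEvent("admin-7", "change_role", "customer-42", now - timedelta(days=10),
               {"role": "viewer"}, {"role": "editor"}, "success"),
    AuditEvent("svc-report", "read_profile", "customer-42", now - timedelta(days=200),
               None, None, "success"),
    AuditEvent("admin-3", "read_profile", "customer-99", now - timedelta(days=5),
               None, None, "denied"),
]

for event in touched_within(events, "customer-42", days=90, now=now):
    print(event.actor, event.action, event.outcome)
```

Passing `now` explicitly keeps the query deterministic, which helps when the same report has to be reproduced for an auditor later.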

What Chapter 19 Sets Up

After Chapter 19 you should have a clear picture of the operational instrumentation an identity system needs: structured authentication logs with correlation IDs, distributed traces across the whole login flow, and immutable audit trails for the subset of events that matter to regulators. This is the difference between an identity system you have and an identity system you can actually run.

Next up — Chapter 20: Passwordless Authentication. We open Part VII (The Future of Identity) with the pattern that's already replacing passwords: passkeys, WebAuthn, and FIDO2. The end state isn't "stronger passwords" — it's no passwords at all.

Want the full picture? Grab OpenID: Modern Identity for Developers and Architects here for the full observability stack, sample audit schemas, and the rest of the 22-chapter journey through modern identity.

2026-03-25


Sho Shimoda
