Navid's Blog

Ideas, Experiments, and Lessons Learned

The Time I Spent 3 Hours Debugging a ‘Simple’ API Issue

Posted on March 26, 2026 by Navid

It Started With a Simple Request

Monday morning. Coffee in hand. I open my laptop ready to knock out some tickets. Then I see it — a Slack message from our QA team:

Hey, the user profile endpoint is returning 500 errors for some users.

Easy fix, I thought. Probably some null pointer or missing validation. How wrong I was.

The Investigation Begins

I pulled up the logs. The error message wasn’t helpful at all — just a generic “Internal Server Error.” Classic.

So I fired up Postman and hit the endpoint. Worked fine for my test user. Tried another one. Still fine. Then I tried the specific user ID from the bug report:

GET /api/users/12345

Boom. 500 error. Okay, now we’re getting somewhere.

First Wrong Turn

I started checking the database. User 12345 exists. All fields look normal. I added more logging to the endpoint and redeployed. Still nothing useful.

An hour passed. I was starting to feel stupid.

The Real Problem

Then I remembered — we recently added a new field to the user profile: preferences. It’s a JSON column that stores notification settings, theme choices, that kind of thing.

I checked the data for user 12345. The preferences field was there, but it held malformed JSON. A manual database insert had broken the JSON structure years ago, and it never mattered until our new code tried to parse it.
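To make the failure mode concrete, here is a minimal sketch of how one bad row can produce a 500 for a single user while everyone else gets a 200. This is not the actual handler: the `rows` data, `load_profile`, and `status_for` are hypothetical stand-ins, and the stored values are invented for illustration.

```python
import json

# Hypothetical rows: user id -> raw text stored in the preferences column.
rows = {
    1: '{"theme": "dark", "notifications": {"email": true}}',
    # Missing the final closing brace, so this is not valid JSON.
    12345: '{"theme": "dark", "notifications": {"email": true}',
}


def load_profile(user_id: int) -> dict:
    """Naive handler body: parses preferences with no error handling."""
    return {"id": user_id, "preferences": json.loads(rows[user_id])}


def status_for(user_id: int) -> int:
    """Mimic a framework that turns any uncaught exception into a bare 500."""
    try:
        load_profile(user_id)
        return 200
    except Exception:
        return 500


print(status_for(1), status_for(12345))  # prints: 200 500
```

The code path is identical for both users. Only the stored data differs, which is why testing with a clean user never reproduced the bug.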

Why This Beat Me

  • I assumed it was new code. I wasted time looking at recently changed files when the problem was old corrupted data.
  • I tested with ‘good’ data. My test users all had clean data. I never thought to test with messy, real-world data.
  • The error message lied. It didn’t tell me anything about the JSON parse failure.

The Fix (Finally)

I added a try-catch around the JSON parsing with a more descriptive error message. Then I fixed the corrupted data in the database. Done in 10 minutes.
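A rough sketch of what that kind of guard can look like in Python, assuming the preferences arrive as a raw string. The names `PreferencesParseError` and `parse_preferences` are made up for this example, and the real endpoint may be written in a different language entirely.

```python
import json
import logging

logger = logging.getLogger("profile")


class PreferencesParseError(ValueError):
    """Raised when a user's stored preferences are not valid JSON."""


def parse_preferences(user_id: int, raw: str) -> dict:
    """Parse the raw preferences text, failing with a message that names the field."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Say which field and which user broke, plus where the parser gave up.
        message = (
            f"JSON parse failed in preferences field for user {user_id}: "
            f"{exc.msg} at line {exc.lineno}, column {exc.colno}"
        )
        logger.error(message)
        raise PreferencesParseError(message) from exc
```

Whether the endpoint should still return an error or fall back to default preferences is a separate design decision. Either way, the log line now points straight at the bad field.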

The real fix, though, was adding data validation at the database level so this can’t happen again.
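The post doesn't say which database is involved, so the following is only a sketch of the idea using SQLite (through Python's built-in `sqlite3` module) as a stand-in. It assumes a SQLite build that includes the JSON functions, and the table layout and `insert_user` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        -- Reject any non-NULL value that is not well-formed JSON.
        preferences TEXT CHECK (preferences IS NULL OR json_valid(preferences))
    )
    """
)


def insert_user(user_id: int, preferences: str) -> bool:
    """Return True if the row was accepted, False if the CHECK constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (id, preferences) VALUES (?, ?)",
                (user_id, preferences),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_user(1, '{"theme": "dark"}'))     # prints: True
print(insert_user(12345, '{"theme": "dark"'))  # prints: False
```

With a constraint like this, a manual insert of broken JSON fails at write time instead of surfacing years later at read time. Databases with a native JSON type, such as PostgreSQL's `jsonb`, reject malformed input without an explicit constraint.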

What I Learned

  1. Always check the data, not just the code. Sometimes the problem isn’t in what you wrote — it’s in what’s already in the database.
  2. Test with real data. Clean test data won’t catch the bugs that messy production data causes.
  3. Improve error messages as you go. That generic 500 cost me an hour. If the error had said “JSON parse failed in preferences field,” I’d have been done in 10 minutes.
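Lessons 1 and 2 can be turned into a routine check: scan the stored values and list every row that would fail to parse, before a user finds it for you. A minimal sketch, assuming the rows have already been fetched as `(user_id, raw_preferences)` pairs. The function name and the sample data are hypothetical.

```python
import json
from typing import Iterable, List, Tuple


def find_malformed_preferences(
    rows: Iterable[Tuple[int, str]],
) -> List[Tuple[int, str]]:
    """Return (user_id, parser_message) for every row whose preferences are not valid JSON."""
    bad = []
    for user_id, raw in rows:
        try:
            json.loads(raw)
        except (json.JSONDecodeError, TypeError) as exc:
            # TypeError covers values that are not text at all, such as None.
            bad.append((user_id, str(exc)))
    return bad


sample_rows = [
    (1, '{"theme": "dark"}'),
    (12345, '{"theme": "dark"'),  # truncated JSON
    (7, '{"notifications": {"email": false}}'),
]

for user_id, problem in find_malformed_preferences(sample_rows):
    print(f"user {user_id}: {problem}")
```

Running something like this against a copy of production data gives you the list of affected users up front, instead of one bug report at a time.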

Now whenever I see a mysterious 500 error, the first thing I check is the data. Not the code. The data.
