Inheriting a Legacy Security Nightmare — And Actually Fixing It

A field note on what it looks like to walk into a five-year-old codebase with no original authors, a fresh security audit, and a deadline.


The Setup

A few months ago, a leader I respect reached out. They had recently inherited ownership of an enterprise web platform — a system that had been quietly doing its job for nearly five years. The original architects, the engineers who built it, and most of the product stakeholders had long since moved on. What remained was a working product, a handful of documentation fragments, and no living memory of why certain decisions were made.

Then the security reports started landing.

Not one or two findings. Dozens. SQL injection. Authentication bypasses. Token issuance endpoints that had no business existing. Some of them had been open for over a year.

They asked if I could help. I said yes. This is what I learned.


The Codebase

The system is split across multiple repos: several Angular front-end apps, a Node/Express server layer for each, and a shared microservices backend. Each layer was built at a different time by a different team. Some modules are polished. Some have the unmistakable fingerprints of “I needed to ship this by Friday.”

The repos had diverged. Region-specific variants had been forked from the main app and then evolved independently. One module had a routing file with a comment block from 2021 that referenced a wiki page that no longer existed.

This is normal. Five years is a long time in software.


Finding 1: The Authentication Surface

The first class of issues was in the front-end layer — two apps that together form the entry point for the platform.

Both apps had a POST handler that would accept a user identifier from the request body and mint a signed JWT in response. No session validation. No verification that the caller was actually logged in. Just: “you sent a user ID, here is a token.”

This is the kind of code that makes complete sense in the context it was written. Someone needed to bridge an external authentication system into a Node app. They had a form POST coming in, they trusted it was legitimate, and they wrote code that acted on that trust. At the time, behind a corporate VPN, it probably felt fine.

Five years later, with a formal third-party security audit in play, it was a critical finding.

The fix wasn’t obvious. The naive answer — “just validate the session server-side” — required the session cookie to be present on the request. But cookies are scoped by path, and the app wasn’t running under the right path for the session cookie to be sent. The fix required a coordinated change: relocate the first app to a new path that fell inside the cookie’s scope, redirect legacy traffic at the routing layer, and rebuild the authentication primitive from scratch using a database session lookup.
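A minimal sketch of that session-backed primitive, under stated assumptions: the cookie name `sid`, the session record shape, and the `findSession` lookup are all hypothetical, and an in-memory Map stands in for the database. What it illustrates is that the caller's identity comes from the server-side session record and never from the request body.

```javascript
// Stand-in for the database session table (hypothetical shape and data).
const sessions = new Map([
  ['abc123', { userId: 'u-42', expiresAt: Date.parse('2999-01-01T00:00:00Z') }],
  ['old456', { userId: 'u-7', expiresAt: Date.parse('2000-01-01T00:00:00Z') }],
]);

// Hypothetical async lookup; a real one would query the session store.
async function findSession(sessionId) {
  return sessions.get(sessionId) ?? null;
}

// Extract one cookie value from a raw Cookie header.
function readCookie(cookieHeader, name) {
  for (const pair of (cookieHeader ?? '').split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    if (pair.slice(0, index).trim() === name) {
      return pair.slice(index + 1).trim();
    }
  }
  return null;
}

// Resolve the caller's identity from the session cookie only.
// Returns null when the cookie is missing, unknown, or expired.
async function resolveUserId(cookieHeader, lookup, now = Date.now()) {
  const sessionId = readCookie(cookieHeader, 'sid');
  if (!sessionId) return null;
  const session = await lookup(sessionId);
  if (!session || session.expiresAt <= now) return null;
  return session.userId;
}
```

A token-minting handler built on this would call `resolveUserId(req.headers.cookie, findSession)` and refuse to issue anything when the result is `null`.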

Then, for the second app: since it lives at a different path and can never receive the session cookie, it needed a different approach entirely. The answer was to stop minting tokens there completely. Remove the endpoint. Enforce a gate that only accepts tokens already issued by the first app. One authority, one trust anchor.
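The "one authority" idea can be sketched as follows. This is an illustration, not the system's actual code: it assumes RS256 with an asymmetric key pair (the write-up does not say which algorithm was used), a hypothetical issuer name `home-app`, and a token carried in an `Authorization: Bearer` header. Only Node's built-in `crypto` module is used.

```javascript
const crypto = require('node:crypto');

// Hypothetical issuer name for the first app; not from the original system.
const TRUSTED_ISSUER = 'home-app';

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

// Decode one JWT segment; returns null for anything that is not a JSON object.
function decode(segment) {
  try {
    const value = JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
    return value !== null && typeof value === 'object' ? value : null;
  } catch {
    return null;
  }
}

// Only the first app holds the private key, so only it can mint.
function mintToken(claims, privateKey) {
  const signingInput = `${encode({ alg: 'RS256', typ: 'JWT' })}.${encode(claims)}`;
  const signature = crypto.sign('sha256', Buffer.from(signingInput), privateKey);
  return `${signingInput}.${signature.toString('base64url')}`;
}

// The second app holds only the public key: it can verify but not issue.
function verifyToken(token, publicKey, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = typeof token === 'string' ? token.split('.') : [];
  if (parts.length !== 3) return { ok: false, reason: 'malformed' };
  const [headerPart, payloadPart, signaturePart] = parts;
  const header = decode(headerPart);
  const claims = decode(payloadPart);
  if (!header || !claims) return { ok: false, reason: 'malformed' };
  // Pin the algorithm instead of trusting whatever the header claims.
  if (header.alg !== 'RS256') return { ok: false, reason: 'unexpected-alg' };
  const signatureOk = crypto.verify(
    'sha256',
    Buffer.from(`${headerPart}.${payloadPart}`),
    publicKey,
    Buffer.from(signaturePart, 'base64url'),
  );
  if (!signatureOk) return { ok: false, reason: 'bad-signature' };
  if (claims.iss !== TRUSTED_ISSUER) return { ok: false, reason: 'wrong-issuer' };
  if (typeof claims.exp !== 'number' || claims.exp <= nowSeconds) {
    return { ok: false, reason: 'expired' };
  }
  return { ok: true, claims };
}

// Express-style middleware: reject anything the verifier does not accept.
function jwtGate(publicKey) {
  return (req, res, next) => {
    const [scheme, token] = (req.headers.authorization ?? '').split(' ');
    const result = scheme === 'Bearer' ? verifyToken(token, publicKey) : { ok: false };
    if (!result.ok) return res.status(401).json({ error: 'unauthorized' });
    req.user = result.claims;
    return next();
  };
}
```

With an asymmetric pair, deleting the minting endpoint from the second app is backed by key material as well as by code: that app never holds anything it could sign with.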

Shipping both took about two weeks across three PRs. The hardest part wasn’t the code — it was understanding the trust model well enough to know what the code should be.


Finding 2: SQL Injection at Scale

While the auth work was wrapping up, the second wave of security reports arrived.

Dozens of SQL injection findings across the backend microservices.

The root cause was the same in every single case. A query template with a named placeholder. A service method that substituted user input directly into the query string using string replacement. Then that string, with user data baked in, passed directly to the database driver’s non-parameterized execution path.

// The vulnerable pattern - repeated across the codebase
const query = queries.FETCH_DATA.replace('@param', userInput);
await db.execute(query);

The fix was equally uniform:

// The fixed pattern
await db.executeWithParams(queries.FETCH_DATA, { param: userInput });

The DAO layer already had a parameterized execution method. It was sitting right there, unused for this pattern, presumably because the original author didn’t know it existed or the pattern hadn’t been established yet when these service files were written.
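To make the difference concrete, here is a small sketch of what binding a named placeholder means. The `bindNamed` helper is hypothetical and is not the DAO's real implementation (the write-up does not show it); it assumes a driver that accepts positional `?` markers plus a separate array of values.

```javascript
// Hypothetical helper: rewrites '@name' placeholders to positional '?' markers
// and collects the values separately, so user input never becomes SQL text.
// Limitation: a regex this simple would also rewrite an '@word' that appears
// inside a string literal in the template.
function bindNamed(template, params) {
  const values = [];
  const text = template.replace(/@([A-Za-z_]\w*)/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error(`Missing value for parameter @${name}`);
    }
    values.push(params[name]);
    return '?';
  });
  return { text, values };
}

const hostileInput = "x' OR '1'='1";

// String substitution: the input rewrites the query itself.
const spliced = "SELECT id FROM reports WHERE owner = '@owner'".replace('@owner', hostileInput);

// Binding: the query text stays fixed and the input travels as data.
const bound = bindNamed('SELECT id FROM reports WHERE owner = @owner', { owner: hostileInput });
```

In this sketch `spliced` ends up containing an extra `OR` clause, while `bound.text` is the same for every caller and the hostile string sits in `bound.values`.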

File after file. Same fix. Different parameter names.

This is what happens when a pattern becomes a convention without being reviewed. The first person writes it one way. The next person copies it. Five years later you have dozens of instances of the same mistake, all surfaced in a single audit sweep.


What I Actually Learned

1. Legacy codebases aren’t broken — they’re contextless.

Every strange decision I found had a reason. It just wasn’t written down, or the person who knew it had left. The code isn’t wrong. It’s just outlived the threat model it was written against.

2. Understanding the trust model is the whole job.

The hardest part of the auth fix wasn’t writing the validation logic. It was sitting down and drawing out: who mints tokens? who verifies them? what cookies are available where and why? Once that was clear, the code was almost mechanical. Before that clarity, every PR felt risky.

3. Consistent bad patterns are actually a gift.

Dozens of SQL injection findings sounds catastrophic. But because the root cause was identical in every case, fixing them was systematic. One engineer, a clear pattern, and enough time. Inconsistent vulnerabilities — each with a different root cause — would have been far harder.
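Because the pattern is uniform, it is also easy to search for. The sketch below is one way to list candidate lines, assuming the vulnerable calls all look like the `.replace('@param', ...)` shape shown earlier; the function name and the regular expression are illustrative, and reading files from disk is left out.

```javascript
// Matches a .replace( call whose first argument is a quoted '@name' placeholder.
const SUSPECT = /\.replace\(\s*['"`]@\w+['"`]/;

// Return the 1-based line numbers (and trimmed text) of suspect lines.
function findSuspectLines(source) {
  return source
    .split(/\r?\n/)
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter(({ text }) => SUSPECT.test(text));
}
```

A list like this is only a starting point for review: it will miss substitutions built another way, such as template literals or concatenation.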

4. Documentation debt compounds like financial debt.

The hours spent reverse-engineering why the path routing worked the way it did, why certain cookies were set, why a particular environment flag existed — all of that was time not spent fixing things. Every future engineer who touches this system will either pay that cost again or benefit from the ADRs and notes I’m leaving behind.

5. Coming in as an outsider is a superpower.

I had no attachment to the original decisions. I could ask “why does this exist?” without anyone feeling criticized. The leader who brought me in had the same openness. That made hard conversations — “this needs to be deleted, not patched” — much easier than they might have been.


The Tool That Was Actually in the Room

I want to be honest about something that doesn’t get talked about enough in engineering write-ups: I didn’t do this alone, and the co-author wasn’t human.

I used an AI coding assistant — specifically Cursor with Claude — as a hands-on collaborator throughout this entire remediation. Not as an autocomplete tool. Not to generate boilerplate. As an actual working partner that wrote code, reasoned about trust models, drafted architecture decision records, and produced PR-ready diffs.

Here’s what the actual working loop looked like, as honestly as I can describe it:

  1. I’d share the problem — a security finding, a broken behaviour, a constraint I’d discovered in the codebase.
  2. The AI would propose a solution — sometimes code, sometimes a question back at me, sometimes a structured document laying out options.
  3. I’d push back. “This doesn’t work because the cookie isn’t scoped here.” “This will cause an infinite redirect.” “The infrastructure config is wrong — look at where the volume actually mounts.”
  4. The AI would revise. Not defensively. Just: absorb the feedback, update the model, try again.
  5. Repeat until I was confident enough to push.

Looking at the git log across both repos, 13 commits landed over the course of about a week. The path relocation, the session validation gate, the impersonation detection, the JWT entry gate on the second app, the infrastructure fix, the redirect loop fix — all of it went through this loop. The AI wrote most of the initial code. I caught the gaps. Together we got to something I was willing to put in production.

What surprised me wasn’t that the AI could write the code. That part I expected. What surprised me was how useful it was to have something that could hold the full context of the problem — multiple ADRs, a security finding write-up, a Node server file, a routing config — and reason across all of it at once. The kind of reasoning that would normally require a senior engineer who’d been on the codebase for months.

That said: the AI was wrong, regularly. It made assumptions about the environment. It occasionally proposed fixes that would have introduced new problems. It didn’t know things I knew from reading an infrastructure config at 11pm. The loop only worked because I knew enough to catch those gaps.

Reproducing production locally: the nginx proxy trick. Before any browser-driven testing could work, I had to solve a subtler problem: cookie-scoped authentication cannot be tested faithfully with each app on its own localhost port. Browsers do not isolate cookies by port, but they do scope them by path and they treat every port as a separate origin. With the apps spread across ports, none of them sits under the path it occupies in production, so a cookie scoped to a specific path is not sent where production would send it, and the CORS behaviour differs as well. To faithfully reproduce the production cookie flow, CORS behaviour, and CSRF constraints on my laptop, I stood up a local nginx reverse proxy on a custom hostname with path-based routing. It unified the home app, the reports app, and the microservice backend under a single origin, mirroring the production layout. Only then did the cookie handoff between apps behave the way it would in the real environment.
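The path rule behind all of this is small enough to write down. The function below is a sketch of the path-match check from RFC 6265 (section 5.1.4); the `/home` and `/reports` paths in the usage lines are hypothetical stand-ins, since the real paths are not given here.

```javascript
// RFC 6265 path-match: does a cookie with this Path apply to this request path?
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  // A prefix match only counts on a '/' boundary.
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

// Hypothetical layout: the session cookie is set with Path=/home.
const sentToHomeApp = pathMatches('/home/dashboard', '/home');
const sentToReportsApp = pathMatches('/reports/summary', '/home');
const sentToLookalike = pathMatches('/homepage', '/home');
```

Under these assumptions only the first request carries the cookie, which is why the local setup has to reproduce the production paths and not just the apps.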

Testing without writing tests. One thing that genuinely surprised me: I never wrote a single line of Playwright test code. Cursor has an MCP integration that connects the AI agent directly to a live Chrome instance via the Chrome DevTools Protocol. Combined with an agent browser tool, I could describe what I wanted to verify (“log in, navigate to the protected page, confirm the JWT gate blocks unauthenticated requests”) and the agent would drive the browser, execute the flow, take screenshots, and report back. That gave me end-to-end verification of the security-critical flows with zero test-authoring overhead on my end.

Attacking my own fixes. On the offensive side, I used Burp Suite Community Edition to validate the fixes before shipping. Replaying captured requests with tampered payloads, testing the SQL injection patterns against the parameterized endpoints, probing the JWT gate with malformed and expired tokens. If Burp could break it, the fix wasn’t done. If it couldn’t, I had reasonable confidence the patch held. Having both the AI writing the defence and a real attack tool probing it created a tighter feedback loop than code review alone would have.
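The token-probing part of that routine can be approximated in a few lines. This sketch is not Burp's API or output; it is a hypothetical helper that takes one captured token and derives the kinds of tampered variants described above, ready to be replayed by whatever tool is at hand.

```javascript
const toSegment = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const fromSegment = (segment) => JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));

// Derive tampered variants of a captured three-part token.
// A correct gate should reject every one of them.
function tamperVariants(token) {
  const [header, payload, signature] = token.split('.');
  const claims = fromSegment(payload);
  return [
    { name: 'truncated', token: `${header}.${payload}` },
    { name: 'garbage', token: 'not.a.token' },
    { name: 'stripped-signature', token: `${header}.${payload}.` },
    { name: 'alg-none', token: `${toSegment({ alg: 'none', typ: 'JWT' })}.${payload}.` },
    {
      name: 'swapped-subject',
      token: `${header}.${toSegment({ ...claims, sub: 'someone-else' })}.${signature}`,
    },
  ];
}
```

Expired tokens need no generator: replaying a captured token after its `exp` has passed covers that case.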

Which is, I think, the actual lesson: AI assistance in this kind of work isn’t about replacing engineering judgment. It’s about removing the friction between having a judgment and turning it into working, tested, and validated code. That friction — the “I know what needs to happen but now I need to write 80 lines of middleware, a test suite, and then manually click through the app to verify it” friction — is where a lot of security work quietly stalls.

It didn’t stall here.


Where Things Stand

The auth fixes are in production. The SQL injection remediation is in progress. The codebase has more documentation today than it did three months ago. The security findings are being closed.

The original team who built this shipped something that ran reliably for five years and served a large internal user base. That’s genuinely hard to do. What I’m doing now isn’t a criticism of them — it’s just the next chapter.

Legacy systems don’t need heroes. They need patience, curiosity, and someone willing to read a five-year-old routing config until it makes sense.


Written by a developer who spent way too long reading cookie scope documentation and came out the other side with opinions.
