The Conversation That Kills Programmes

Christopher Clarkson
May 19

CAXA Technologies Security Operations Series: Vulnerability Management

At a recent client, the vulnerability management programme had everything it was supposed to have: a well-integrated scanner, EPSS enrichment, and tooling that aggregated findings with SLA classifications attached. The MTTR numbers were still poor. Not because the findings were wrong. Because no engineering team owned the SLAs. The security team was chasing fixes across squad boundaries with no escalation path. The remediation model was that security asked nicely.

The fix was not a tooling change. It was a set of conversations with engineering leadership.

This is the failure mode that does not show up in a tooling review.

The ownership gap nobody names

Episode 4's operating model established people, process, and technology as the three components a VM programme requires to deliver the Five Pillars. What it did not fully address is who adopts that model on the other side: who in engineering has been explicitly tasked with the response.

Most programmes have a clear owner for detection. The security team runs the scanners, tunes the prioritisation model, publishes findings. What is rarely made explicit is what happens next. The finding is published. Who owns the SLA? Who decides whether it goes in this sprint or the next? Who escalates when it ages past threshold?

In the absence of an explicit answer, both teams assume the other is acting. Security assumes engineering will respond to findings the way engineering responds to incidents, treating them as delivery obligations. Engineering assumes security will follow up and chase, the way a project manager follows up on a request. Neither assumption is correct. The finding sits in a queue. The backlog grows.

This is not a tooling problem. It is a structural one, and it has a name in the Govern function of NIST CSF 2.0, GV.RR-02: "Roles, responsibilities, and authorities related to cybersecurity risk management are established, communicated, understood, and enforced."

Most programmes satisfy the first two verbs of that requirement: established and communicated. Understood and enforced is where the gap lives.

Whose capability, whose response

The ownership split maps cleanly onto the vulnerability lifecycle if you make it explicit. Security owns the capability layer. Engineering owns the response layer. The boundary between them is triage. This applies to organisations with distinct security and engineering functions; in smaller or embedded teams where those roles overlap, the boundary looks different, but the ownership question is the same.

Detection capability. Security defines the scanning coverage, operates the tooling, applies the prioritisation model (EPSS enrichment, KEV classification, SLA tiering) and publishes findings to the remediation owner. This layer is unambiguously security's.

Triage. Security defines the model: a remediation window per severity tier, set by the organisation's risk appetite, regulatory obligations, and operational capacity. The specific values vary; the structure does not. Engineering applies that model to their systems and their sprint cadence. The model is not renegotiated per finding. It is agreed once, at programme level, between security leadership and engineering leadership. If an engineering team wants to challenge a classification, the channel for that is the exception process, not a finding-by-finding conversation.
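As a rough sketch of what "agreed once, at programme level" can look like in practice, the tier-to-window mapping can live in one place and be applied mechanically to every finding. The tier names and day counts below are illustrative assumptions, not values from this series; each organisation sets its own.

```python
from datetime import date, timedelta

# Hypothetical remediation windows in days per severity tier.
# Illustrative values only: real windows come from risk appetite,
# regulatory obligations, and operational capacity.
SLA_WINDOWS_DAYS = {
    "critical": 7,
    "high": 30,
    "medium": 90,
    "low": 180,
}


def sla_due_date(severity: str, published: date) -> date:
    """Return the date by which a finding of this tier must be closed."""
    return published + timedelta(days=SLA_WINDOWS_DAYS[severity.lower()])


def is_breached(severity: str, published: date, today: date) -> bool:
    """True once the finding has aged past its agreed window."""
    return today > sla_due_date(severity, published)
```

Because the mapping is a single shared table, a disagreement about one finding cannot quietly change the window; it has to go through the exception process instead.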

Remediation. Engineering owns this completely: the fix, the deployment, the SLA, and the escalation when the SLA is at risk of breach. Security validates closure via rescan, but the action is engineering's. For vendor-dependent vulnerabilities, where the fix requires a third-party patch, engineering owns the risk acceptance and compensating control decisions while the patch is unavailable, and owns the deployment once it is available. The SLA must sit with engineering, not security. If security owns the SLA, security's job becomes chasing engineering for compliance. That produces reports. It does not produce closed findings.

This is the specific structural change most programmes need. Not a new scanner. Not better enrichment. A different answer to: whose job is it to fix this?

What the ownership conversation produces

The output of this conversation is a RACI for the vulnerability lifecycle. Most organisations have none of it documented. The absence is not accidental; it is easier to leave ownership implicit than to have a conversation in which engineering leadership formally accepts accountability for remediation SLA compliance.

A few rows carry most of the weight.

SLA for closure sits with engineering. Non-negotiable. Security reports on it; engineering is accountable for it.

Exception approval sits with engineering leadership (a VP, EM, or named delegate), with security defining the criteria and the maximum exception lifetime. An exception process with no expiry date is a risk acceptance process that produces permanent deferrals. A backlog of exceptions with no expiry dates and no named reviewer is almost always a symptom of undefined ownership, not genuine risk acceptance. Where security and engineering leadership cannot agree on a risk acceptance decision, the escalation path runs to the CISO or a named risk committee, not back to a finding-by-finding negotiation.
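The exception rules above lend themselves to a simple automated check. The following is a minimal sketch under stated assumptions: the record fields, the 90-day ceiling, and the function name are all hypothetical, chosen only to show how an exception with no expiry or no named approver can be flagged rather than left to age.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical ceiling on exception lifetime, defined by security.
MAX_EXCEPTION_DAYS = 90


@dataclass
class RiskException:
    finding_id: str
    approved_by: Optional[str]  # named engineering leader or delegate
    granted: date
    expires: Optional[date]     # None models the "no expiry" anti-pattern


def exception_problems(exc: RiskException, today: date) -> list[str]:
    """List the reasons an exception should be sent back for review."""
    problems = []
    if not exc.approved_by:
        problems.append("no named approver")
    if exc.expires is None:
        problems.append("no expiry date")
    elif exc.expires > exc.granted + timedelta(days=MAX_EXCEPTION_DAYS):
        problems.append("exceeds maximum lifetime")
    elif exc.expires < today:
        problems.append("expired")
    return problems
```

An exception that returns an empty list is still a risk acceptance decision; the check only confirms that someone named has made it and that it will be revisited.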

SLA breach escalation sits with engineering. Security escalates to the named owner; the named owner decides whether to remediate, defer, or formally accept risk. In most organisations, security does not have the standing to force a sprint allocation. Engineering leadership does.

Validation sits with security. Once engineering closes a finding, security rescans to confirm. The closure is not recorded until validation completes.
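One way to make "not recorded until validation completes" concrete is to model the finding's status so that engineering cannot move it directly to closed. This is an illustrative sketch only; the state names and function names are assumptions, not part of any particular tracker.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    FIX_DEPLOYED = "fix_deployed"  # engineering reports the fix is live
    CLOSED = "closed"              # recorded only after a clean rescan


def after_engineering_fix(status: Status) -> Status:
    """Engineering marks the fix as deployed; this is not yet closure."""
    if status is not Status.OPEN:
        raise ValueError(f"cannot deploy a fix from state {status.value}")
    return Status.FIX_DEPLOYED


def after_security_rescan(status: Status, still_detected: bool) -> Status:
    """Security's rescan decides whether the finding closes or reopens."""
    if status is not Status.FIX_DEPLOYED:
        raise ValueError(f"nothing to validate in state {status.value}")
    return Status.OPEN if still_detected else Status.CLOSED
```

The only route to CLOSED runs through the rescan, which keeps the action with engineering and the confirmation with security.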

At a previous client operating a large technology estate across multiple cloud providers, the ownership problem was multiplied across dozens of engineering teams. The CNAPP tooling was producing findings at a volume the security team could not triage alone, and squads receiving raw scanner output had no framework for deciding what to act on. The resolution was a tiered model: security owned the triage classifications; each team owned the findings on its own assets; platform engineering owned the infrastructure layer; application squads owned code and dependency findings. The most significant driver of MTTR improvement was knowing whose queue the finding was in, not faster scanning.
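The tiered model described above amounts to a routing rule. The sketch below is hypothetical: the layer labels, queue names, and the shape of the finding record are placeholders, not the client's actual configuration. It shows the one property that mattered, which is that every finding resolves to exactly one queue or is explicitly surfaced as unowned.

```python
# Hypothetical mapping from finding layer to owning queue, following the
# tiered split described above. Layer and queue names are placeholders.
LAYER_OWNER = {
    "infrastructure": "platform-engineering",
    "code": "application-squad",
    "dependency": "application-squad",
}


def route_finding(finding: dict) -> str:
    """Return the queue that owns this finding.

    Application-layer findings go to the squad that owns the asset;
    anything with an unknown layer is surfaced rather than silently parked.
    """
    owner = LAYER_OWNER.get(finding["layer"])
    if owner is None:
        return "unowned"
    if owner == "application-squad":
        return f"squad:{finding['asset_team']}"
    return owner
```

Counting how many findings land in "unowned" is a cheap first measure of how much of the estate still has no named remediation owner.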

The socialisation layer

Once the RACI is agreed, it has to be lived. That is an engineering leadership function, not a security function.

The security team can produce the model. It cannot embed it. The signal that security SLAs are delivery obligations rather than requests from another team has to come from VPs and EMs: through how they prioritise the backlog, through what they measure their teams on. When engineering leadership treats remediation SLA compliance as an engineering metric, it embeds. When they don’t, it doesn’t, regardless of the tooling.

Security champions programmes have a role here, but a specific one. Champions are effective when a security culture already exists: when leadership has made the commitment explicit, when the delivery capacity for remediation is real, when the RACI is documented. In that context, a champion per squad sustains and extends the culture at team level: they own the triage conversation in their squad and have the standing to request sprint time for findings above threshold. They amplify an existing culture. They do not create one. If the leadership conversation has not happened, a champions programme will not compensate for its absence.

Why programmes plateau at Level 1

Episode 11's maturity model defined observable security behaviours for each of the Five Pillars across four levels. The weakest-pillar principle established that effective programme maturity is set by the lowest-scoring Pillar.

The most common sticking point after Pillar 1 is Pillar 4: Remediation and Mitigation. A programme operating at L1 on Pillar 4 handles findings reactively: when they become urgent rather than according to a defined SLA, and with no named owner. The L1-to-L3 progression on Pillar 4 is not a tooling progression. It is the result of the ownership conversation being had, documented, and enforced.
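Expressed as arithmetic, the weakest-pillar principle is a minimum over the pillar scores. The sketch below assumes levels are integers from 1 to 4 and uses placeholder pillar keys; the function name is hypothetical.

```python
def effective_maturity(pillar_levels: dict[str, int]) -> int:
    """Effective programme maturity is the lowest level across all pillars."""
    if not pillar_levels:
        raise ValueError("at least one pillar score is required")
    return min(pillar_levels.values())


# Illustrative scores only: strong detection, reactive remediation.
example = {
    "pillar_1": 3,
    "pillar_2": 3,
    "pillar_3": 2,
    "pillar_4": 1,  # Remediation and Mitigation
    "pillar_5": 2,
}
```

With these example scores, `effective_maturity(example)` returns 1: investment in the other pillars does not lift the programme while Pillar 4 stays at L1.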

Most programmes that plateau at L1 are not short on tooling investment. They have scanners, enrichment, a prioritisation model. They have not had the conversation that determines whether any of that investment produces closed findings.

The diagnostic

There is a quick way to know whether an organisation has had this conversation: ask who owns the SLA for remediating a Critical finding.

If the answer is "the security team," or if it takes more than a few seconds to name anyone at all, the conversation has not happened. The security team can own the model. It cannot own the SLA for fixing infrastructure and code it does not control.

If the answer is immediate, specific, and comes with a named escalation path for when the SLA is at risk of breach, the conversation has happened. The Programme Schematic, which follows in this series, formalises this as a template an organisation can complete for its own environment: ownership assignments, exception criteria, escalation paths, and SLA structures adapted to its operating model.

The most expensive thing in a VM programme is not the tooling. It is the backlog of findings that nobody owns.