▸ Internal · MCG / Kelly
AI Governance — Roadmap
Current: v1.0
AI Governance Program

Roadmap

How the program evolves — every version, what changed and why, who contributed, and what we address next. Plain English first, a little tech underneath.

Current release

v1.0 — The leadership pitch

The version presented to the Kelly/MCG AI group: one declarative package covering the governance-led motion, the proof that it's built, the meter (cost & efficiency), the ecosystem due diligence, and the live pipeline shown honestly as reference leads.

  • The pitch document — ten sections, opportunity → ask, reviewed by a five-perspective expert panel (governance standards, consulting GTM, Fortune-500 buyer, AI-cost economics, presentation strategy) before release. Read it ↗
  • Standards posture hardened — we never self-certify; formal certification routes through accredited third-party bodies, and our work produces the audit-ready evidence.
  • Sources & evidence registry — every figure across the program's surfaces now links to a verified source with fetched URLs and dates. The registry ↗
  • Ecosystem due diligence — full product-and-category read on Nexthink AI Drive: complement, not competitor; potential partnership on the table.
  • Pipeline as reference leads — two live conversations (a property-management SaaS, a major health plan): in progress, proposals forthcoming, nothing closed.
  • Version record made visible — the prior pitch draft archived, the program history rendered, both linked from v1.0.
Previous release

v0.9 — The SecOps merge

★ Credit

This release is built largely on Sean's work — Kelly SecOps.

Sean ran his own open-source toolset research and brought back a strong, focused stack. v0.9 merges it into ours. His list filled the exact gaps where we were thinnest: automatic policy enforcement, compliance evidence for certifications, and a full AI audit trail. A large piece of v0.9 is his thinking and delivery.

Six additions from Sean's research — each here for a reason:

Open Policy Agent (OPA)

Policy-as-code
Plain English

An automatic rule-checker. It decides, in real time, who and what is allowed to use a given AI model, dataset, or system — and blocks the rest.

Under the hood: CNCF-graduated, Apache-2.0; authorization + compliance policies written as code and evaluated at request time.

Why we merged it: v0.8 could detect problems but couldn't enforce rules. OPA is the enforcement layer we were missing.
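
To make "policies written as code and evaluated at request time" concrete, here is a minimal sketch in plain Python. It is not OPA itself: real OPA policies are written in Rego and evaluated by the OPA engine. The roles, model names, dataset names, and the `is_allowed` function are all hypothetical, chosen only to illustrate a default-deny decision.

```python
from dataclasses import dataclass

# Hypothetical policy data: which roles may call which models.
# In OPA this would live in a Rego policy plus data documents.
ALLOWED_MODELS_BY_ROLE = {
    "analyst": {"summarizer-small"},
    "ml-engineer": {"summarizer-small", "claims-classifier"},
}

# Hypothetical rule: datasets tagged as PHI need an explicit clearance.
PHI_DATASETS = {"member-claims"}


@dataclass(frozen=True)
class Request:
    user_role: str
    model: str
    dataset: str
    phi_cleared: bool = False


def is_allowed(req: Request) -> bool:
    """Default-deny decision, evaluated once per request."""
    # Unknown roles get an empty set, so they are denied.
    if req.model not in ALLOWED_MODELS_BY_ROLE.get(req.user_role, set()):
        return False
    # PHI datasets are blocked unless the caller is explicitly cleared.
    if req.dataset in PHI_DATASETS and not req.phi_cleared:
        return False
    return True


print(is_allowed(Request("analyst", "summarizer-small", "public-docs")))
print(is_allowed(Request("analyst", "claims-classifier", "public-docs")))
```

The point of the sketch is the shape: the rules are data and code under version control, and every request gets an allow or block answer before it reaches the model.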

OSCAL (NIST)

Compliance auto
Plain English

Turns "are we compliant?" from a manual scramble into a button. It tracks every security control and the proof for it, in a format auditors accept.

Under the hood: NIST's machine-readable control-mapping + evidence format; automates assessment against ISO 42001 / NIST.

Why we merged it: certifications were our weakest stage. OSCAL fixes it directly.
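
The idea of tracking each control alongside its proof can be sketched as a small data structure. This is a simplified illustration, not the OSCAL schema: the control IDs, field names, evidence file names, and the `coverage_report` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    control_id: str   # hypothetical ID, not a real catalog entry
    description: str
    evidence: list = field(default_factory=list)  # links or file paths


def coverage_report(controls: list) -> dict:
    """Summarize which controls have at least one piece of evidence."""
    missing = [c.control_id for c in controls if not c.evidence]
    return {
        "total": len(controls),
        "covered": len(controls) - len(missing),
        "missing": missing,
        "audit_ready": not missing,
    }


controls = [
    Control("CTRL-01", "Model inventory is maintained", ["inventory-export.json"]),
    Control("CTRL-02", "Access to models is policy-enforced", ["opa-decision-log.json"]),
    Control("CTRL-03", "Bias testing is performed before release"),
]

report = coverage_report(controls)
print(report["missing"])
```

With controls and evidence held in a machine-readable form, "are we compliant?" reduces to a query like `coverage_report` rather than a manual collection exercise.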

Langfuse

LLM audit log
Plain English

A flight recorder for the AI — every prompt, answer, cost, and who did what. The trail you show when someone asks "what did the AI do, and when?"

Under the hood: MIT-licensed; prompt/response/latency/cost tracing with evaluation workflows.

Why we merged it: a genuine miss — we had monitoring, but not a clean audit trail. This is it.
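
A minimal sketch of what one "flight recorder" entry might hold, written in plain Python rather than with the Langfuse SDK. The field names, the `record_call` helper, and the sample values are assumptions for illustration; Langfuse's actual data model may differ.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str
    user_id: str
    model: str
    prompt: str
    response: str
    latency_ms: float
    cost_usd: float


def record_call(log: list, user_id: str, model: str, prompt: str,
                response: str, latency_ms: float, cost_usd: float) -> AuditRecord:
    """Append one frozen record per model call to an in-memory log."""
    rec = AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        user_id=user_id, model=model, prompt=prompt, response=response,
        latency_ms=latency_ms, cost_usd=cost_usd,
    )
    log.append(rec)
    return rec


audit_log = []
record_call(audit_log, "u-123", "summarizer-small",
            "Summarize the policy.", "The policy says ...", 412.0, 0.0031)

# Answering "what did the AI do, and when?" becomes a filter over the log.
by_user = [asdict(r) for r in audit_log if r.user_id == "u-123"]
print(json.dumps(by_user, indent=2))
```

Monitoring answers "is the system healthy?"; a record like this answers "who asked what, what came back, and what did it cost?", which is the question an auditor asks.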

OpenTelemetry

Telemetry std
Plain English

The common plumbing that carries all the monitoring data to one place — so security and compliance dashboards actually get fed.

Under the hood: CNCF standard for traces/metrics/logs across apps + infra; feeds SIEM + compliance reporting.

Why we merged it: the open backbone we implied but never named. v0.9 names it.

Apache Atlas

Data lineage
Plain English

Tracks where data came from, where it went, and who owns it — so you can prove your AI was trained and run on the right data.

Under the hood: Apache-2.0 metadata + lineage + classification repository.

Why we merged it: rounds out data governance for audit, alongside OpenMetadata / DataHub.
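
Lineage is essentially a directed graph from source data to derived assets. Below is a minimal sketch of tracing a model back to everything upstream of it, in plain Python. It does not use the Apache Atlas API, and the asset names and the `upstream` helper are hypothetical.

```python
# Hypothetical lineage edges: each asset maps to the assets it was derived from.
DERIVED_FROM = {
    "claims-classifier-v2": ["training-set-2025q4"],
    "training-set-2025q4": ["claims-raw", "provider-directory"],
    "claims-raw": [],
    "provider-directory": [],
}


def upstream(asset: str, graph: dict) -> set:
    """Return every asset the given asset depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(asset, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


print(sorted(upstream("claims-classifier-v2", DERIVED_FROM)))
# ['claims-raw', 'provider-directory', 'training-set-2025q4']
```

Proving a model "was trained and run on the right data" then amounts to checking that every asset in the upstream set carries an approved classification and owner.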

OWASP WSTG / PTK

Pentest method
Plain English

The step-by-step recipe for security-testing the apps around the AI — so every engagement tests the same way, every time.

Under the hood: OWASP Web Security Testing Guide — the repeatable methodology behind tools like ZAP/Burp.

Why we merged it: we had the testing tools but not the repeatable method. This is the rinse-and-repeat part.

The confidence signal: both teams searched independently and landed on the same 13-tool core — garak, PyRIT, promptfoo, OWASP ZAP, Arize Phoenix, AIF360, Fairlearn, SHAP, LIME, OpenMetadata, DataHub, Datasheets, + OWASP LLM Top 10. When two independent searches converge, those are the non-negotiables.

What we kept from v0.8 (and didn't drop)

Sean's doc was intentionally open-source only. We kept what his scope didn't cover, because a Fortune-10 program needs it:

  • Runtime defense (NeMo Guardrails, LLM Guard, Llama Guard) — his tools test for problems; these prevent them in production. The other half.
  • Model supply-chain security (ModelScan, safetensors, signing) — malicious-model / pickle risk.
  • Secure-SDLC / Copilot (Semgrep, CodeQL, Trivy) — relevant: BCBS runs Copilot.
  • Model registry, AI discovery, modern AI-BOM, commercial system-of-record — needed for enterprise scale + the auditable record.

Corrected in v0.9: Google Model Card Toolkit was on Sean's list as current — it's been archived (read-only) since Sep 2024. We use Hugging Face model cards / CycloneDX instead. (No fault — these move fast; it's exactly why we cross-check.)

What v0.9 locks

The rinse-and-repeat formula

One repeatable process for every engagement. Sean's additions notably strengthen Implementation (OPA), Certifications (OSCAL), and Maintenance (Langfuse, OTel).

01

Discovery

Inventory + lineage, threat-model (ATLAS/OWASP), garak sweep, fairness baseline, ISO 42001 gap.

02

Implementation

SoR + registry, guardrails, OPA, PII, supply-chain, Copilot/SDLC, OTel.

03

Test

garak→PyRIT→promptfoo, ART, ZAP/Burp via OWASP WSTG, fairness.

04

Training

Prompt standards + CoE, model cards + datasheets, OWASP LLM Top 10 curriculum.

05

Certifications

ISO 42001, OSCAL evidence, AI-BOM, Langfuse audit trail.

06

Maintenance

Observability (Langfuse/Phoenix/OTel), CI red-team, drift, OPA + OSCAL refresh.

Open gap: Training (stage 04) is thin in both toolsets — it's process + people, not tooling. A real build-out item before v1.0.
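
The six stages above can be captured as data so that each engagement runs the same checklist. The sketch below copies stage names and activities from the roadmap; the `Engagement` class and its methods are hypothetical, and it assumes stages run strictly in sequence, which the numbering suggests but the roadmap does not state.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage names and activities as listed in the roadmap.
STAGES = [
    ("Discovery", ["Inventory + lineage", "Threat model (ATLAS/OWASP)",
                   "garak sweep", "Fairness baseline", "ISO 42001 gap"]),
    ("Implementation", ["SoR + registry", "Guardrails", "OPA", "PII",
                        "Supply-chain", "Copilot/SDLC", "OTel"]),
    ("Test", ["garak -> PyRIT -> promptfoo", "ART",
              "ZAP/Burp via OWASP WSTG", "Fairness"]),
    ("Training", ["Prompt standards + CoE", "Model cards + datasheets",
                  "OWASP LLM Top 10 curriculum"]),
    ("Certifications", ["ISO 42001", "OSCAL evidence", "AI-BOM",
                        "Langfuse audit trail"]),
    ("Maintenance", ["Observability (Langfuse/Phoenix/OTel)", "CI red-team",
                     "Drift", "OPA + OSCAL refresh"]),
]


@dataclass
class Engagement:
    client: str
    completed: list = field(default_factory=list)

    def next_stage(self) -> Optional[str]:
        """Return the first stage not yet completed, or None when done."""
        for name, _ in STAGES:
            if name not in self.completed:
                return name
        return None

    def complete(self, stage: str) -> None:
        """Mark a stage done; reject anything other than the next stage."""
        expected = self.next_stage()
        if stage != expected:
            raise ValueError(f"{stage!r} is out of order; next is {expected!r}")
        self.completed.append(stage)


eng = Engagement("example-client")
eng.complete("Discovery")
print(eng.next_stage())  # Implementation
```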

Previous

v0.8 — Consolidation baseline · June 1, 2026

The first time everything was pulled together, documented, and committed for the team.

  • govrn.ai offering doc — sharpened, co-branded, redesigned to a left-nav document (live, keywalled).
  • BCBS-of-Illinois pitch — in their brand, trademark-safe (live, keywalled).
  • Tools & Ammunition v0.1 — ~50 web-verified tools, OSS vs commercial.
  • BCBS brand capture, name-clearance (MEDIUM–HIGH), outreach email draft.
  • Packaged into the private GitHub repo for the team.
Next

What we address next → v1.1

v1.0 shipped. The queue now is what makes it client-ready at full strength.

  • Cleared case studies (the headline item) — four real, anonymized MCG engagements are staged as the backing bench: a top-3 US telecom agentic-AI deployment, a HIPAA-compliant healthcare AI delivery, an embedded AI/MLOps team at a $12B-scale energy company, and a global biotech forecasting modernization. Each enters the pitch as it clears for use (healthcare first).
  • Commercial sizing with leadership — deal shape, pricing, and margin targets for the three-phase motion.
  • Team redlines — sharpen the live pitch + offering; SecOps validates the merged kit.
  • Name decision — counsel on "govrn.ai" (MEDIUM–HIGH) before any public/branded use.
  • Build out Training (stage 04) — the thin spot in both toolsets; process + enablement, not just tools.