How to Track Sustainability on an AI Program Using GPM P5™

I’m applying GPM P5™ to a live enterprise AI enablement program — but the data belongs to my employer, not my blog. Here’s the framework I built so you can apply it to your own AI projects.

The Honest Version of “I’m Doing This”

Earlier this year I set a 2026 goal to apply GPM P5™ sustainability metrics to live IT projects — and report the results publicly.

I’m doing it. The project is real, the metrics are running, and five months in, the data is genuinely interesting.

But the numbers belong to my employer. And before I publish specifics about an enterprise program — license counts, usage metrics, governance milestones — I need to clear that with the right people. That’s not a hedge. That’s just being a responsible PM.

So here’s what I can give you right now: the framework. How I thought about applying P5 to an AI enablement program, which metrics I chose and why, and how you’d set this up on your own project.

The data update will follow when I have clearance.

Why AI Enablement Is a Sustainability Problem

Most organizations treat AI rollouts as technology projects. Install the tool, assign licenses, send a training email, done.

That framing misses the real challenge — and creates sustainability problems that show up later as technical debt, security incidents, or adoption that stalls after the initial push.

Here’s the core issue: AI tools like Copilot are only as safe as the data governance underneath them. If employees have access to data they shouldn’t — and in most organizations, permissions have drifted over years of file sharing, org changes, and “just give them access” decisions — then an AI that searches across everything will surface things it shouldn’t.

So before you can safely expand AI access, you have to fix the foundation: data loss prevention, access governance, acceptable use policy, training. That work is not glamorous. It doesn’t show up in a Copilot demo. But skipping it is the definition of unsustainable — you’re building on a foundation that will require expensive remediation later.

That sequencing — governance before capability — is a sustainability principle. And it’s measurable using P5.

Building the P5 Metrics Framework

GPM P5™ organizes sustainability across five dimensions: People, Planet, Prosperity, Process, and Products. For an AI enablement program, here’s how I mapped each dimension to something measurable.

People: Is Anyone Actually Using This?

Adoption is the first sustainability test for any AI rollout. A tool nobody uses isn’t just a failed project; it’s a waste of budget, change management effort, and organizational trust.

Metrics to track:

Active users (not just licenses assigned — actual usage)
License assignment vs. headcount (are you over- or under-provisioning?)
Training completion rate (did users learn to use it before getting access?)
Agent or workflow creation (are users going beyond basic prompting?)

Why training completion matters more than most teams think: Handing someone a powerful AI tool without training is not enablement — it’s a liability. Low training completion is a leading indicator of misuse, poor results, and eventual abandonment. I set a training completion target before expanding access, not after.

How to set your baseline: Pull license and active user counts at program start, before any training or communications push. That’s your true starting point.
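To make the People ratios concrete, here is a minimal sketch of how a baseline snapshot could be turned into the three percentages above. The field names, the choice of licensed users as the denominator, and the numbers are all my assumptions for illustration; they are not taken from any real program or from a Microsoft API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PeopleSnapshot:
    """One reporting-cycle snapshot. All field names are hypothetical."""
    headcount: int           # employees in scope for the program
    licenses_assigned: int   # seats provisioned
    active_users: int        # licensed users with actual usage in the period
    training_completed: int  # licensed users who finished training


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal; 0.0 when the denominator is zero."""
    return round(100 * part / whole, 1) if whole else 0.0


def people_metrics(s: PeopleSnapshot) -> Dict[str, float]:
    """Derive the People ratios from raw counts."""
    return {
        "license_coverage_pct": pct(s.licenses_assigned, s.headcount),
        "active_rate_pct": pct(s.active_users, s.licenses_assigned),
        "training_completion_pct": pct(s.training_completed, s.licenses_assigned),
    }


# Invented numbers for illustration only.
baseline = PeopleSnapshot(headcount=1000, licenses_assigned=200,
                          active_users=50, training_completed=120)
print(people_metrics(baseline))
# {'license_coverage_pct': 20.0, 'active_rate_pct': 25.0, 'training_completion_pct': 60.0}
```

Keeping the raw counts in the snapshot, rather than only the percentages, means you can recompute the ratios later if you decide a different denominator (headcount instead of licenses, for example) tells the story better.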

Planet: Are You Saving Time or Just Moving Work Around?

For a software program, “planet” metrics aren’t about carbon directly. They’re about resource efficiency — specifically, the most wasted resource in most organizations: people’s time.

Metrics to track:

Meeting hours saved per month (available in the Microsoft 365 Copilot Dashboard via Viva Insights, for M365 deployments)
Document creation or editing time reduction (harder to measure, but worth tracking if you have the tooling)
Support ticket volume related to AI tool issues (a proxy for how much friction the rollout is creating)

The key question: Is AI saving time, or is it creating new work — reviewing outputs, correcting errors, managing exceptions? Both are real. Tracking meeting hours saved gives you one side of that equation. Support tickets give you the other.

How to set your baseline: Record the Viva Insights dashboard figures at program start, before any significant adoption has occurred. The baseline will be low, and that’s expected. It becomes meaningful as adoption grows.

Prosperity: Is This Creating Real Value?

Adoption metrics tell you people are using the tool. Prosperity metrics tell you whether using it is generating value.

Metrics to track:

Total AI actions per month (a proxy for volume of work being assisted by AI)
Self-reported time saved per task (survey-based, but useful for qualitative validation)
Process cycle time before and after (if you can isolate a process that AI is meaningfully changing)

The honest challenge here: Prosperity metrics for AI are hard. The value is often diffuse — a little faster here, a little better there, across hundreds of users. You may not see a single dramatic before/after number. What you’ll see is directional movement across many small gains.

Track total actions per month as your leading indicator. It tells you whether the tool is being used to do real work, not just to generate the occasional email.
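One simple way to read "directional movement" from total actions per month is the month-over-month percent change. The sketch below assumes you already have monthly totals exported as a plain list, oldest first; the function name and the sample values are invented for illustration.

```python
from typing import List, Optional


def month_over_month(values: List[float]) -> List[Optional[float]]:
    """Percent change between consecutive months; None when the prior month is zero."""
    changes: List[Optional[float]] = []
    for prev, curr in zip(values, values[1:]):
        changes.append(round(100 * (curr - prev) / prev, 1) if prev else None)
    return changes


# Invented monthly totals, oldest first. Not real program data.
actions_per_month = [400, 500, 600, 540]
print(month_over_month(actions_per_month))  # [25.0, 20.0, -10.0]
```

A single negative month is not a verdict. The point of looking at the sequence is to see whether the direction holds over several reporting cycles.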

Process: Is the Governance Foundation In Place?

This is where most AI rollouts have their biggest gaps — and where P5 tracking adds the most value, because it forces you to measure the unglamorous work.

Metrics to track:

Email DLP coverage (% of email traffic monitored and governed)
File DLP coverage (% of file storage governed)
Acceptable use policy adoption (% of users who have acknowledged the policy)
SharePoint or data source search enablement (what can Copilot actually search, and what’s still blocked?)
Power Platform or automation governance (how many flows/apps are managed vs. unmanaged?)

Why unmanaged matters: “Unmanaged” Power Platform apps or flows are sustainability risks — they’re automation running in your environment with no oversight, no documentation, and no owner. Tracking managed vs. unmanaged gives you a picture of whether governance is keeping pace with adoption.

How to set your baseline: Inventory what’s governed and what isn’t at program start. A baseline of 0% DLP coverage is honest and useful — it tells you exactly how much work is ahead.
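As a rough sketch of that inventory step, the snippet below computes a coverage percentage and a managed-vs-unmanaged summary from a hand-built list of flows. The record shape (`name`, `managed`, `owner`) is a hypothetical structure I made up for the example; a real inventory would come from whatever export your admin tooling provides, and the values here are invented.

```python
from typing import Any, Dict, List


def coverage_pct(governed: int, total: int) -> float:
    """Share of items under governance, as a percentage rounded to one decimal."""
    return round(100 * governed / total, 1) if total else 0.0


def governance_summary(flows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize managed vs. unmanaged flows and flag unmanaged ones with no owner."""
    managed = [f for f in flows if f["managed"]]
    unowned = [f["name"] for f in flows if not f["managed"] and not f.get("owner")]
    return {
        "managed_pct": coverage_pct(len(managed), len(flows)),
        "unmanaged_count": len(flows) - len(managed),
        "unmanaged_without_owner": unowned,
    }


# Invented inventory for illustration only.
flows = [
    {"name": "invoice-approval", "managed": True, "owner": "finance-ops"},
    {"name": "weekly-report", "managed": False, "owner": "j.doe"},
    {"name": "legacy-sync", "managed": False, "owner": None},
    {"name": "onboarding", "managed": True, "owner": "hr-it"},
]

print(coverage_pct(0, 1200))      # 0.0, an honest DLP baseline
print(governance_summary(flows))
# {'managed_pct': 50.0, 'unmanaged_count': 2, 'unmanaged_without_owner': ['legacy-sync']}
```

Listing the unmanaged flows that also lack an owner is a design choice on my part: those are the ones nobody will notice when they break.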

Products: What Have You Actually Delivered?

The product dimension asks the hardest question: what is the program actually delivering, and is it durable?

Metrics to track:

Copilot search scope (blocked vs. enabled for specific data sources)
Power Platform apps deployed to production (canvas + model-driven)
Governance policies published and in effect
AI use cases defined, piloted, and scaled

The blocker metric: I specifically track whether flagship capabilities are blocked or enabled. If Copilot can’t search SharePoint yet because file DLP isn’t complete, that’s a product milestone that hasn’t landed — and it belongs in the metrics table as “pending,” not quietly omitted.

Honest reporting on blockers is a sustainability practice. It prevents the organizational habit of treating “we deployed the tool” as success before the tool is actually usable.

Setting Up Your Own P5 Metrics Table

Here’s the structure I use. Adapt the metrics to your specific program.

P5 Area	Project/Workstream	Metric	Baseline	Target
People	Copilot	Active Users	[#]	[#]
People	Copilot	Training Complete	[%]	90%
Planet	Copilot	Meeting Hrs Saved/mo	[#]	TBD
Prosperity	Copilot	Total Actions/mo	[#]	TBD
Process	DLP	Email Coverage	0%	100%
Process	DLP	File Coverage	0%	100%
Process	Power Platform	Managed Flows	[#]	TBD
Products	Copilot	SharePoint Search	Blocked	Enabled
Products	Power Platform	Apps in Production	[#]	TBD

A few notes on filling this in:

Set TBD targets honestly. For metrics where you don’t know what’s achievable, start with TBD and set a real target after your first reporting cycle. Better to admit you don’t know than to invent a number.
Update every 1-2 months. These are trend metrics, not sprint metrics. Weekly updates create noise. Quarterly updates lose signal. Monthly or every two months is the right cadence for most programs.
Report the blockers. If a milestone is pending, blocked, or not started — put that in the table. The point is visibility, not a highlight reel.
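If you would rather keep the table in code than in a spreadsheet, here is one possible shape for it. This is a sketch under my own assumptions: the `MetricRow` type, the use of `None` for TBD, and the sample current values are all hypothetical, and the numbers are invented.

```python
from dataclasses import dataclass
from typing import Optional, Union

# A number, a state such as "Blocked", or None for TBD / not yet measured.
Value = Union[float, str, None]


@dataclass
class MetricRow:
    area: str
    workstream: str
    metric: str
    baseline: Value
    target: Value
    current: Value = None


def progress_pct(row: MetricRow) -> Optional[float]:
    """Share of the baseline-to-target gap closed so far, or None if not computable."""
    values = (row.baseline, row.target, row.current)
    if not all(isinstance(v, (int, float)) for v in values):
        return None  # TBD target, or a state metric such as Blocked/Enabled
    gap = row.target - row.baseline
    if gap == 0:
        return None
    return round(100 * (row.current - row.baseline) / gap, 1)


def is_pending(row: MetricRow) -> bool:
    """State metrics stay pending until the current state equals the target state."""
    return isinstance(row.target, str) and row.current != row.target


# Invented values for illustration only.
rows = [
    MetricRow("People", "Copilot", "Training Complete", 10, 90, current=50),
    MetricRow("Prosperity", "Copilot", "Total Actions/mo", 500, None, current=800),
    MetricRow("Process", "DLP", "Email Coverage", 0, 100, current=40),
    MetricRow("Products", "Copilot", "SharePoint Search", "Blocked", "Enabled",
              current="Blocked"),
]

for r in rows:
    status = "PENDING" if is_pending(r) else progress_pct(r)
    print(f"{r.area:<11}{r.metric:<20}{status}")
```

Rows with a TBD target print `None` instead of a made-up percentage, and blocked milestones print `PENDING`, so neither can be quietly dropped from the report.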

What I’ve Learned Five Months In

I’ll share the actual numbers when I have clearance. But a few things I can say without the data:

P5 works on messy programs. AI enablement doesn’t have a clean end date. Workstreams keep getting added. The scope keeps evolving. And yet the P5 framework gives me a consistent lens to answer one question every reporting cycle: are we moving in the right direction across all five dimensions? That question is answerable even when the scope is shifting.

Some of the most important metrics are the ones that don’t move. A metric stuck at zero or “blocked” is not a failure of the metrics framework — it’s the framework doing its job. It’s telling you where the work still is.

Training completion is a leading indicator for everything else. If your People metrics are strong — high training completion, strong active user rates — your Planet and Prosperity metrics will follow. If training is low, nothing else will hold.

The Data Update Is Coming

When I have clearance to share the actual numbers from the program I’m running, I’ll publish a follow-up post with the real data — baseline, targets, five-month results, and what surprised me.

Until then: take the framework, apply it to your own AI program, and let me know what metrics you’d add.

This is part of my 2026 public goals series. I’m documenting the journey from planning through results — including the parts I can’t share yet.

Related posts:

GPM P5™ Primer: What It Is and Why I’m Using It — the framework explained
AI + PMO: Manual Work to Eliminate — the efficiency case for AI in project management