Data Privacy in AI Projects—What You Can and Can't Share

Your AI tool is only as trustworthy as what you put into it. Here’s how to think about what stays out.

Accounting isn’t slow to adopt AI because they don’t see the value. HR isn’t dragging their feet because they’re technophobic.

They’re cautious because they know exactly what’s in their files.

Payroll data. Benefits elections. Performance reviews. Compensation bands. PII for thousands of employees. The kind of data where one misstep isn’t a process problem—it’s a legal problem, a trust problem, and potentially a headline problem.

I’ve seen this firsthand during an enterprise Copilot rollout I was part of. While other departments moved quickly, accounting and HR moved deliberately. The data governance team stepped in with specific guidance: here’s what you can put into Copilot, and here’s what you cannot. That guidance made the difference between paralysis and progress.

If your organization is in the middle of an AI rollout—or you’re a PM working with teams that handle sensitive data—this framework is what you need.

Why Enterprise AI Is Different (But Not a Free Pass)

Before we get into what data to protect, one clarification that matters: there’s a meaningful difference between consumer AI tools and enterprise-licensed tools like Microsoft Copilot for Microsoft 365.

Consumer tools (ChatGPT free, Claude.ai, Gemini) may use your inputs to improve their models, depending on the product and your account settings. Data you enter may be retained, reviewed, and used for training. This is why “don’t paste confidential data into ChatGPT” became the first AI policy most organizations wrote.

Enterprise tools (Copilot for M365, Claude for Enterprise, ChatGPT Enterprise) operate under business agreements with different data handling terms. Microsoft Copilot for M365, for example, does not use your organizational data to train its foundation models. According to Microsoft, your prompts and responses stay within the Microsoft 365 service boundary.

This distinction matters. But enterprise licensing doesn’t mean anything goes. The data still exists somewhere in your environment. Your prompts become part of audit logs. Outputs can be copied, shared, or screenshotted. The governance principles still apply; the risk profile is just different.

The Three Categories to Protect

1. Employee Data

This is where accounting and HR are right to pump the brakes.

Employee data includes anything that identifies, describes, or affects a person in your organization:

Names, addresses, SSNs, dates of birth
Compensation, salary bands, bonus structures
Performance reviews and disciplinary records
Benefits elections and medical accommodations
Termination details and separation agreements

The risk isn’t just that AI might expose this data externally. It’s that generating outputs based on employee data creates a record. A prompt like “summarize the performance issues for these five employees and suggest a PIP structure” puts sensitive HR information into your AI interaction history. Even in an enterprise tool, that’s a governance question your HR and legal teams need to answer—not a decision individual PMs should make on their own.

Practical guidance: If you wouldn’t send it in an unencrypted email, don’t put it in an AI prompt.

2. Proprietary and Confidential Business Data

This category is broader than most people initially think:

Unreleased product roadmaps and pricing strategies
M&A activity, partnership negotiations, contract terms
Trade secrets, formulas, proprietary processes
Financial projections and internal cost structures
Customer data, contracts, and purchase histories
Competitive intelligence and market analyses

PMs are particularly exposed here because we work across functions. We have access to information from engineering, finance, legal, and sales simultaneously. That breadth is what makes us effective—and it’s also what makes data governance for PMs more complex than for most roles.

The question to ask before prompting: If this information appeared in a competitor’s strategy document next quarter, how bad would that be? If the answer is “very bad,” it stays out of the prompt.

3. Personally Identifiable Information (PII)

PII is the category most people recognize, but it’s easy to underestimate how much of it flows through a normal project:

Customer names, emails, phone numbers
Account numbers, transaction histories
Health information (if your organization handles it)
Demographic data collected for any purpose
Location data, IP addresses, device identifiers

For PMs managing customer-facing projects, PII shows up everywhere: in requirements documents, UAT scenarios, defect reports, and meeting notes. The temptation is to use real data because it’s faster. Resist it. Anonymize or synthesize data for AI-assisted work. Your obligations under GDPR, CCPA, HIPAA, or any other applicable privacy law don’t pause because you’re using an AI tool to help.

What This Means for Your Projects

Data privacy in AI isn’t just an IT or legal problem. It’s a PM problem, because PMs are the ones who set team norms, run meetings, and decide how work gets done day to day.

A few practical things you can do right now:

Know what tool your team is using and what tier it is. Consumer, enterprise, or something else? The governance rules are different, and your team needs to know which applies.

Ask your data governance or legal team for guidance. Like Kwik Trip’s accounting and HR teams, you may be waiting on clarity that already exists. Ask specifically: “What can we put into [tool name] and what can’t we?”

Set norms for your project. In your kickoff or team norms conversation, include a line about AI and data. “When using AI tools on this project, we do not include [specific data types]. Use synthetic or anonymized data for testing and documentation work.”

Build the habit of anonymizing before prompting. Replace real names with “Employee A,” real dollar amounts with representative figures, real customer data with invented examples. You get nearly the same AI assistance with far less risk, as long as the remaining details can’t be used to work out who or what you’re describing.
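
If someone on your team is comfortable with a little scripting, the anonymizing habit can be partly automated. Here is a minimal Python sketch of the idea, not a vetted tool. The function names (anonymize, placeholder_for), the three patterns, and the sample sentence are all my own illustrative assumptions. The patterns only catch US-style SSNs, simple email addresses, and dollar amounts, and amounts are masked rather than swapped for representative figures. A script like this will miss things, so treat it as a first pass before a human review, not a replacement for one.

```python
import re
import string

# Illustrative patterns only (assumptions, not a complete PII detector).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
MONEY_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")


def placeholder_for(index):
    """Return 'Employee A' through 'Employee Z', then numbered labels."""
    letters = string.ascii_uppercase
    if index < len(letters):
        return f"Employee {letters[index]}"
    return f"Employee {index + 1}"


def anonymize(text, known_names):
    """Swap known names for placeholders and mask common identifiers.

    Returns the cleaned text plus the name-to-placeholder mapping, which
    should stay on your side and never go into the prompt.
    """
    mapping = {name: placeholder_for(i) for i, name in enumerate(known_names)}
    # Replace longer names first so "Ann Lee" is not clobbered by "Ann".
    for name in sorted(mapping, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", mapping[name], text)
    text = SSN_RE.sub("[SSN]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = MONEY_RE.sub("[AMOUNT]", text)
    return text, mapping


if __name__ == "__main__":
    sample = "Maria Lopez earns $84,500. John Kim reports to Maria Lopez."
    cleaned, key = anonymize(sample, ["Maria Lopez", "John Kim"])
    print(cleaned)  # Employee A earns [AMOUNT]. Employee B reports to Employee A.
```

The mapping is returned separately on purpose: you can translate the AI’s answer back to real names afterward without the real names ever appearing in the prompt.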

When I was managing GDPR compliance at Microsoft across Windows and Internet Explorer, one of the harder parts wasn’t identifying the big, obvious data risks. It was the data that teams hadn’t thought of as personal data—log files, diagnostic telemetry, feature usage patterns—that turned out to be in scope.

AI governance is following the same pattern. The obvious risks (don’t paste employee SSNs into ChatGPT) are easy to spot. The harder work is identifying the data that flows through your projects that you haven’t classified yet. Where does PII appear in your requirements documents? What proprietary information is embedded in the project charter your team is using as AI context?
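
One way to start answering those questions is a rough inventory of your project documents. The Python sketch below counts PII-like patterns per document so you know where to look first. Everything in it is an assumption for illustration: the inventory function, the pattern labels, and the sample documents are hypothetical, and the patterns are simple heuristics that will produce both false positives and misses. It cannot recognize proprietary business information at all; that still takes a person who knows the project.

```python
import re
from collections import Counter

# Heuristic patterns (assumptions); extend or replace them for your own data.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def inventory(documents):
    """Map each document name to a Counter of pattern label -> hit count."""
    report = {}
    for name, text in documents.items():
        counts = Counter()
        for label, pattern in PATTERNS.items():
            hits = len(pattern.findall(text))
            if hits:
                counts[label] = hits
        report[name] = counts
    return report


if __name__ == "__main__":
    docs = {
        "requirements.txt": "Contact jane@example.com or 555-010-1234. "
                            "Test login from 10.0.0.12.",
        "charter.txt": "No identifiers here.",
    }
    for doc_name, counts in inventory(docs).items():
        print(doc_name, dict(counts))
```

A document with zero hits is not automatically safe. The point is to find the files that clearly need attention before they become AI context.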

The teams that get ahead of AI governance aren’t the ones who are most restrictive. They’re the ones who’ve actually mapped their data, understood their tools, and given their people practical guidance instead of a policy document nobody reads.

Accounting and HR will get to full AI adoption. They’re just doing it right.

The teams that pause to get data governance right aren’t behind. They’re building the foundation that lets them go faster later—safely.