Module 05 — Governance

Safe Deployment

Getting AI deployment wrong in a defence and infrastructure context has real commercial and reputational consequences. This final module gives you a complete, practical governance framework — how to choose your first pilot, how to stage the rollout, what data controls to put in place, how to manage your team through the change, and how to keep improving once agents are running.

The three deployment mistakes that sink AI pilots

Most AI pilots that fail don't fail because the technology doesn't work — they fail because of how they were deployed. The same three mistakes appear repeatedly across industries. Normoyle can avoid all of them.

Mistake 01 — Most common
Too much autonomy too fast

Giving agents permission to send emails, update registers, or issue documents before their accuracy has been verified on real work. The agent makes one bad call — an incorrect RFI, a wrong cost — and trust collapses across the whole team. Fix: start with read and draft only. Earn permissions progressively.

Mistake 02 — Most damaging
No named owner

Deploying an agent without designating a specific person who owns its output. When something goes wrong, everyone assumes someone else reviewed it. Fix: every agent output has a named human owner who reviews and signs. No exceptions.

Mistake 03 — Most overlooked
Data sent to the wrong tool

Using a free consumer AI tool for confidential project data — client names, pricing, defence specifications — without understanding where that data goes. Fix: match the tool to the data sensitivity. Consumer tools for non-confidential work only. API with data agreement for everything else.

Choosing the right first pilot

The first agent you deploy at Normoyle will shape the team's attitude toward all future AI tools. Get it right and you build confidence. Get it wrong — by picking something too complex, too sensitive, or too hard to verify — and you set the programme back months.

Use this scorecard to evaluate any proposed pilot. The higher the score, the better the candidate:

Criterion: Frequency (how often does this task occur?)
  Score 1 (poor fit):      Once or twice a year
  Score 3 (good fit):      Weekly or more
  Estimating pilot score:  3 (multiple RFQs per week)

Criterion: Verifiability (can you check the output against a known answer?)
  Score 1 (poor fit):      Subjective; hard to say if right or wrong
  Score 3 (good fit):      Objective; past results to compare against
  Estimating pilot score:  3 (dozens of past quotes to test against)

Criterion: Inputs (how structured are the inputs?)
  Score 1 (poor fit):      Ambiguous, conversational, highly variable
  Score 3 (good fit):      Structured documents with consistent format
  Estimating pilot score:  3 (PDFs and DXFs with defined content)

Criterion: Consequence of error (what happens if the agent gets it wrong?)
  Score 1 (poor fit):      Immediate external consequence: safety, legal, client impact
  Score 3 (good fit):      Internal draft, caught in review before any external impact
  Estimating pilot score:  3 (estimator reviews before quote leaves business)

Criterion: Data sensitivity (what data does the agent touch?)
  Score 1 (poor fit):      Classified, legally privileged, or highly confidential
  Score 3 (good fit):      Internal commercial data with standard controls
  Estimating pilot score:  2 (cost rates are sensitive but manageable)

Criterion: Team readiness (is the relevant team willing to try?)
  Score 1 (poor fit):      Strong resistance; people feel threatened
  Score 3 (good fit):      Curious and open; at least one champion
  Estimating pilot score:  2 (mixed initially, but one estimator keen to try)

The estimating pilot scores 16 out of 18 — an excellent first pilot. Compliance documentation scores 14–15. Start with estimating; move to compliance once the first agent is proven.
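
The scorecard total is simple addition, and a short script makes it easy to score other candidate pilots the same way. The sketch below is illustrative only: the dictionary keys and the `pilot_score` helper are names invented for this example, and the scores are copied from the estimating pilot column of the scorecard.

```python
# Scores for the estimating pilot, copied from the scorecard
# (1 = poor fit, 3 = good fit). Key names are illustrative.
ESTIMATING_PILOT = {
    "frequency": 3,
    "verifiability": 3,
    "inputs": 3,
    "consequence_of_error": 3,
    "data_sensitivity": 2,
    "team_readiness": 2,
}

MAX_PER_CRITERION = 3


def pilot_score(scores: dict) -> tuple:
    """Return (total, maximum possible) for one scored pilot candidate."""
    for name, value in scores.items():
        if not 1 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name}: score must be 1 to 3, got {value}")
    return sum(scores.values()), MAX_PER_CRITERION * len(scores)


total, maximum = pilot_score(ESTIMATING_PILOT)
print(f"Estimating pilot: {total} out of {maximum}")  # Estimating pilot: 16 out of 18
```

To compare a second candidate, fill in another dictionary with the same six keys and call `pilot_score` on it.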

The four-stage pilot framework

Every new agent at Normoyle follows four stages before full deployment. This isn't bureaucracy — it's how you build the evidence base that justifies expanding the agent's permissions and the team's trust in its output.

  1. Stage 1: Read only (weeks 1–2)
     The agent reads documents and reports what it finds. No drafting, no writing to registers, no output that anyone acts on. The purpose is purely diagnostic: does the agent correctly understand your drawings, your cost database format, your register structure? Run it against 5–10 past documents and compare its extraction to the known correct answers. Success metric: >90% accuracy on information extraction.
  2. Stage 2: Draft with 100% review (weeks 3–8)
     The agent produces drafts. A designated reviewer checks every single output before it's used for anything, even internal purposes. The reviewer logs one of four outcomes: accepted as-is, accepted with minor edits, accepted with major edits, or rejected. This log is your evidence base. Success metric: >90% of outputs accepted with minor or no edits over at least 4 weeks.
  3. Stage 3: Draft with spot-check review (weeks 9–16)
     Only move to Stage 3 after the Stage 2 success metrics are met. The full review rate drops to 20–30% of outputs, selected randomly. The remainder gets a quick human scan, not a full check, before use. Continue logging every error found. Success metric: error rate below 5% on spot-checked outputs.
  4. Stage 4: Full deployment with audit trail
     The agent runs as a standard business tool. Every output is logged with a timestamp, the prompt version used, and the reviewing person. Monthly audit: the agent owner reviews a random sample of 10 outputs for quality and consistency. Roll back immediately if the error rate rises above the agreed threshold. Success metric: sustained <5% error rate; no undetected errors reaching clients.
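
The Stage 2 exit test can be computed directly from the reviewer's log. This is a minimal sketch, assuming the log is kept as a list of outcome labels; the label strings and the function names `acceptance_rate` and `ready_for_stage_3` are invented here, while the 90% threshold and the 4-week minimum come from the Stage 2 success metric.

```python
from collections import Counter

# The four outcomes a Stage 2 reviewer can log (label strings are illustrative).
OUTCOMES = ("as_is", "minor_edits", "major_edits", "rejected")
PASSING = ("as_is", "minor_edits")


def acceptance_rate(review_log: list) -> float:
    """Fraction of drafts accepted as-is or with minor edits."""
    if not review_log:
        raise ValueError("review log is empty")
    unknown = set(review_log) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcomes in log: {sorted(unknown)}")
    counts = Counter(review_log)
    return sum(counts[o] for o in PASSING) / len(review_log)


def ready_for_stage_3(review_log: list, weeks_logged: int,
                      threshold: float = 0.90, min_weeks: int = 4) -> bool:
    """True only if the rate exceeds the threshold over enough weeks."""
    return weeks_logged >= min_weeks and acceptance_rate(review_log) > threshold


# Hypothetical log: 40 drafts reviewed over 5 weeks.
log = ["as_is"] * 20 + ["minor_edits"] * 17 + ["major_edits"] * 2 + ["rejected"]
print(f"{acceptance_rate(log):.1%}")              # 92.5%
print(ready_for_stage_3(log, weeks_logged=5))     # True
print(ready_for_stage_3(log, weeks_logged=3))     # False
```

The same log with only three weeks of data fails the check, which mirrors the rule that a good week or two is not enough evidence to reduce the review rate.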

The rollback rule

Before any agent goes into Stage 3 or 4, define in writing: what error rate triggers a rollback to the previous stage? Who makes that call? What does rollback look like in practice? A rollback is not a failure — it's the system working. The failure is discovering errors after they've left the business.
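
Writing the rollback rule down can be as simple as recording three fields and one comparison. The sketch below is an assumption-laden illustration: the `RollbackRule` class, its field names, and the 5% trigger are placeholders chosen for the example, since each team is expected to set its own threshold and decision-maker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackRule:
    max_error_rate: float  # written trigger; 0.05 below is an assumed value
    decided_by: str        # who makes the call
    rollback_to: str       # what rollback looks like in practice


def should_roll_back(rule: RollbackRule, errors_found: int, outputs_checked: int) -> bool:
    """True when the observed error rate exceeds the written trigger."""
    if outputs_checked <= 0:
        raise ValueError("no outputs were checked")
    if not 0 <= errors_found <= outputs_checked:
        raise ValueError("errors_found must be between 0 and outputs_checked")
    return errors_found / outputs_checked > rule.max_error_rate


# Hypothetical rule and spot-check results.
rule = RollbackRule(0.05, "agent owner", "Stage 2: 100% review of every draft")
print(should_roll_back(rule, errors_found=2, outputs_checked=25))  # True  (8%)
print(should_roll_back(rule, errors_found=1, outputs_checked=25))  # False (4%)
```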

Data security rules for Normoyle

Not all AI tools handle data the same way. The wrong tool for the wrong data is one of the most common and most serious mistakes in AI deployment. This table defines Normoyle's rules — it should be shared with every team member who uses any AI tool for work.

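Tool categories:
  Consumer tools:      ChatGPT free, Claude.ai free
  Business API tools:  Claude Team/Business, API
  On-premises only

Data type: General engineering knowledge (standards explanations, general drafting help)
  Consumer tools:      PERMITTED
  Business API tools:  PERMITTED
  On-premises only:    Not required

Data type: Anonymised drawings (client name and project removed)
  Consumer tools:      PERMITTED
  Business API tools:  PERMITTED
  On-premises only:    Not required

Data type: Internal cost rates and margins
  Consumer tools:      NOT PERMITTED
  Business API tools:  PERMITTED (with data processing agreement)
  On-premises only:    Optional additional protection

Data type: Client names and live project details
  Consumer tools:      NOT PERMITTED
  Business API tools:  PERMITTED (with data processing agreement)
  On-premises only:    Not required for standard projects

Data type: Supplier pricing from NDAs
  Consumer tools:      NOT PERMITTED
  Business API tools:  PERMITTED (check NDA terms first)
  On-premises only:    Preferred for high-sensitivity pricing

Data type: Mill certificates and traceability data
  Consumer tools:      NOT PERMITTED
  Business API tools:  PERMITTED (internal tools only)
  On-premises only:    Required for some defence programmes

Data type: Defence project specifications
  Consumer tools:      NOT PERMITTED
  Business API tools:  CHECK CLASSIFICATION FIRST
  On-premises only:    REQUIRED if classified

Data type: Personnel and HR data
  Consumer tools:      NOT PERMITTED
  Business API tools:  NOT PERMITTED
  On-premises only:    NOT IN SCOPE, ever

What "data processing agreement" means in practice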

Anthropic's Claude Team and Business plans include a data processing agreement (DPA): a contractual commitment that your data won't be used to train their models and will be handled according to defined security standards. This is what makes them appropriate for confidential commercial data. The free consumer tiers of Claude.ai and ChatGPT do not include a DPA. If you're unsure which tier Normoyle is on, check with management before uploading any confidential data.
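
The data rules can also be kept as a small lookup so that anyone can check a data type against a tool category before uploading. This is a sketch only: the key names, the `DATA_RULES` dictionary, and the `rule_for` function are invented for illustration, and the rule text is copied from the data security rules in this module. Raising an error for anything not listed is a design choice made here, on the assumption that an unlisted data type should prompt a question rather than a guess.

```python
CONSUMER, BUSINESS_API, ON_PREM = "consumer", "business_api", "on_premises"

# Rule text copied from the module's data security rules; keys are illustrative.
DATA_RULES = {
    "general_engineering_knowledge": ("PERMITTED", "PERMITTED", "Not required"),
    "anonymised_drawings": ("PERMITTED", "PERMITTED", "Not required"),
    "internal_cost_rates_and_margins": (
        "NOT PERMITTED", "PERMITTED (with data processing agreement)",
        "Optional additional protection"),
    "client_names_and_live_project_details": (
        "NOT PERMITTED", "PERMITTED (with data processing agreement)",
        "Not required for standard projects"),
    "supplier_pricing_from_ndas": (
        "NOT PERMITTED", "PERMITTED (check NDA terms first)",
        "Preferred for high-sensitivity pricing"),
    "mill_certificates_and_traceability": (
        "NOT PERMITTED", "PERMITTED (internal tools only)",
        "Required for some defence programmes"),
    "defence_project_specifications": (
        "NOT PERMITTED", "CHECK CLASSIFICATION FIRST", "REQUIRED if classified"),
    "personnel_and_hr_data": ("NOT PERMITTED", "NOT PERMITTED", "NOT IN SCOPE, ever"),
}

_COLUMN = {CONSUMER: 0, BUSINESS_API: 1, ON_PREM: 2}


def rule_for(data_type: str, tool: str) -> str:
    """Return the rule text for one data type and one tool category."""
    if data_type not in DATA_RULES or tool not in _COLUMN:
        raise ValueError("not covered by the data rules; ask before uploading")
    return DATA_RULES[data_type][_COLUMN[tool]]


print(rule_for("client_names_and_live_project_details", CONSUMER))  # NOT PERMITTED
print(rule_for("defence_project_specifications", BUSINESS_API))     # CHECK CLASSIFICATION FIRST
```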

Human-in-the-loop: the review gates

Every agent workflow at Normoyle must have defined review gates — specific points where a human checks the output before it proceeds. These gates are not optional and don't get removed as agents improve. They are the mechanism by which Normoyle maintains professional accountability for every document it produces.

Normoyle agent review gate framework
GATE 1 — Internal quality check
  Trigger:  Agent produces any output (quote, NCR, RFI, report)
  Who:      Designated reviewer for that agent type
            Estimating: lead estimator
            Compliance: project engineer
            Delivery docs: PM or senior project engineer
  Check:    Does the output make sense?
            Are flagged items genuine issues?
            Are numbers, references, and dates correct?
  Time:     5–20 min depending on output complexity
  Action:   Accept / edit / reject and regenerate

GATE 2 — Before any external communication
  Trigger:  Any document or email leaving Normoyle
  Who:      Project engineer or PM (never delegated to junior)
  Check:    Full content review — professional judgement applied
            Tone appropriate for recipient and relationship?
            No admissions, commitments, or liability statements?
            Drawing references correct revision?
            Programme impact statements accurate?
  Time:     Same as you would spend writing it from scratch
            "The agent did it" is not a reason to review faster
  Action:   Approve and send / edit and send / do not send

GATE 3 — Before expanding agent permissions
  Trigger:  Request to give agent new capability
            (e.g. send emails, update live register, new data type)
  Who:      Business owner + whoever manages IT/security
  Check:    Stage 2 accuracy metrics met?
            Is the new permission genuinely necessary?
            What could go wrong? What is the rollback?
            Has the system prompt been updated and retested?
  Time:     Formal meeting — minimum 30 min
            Document the decision and the conditions

GATE 4 — Monthly performance audit
  Trigger:  Monthly (calendar reminder — don't skip)
  Who:      Named agent owner for each deployed agent
  Check:    Random sample of 10 outputs from the past month
            Any errors that weren't caught in review?
            Any pattern of near-misses or recurring issues?
            Has the system prompt drifted from the approved version?
            Any new risk factors (new project types, new clients)?
  Time:     1–2 hours per agent per month
  Output:   One-page audit note filed in the quality system
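
For the Gate 4 audit, the sample should be drawn by a random process rather than picked by hand, so that the owner does not unconsciously choose the easy cases. A minimal sketch, assuming each output has an identifier in a list; the `audit_sample` function and the `Q-001` style identifiers are hypothetical.

```python
import random
from typing import Optional, Sequence


def audit_sample(output_ids: Sequence[str], sample_size: int = 10,
                 seed: Optional[int] = None) -> list:
    """Pick a random sample of outputs for the monthly audit.

    If the month produced no more than sample_size outputs, audit all of them.
    A seed can be recorded in the audit note so the draw is reproducible.
    """
    ids = list(output_ids)
    if len(ids) <= sample_size:
        return ids
    return random.Random(seed).sample(ids, sample_size)


# Hypothetical month: 42 quotes produced by the estimating agent.
month_outputs = [f"Q-{n:03d}" for n in range(1, 43)]
picked = audit_sample(month_outputs, seed=2024)
print(len(picked))  # 10
```

The same function could serve the Stage 3 spot-check by passing a `sample_size` equal to 20–30% of the period's outputs.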

Managing the team through the change

Technology adoption most often fails not because the technology doesn't work, but because the people using it don't trust it, don't understand it, or feel threatened by it. At Normoyle, the change management approach is as important as the technical deployment.

The two fears — and how to answer them honestly

When you introduce AI agents at Normoyle, two concerns will come up. Don't dismiss them — they're legitimate. Address them directly:

Fear 01 — Job security
"Will this replace me?"

The honest answer: For the tasks Normoyle is automating — data extraction, template filling, document drafting — the agent handles the mechanical work. The skilled work — the judgement calls, the client relationships, the engineering decisions — stays with people. What changes is that skilled people spend more time on skilled work. That's the point.

Fear 02 — Accuracy
"What if it gets something wrong?"

The honest answer: It will get things wrong, especially early. That's exactly why every output goes through a review gate before it's used. The agent's errors get caught. The review process is the safety net — and it's non-negotiable regardless of how good the agent gets.

What to say to each part of the team

For estimators

The estimating agent handles the part of quoting that takes hours but doesn't need your expertise — reading drawings, building the BOM, looking up rates. Your time moves to reviewing the flagged items, applying your experience to the hard calls, and focusing on the bids most worth winning. The agent makes you more productive at the work only you can do. Your sign-off on every quote means your professional judgement is still the last word.

For project engineers

You remain the professional owner of every document that leaves this business. The agent drafts — you decide, edit, and sign. Your NER registration, your professional responsibility, and your relationship with the client are unchanged. What changes is that you're not spending three hours writing an RFI from a blank template. You spend 10 minutes reviewing a well-structured draft and applying your judgement to the parts that need it.

For the site supervisor and shop floor

The agents operate in the office — on drawings, documents, and procurement data. The fabrication work, the welding, the installation — that's all unchanged. What you might notice is that the office team has more time to resolve issues quickly, chase deliveries more proactively, and get you the information you need on site faster.

Normoyle's recommended rollout plan

This is a concrete 12-month plan. Adjust timing based on actual Stage 2 results — don't advance before the success metrics are met, but also don't stay in earlier stages longer than necessary once they are.

Period: Months 1–2
  Agent:                     Estimating agent
  Stage:                     Stage 1–2: Read only, then 100% draft review
  Success metric to advance: >90% BOM accuracy on test quotes; >90% of drafts accepted with minor edits

Period: Months 3–4
  Agent:                     Estimating agent
  Stage:                     Stage 3: Spot-check review on standard job types
  Success metric to advance: <5% error rate on spot-checked outputs; no errors reaching clients

Period: Month 3
  Agent:                     Compliance agent
  Stage:                     Stage 1–2: Read only on a current project (non-classified)
  Success metric to advance: Findings match manual review; no missed non-conformances in testing

Period: Months 5–6
  Agent:                     Compliance agent
  Stage:                     Stage 2–3: Draft NCRs and checklists with full review
  Success metric to advance: >90% of drafted NCRs accepted with minor edits

Period: Month 5
  Agent:                     RFI drafting agent
  Stage:                     Stage 1–2: Start on one active project with one PM champion
  Success metric to advance: PM reports time saving; no RFIs sent with errors

Period: Months 7–9
  Agent:                     Procurement agent
  Stage:                     Stage 1–2: Read PO register, daily status report, draft emails
  Success metric to advance: PM uses report daily; draft emails require minimal editing

Period: Months 10–12
  Agent:                     All agents
  Stage:                     Stage 3–4: Full deployment on standard work types
  Success metric to advance: Monthly audit shows sustained accuracy; team uses agents without friction

Keeping agents consistent: prompt version control

As agents are used in production, their system prompts will be edited — rules added, constraints clarified, output formats refined. Without version control, a single bad edit can silently degrade an agent that was working well, and you won't know until errors appear in output.

  1. Keep a dated prompt log for every agent
     A simple text file or shared document: date, version number, what changed, and why. Before editing any system prompt, copy the current version to the log. This takes 2 minutes and means you can always roll back.
  2. Test before deploying any prompt change
     Run the updated prompt against at least 3 past test cases before using it on live work. A prompt change that fixes one problem sometimes breaks another. Never deploy an untested prompt on a live project document.
  3. One person owns each agent's prompt
     The "agent owner" is the only person who edits the system prompt. Others can request changes, but the owner reviews, tests, and deploys. Without clear ownership, prompts get edited by multiple people and drift into inconsistency.
  4. Document the approved prompt version in your QMS
     For compliance and defence work especially: the version of the system prompt used to produce a document should be recorded alongside the document itself. This is your audit trail if a client or auditor questions how a document was produced.
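
One lightweight way to make the prompt log checkable is to store a short fingerprint of each approved prompt, then compare it with the prompt actually in use. The sketch below assumes prompts are plain text; the `PromptVersion` class, its field names, and `has_drifted` are hypothetical, and the prompt text and version numbers are made up for the example. The SHA-256 hash comes from Python's standard `hashlib` module.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def fingerprint(prompt_text: str) -> str:
    """Short, stable identifier for an exact prompt text."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class PromptVersion:
    version: str        # e.g. "1.3"
    changed_on: date
    what_changed: str
    why: str
    text: str           # full copy of the prompt, kept for rollback


def has_drifted(approved: PromptVersion, live_prompt_text: str) -> bool:
    """True if the live prompt no longer matches the approved version."""
    return fingerprint(live_prompt_text) != fingerprint(approved.text)


# Hypothetical log entry and live prompts.
approved = PromptVersion(
    version="1.3",
    changed_on=date(2025, 1, 15),
    what_changed="Clarified powder coat pricing rule",
    why="Agent applied the wrong rate band on two test quotes",
    text="You are an estimating assistant. Use the current rate table only.",
)
print(has_drifted(approved, approved.text))                    # False
print(has_drifted(approved, approved.text + " Ignore minimums."))  # True
```

Recording the version string and fingerprint alongside each generated document gives the audit trail described in step 4, and the drift check answers the Gate 4 question about whether the live prompt still matches the approved one.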

Measuring what matters

AI deployment at Normoyle should show measurable results. Track these metrics from week one so you have evidence of impact — both for internal confidence and to justify continued investment.

Metric 01
Time per quote

Track estimator time on each quote before and after the agent. Target: 70% reduction in data-extraction and drafting time. Measure from drawing receipt to draft-ready-for-review.

Metric 02
Draft acceptance rate

Percentage of agent drafts accepted with minor or no edits. Target: >90% in Stage 2 before advancing. Track by agent type — estimating, compliance, RFI — separately.

Metric 03
Error escape rate

Errors that got through review and reached a client or external document. Target: zero. Any escape is a serious event requiring immediate prompt review and process debrief.

Metric 04
Overdue procurement items

Number of POs past due date at any given time. The procurement agent should drive this down by catching at-risk items earlier. Track weekly — the trend matters as much as the number.

Metric 05
NCR close-out time

Time from NCR raised to NCR closed. The compliance agent speeds up the raising and documentation — but close-out still requires engineering resolution. Track to ensure agents aren't creating a backlog.

Metric 06
Team adoption rate

How many team members are actively using each agent tool after 3 months? Voluntary adoption is the real measure of success. If people aren't using it without being told to, something is wrong — with the tool, the training, or the change management.
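
Two of these metrics reduce to simple percentages, and agreeing on the arithmetic early avoids arguments later about whether a target was met. A minimal sketch: the function names and all of the figures below are hypothetical and exist only to show the calculation.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Percentage drop in time per quote (Metric 01)."""
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return 100.0 * (before_minutes - after_minutes) / before_minutes


def adoption_rate(active_users: int, team_size: int) -> float:
    """Percentage of the team actively using an agent (Metric 06)."""
    if team_size <= 0:
        raise ValueError("team_size must be positive")
    return 100.0 * active_users / team_size


# Hypothetical figures: a quote that took 180 minutes of extraction and
# drafting now takes 54; 4 of 5 estimators use the agent after 3 months.
print(percent_reduction(180, 54))  # 70.0
print(adoption_rate(4, 5))         # 80.0
```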

Hands-on exercise: your deployment plan

Exercise — allow 45 minutes (group activity)

This is the capstone exercise for the course. Each person or small group produces a one-page deployment plan for the estimating agent pilot. Bring it to the final session for group review.

  1. Name your agent owner
     Who at Normoyle will own the estimating agent? They write and maintain the system prompt, run the monthly audit, and are the escalation point if something goes wrong. Write down the name.
  2. Define your Stage 1 test set
     List 5 past quotes you'll use to test the agent before it touches live work. Choose a range: two simple jobs, two medium complexity, one complex. Write down the job reference and what you'll measure.
  3. Set your Stage 2 success threshold
     What acceptance rate do you need to see before moving to spot-check review? (Recommended: 90%.) What does "accepted with minor edits" mean in your context? Write it down so there's no ambiguity.
  4. Define your rollback trigger
     At what error rate will you roll back from Stage 3 to Stage 2? Who makes that call? How quickly? Write it down.
  5. Identify your data controls
     Which Claude plan will Normoyle use for the estimating agent? Is a data processing agreement in place? Which data types will the agent touch, and are they all permitted under the rules in this module?
  6. Write three sentences for your team
     Using the change management guidance above, write what you'll say to: (a) the estimator who'll use the agent, (b) the PM who signs off quotes, and (c) any team member who asks "will this replace my job?"

Knowledge check

  1. At what point should you move the estimating agent from Stage 2 (100% review) to Stage 3 (spot-check review)?
  2. A Normoyle project engineer wants to use the free version of ChatGPT to help draft an RFI that references the client's name, the project contract number, and specific drawing details. Is this permitted under Normoyle's data rules?
  3. The estimating agent's system prompt was edited yesterday to fix a problem with powder coat pricing. This morning, three quotes have come out with labour rates that look wrong; the agent seems to be using old rates. What most likely happened and what should you do?
  4. Six months into deployment, a Normoyle estimator tells you they've stopped using the estimating agent because "it takes longer to review than to just do it myself." What does this most likely indicate?
  5. A team member asks: "Will the estimating agent eventually replace estimators at Normoyle?" What is the most accurate and honest answer?