How a Construction Ops Lead Cut 5 Hours of Document QA a Week
A real workflow for automating subcontractor paperwork review with a rules-based QA bot. Catch missing fields, expired certs, and bad totals before they hit accounting.
Most paperwork does not get rejected because it is wrong in some obscure way.
It gets rejected because someone forgot to sign page 3, or the insurance certificate expired last week, or the totals on the invoice do not match the line items.
Boring stuff. Easy to spot. Easy to miss.
That is why document QA is one of the most automatable jobs in a small business. It is also one of the most ignored, because it does not feel like a “real” automation problem. It feels like a task you should just handle.
Until you add up the hours.
This post walks through how one operations lead at a mid-size general contractor turned a slow, repetitive paperwork review job into a rules-based bot. The setup saves her about five hours every week and ends with a clean HTML report that tells her, at a glance, which subcontractor packets are good to go and which ones need to bounce back.
You can copy the same pattern for any document QA job. Invoices. Lease packets. Loan files. Closing disclosures. Vendor onboarding. The shape is the same.
The job that was eating her week
Renee runs ops at a regional general contractor. Every week she gets a stack of subcontractor packets that have to be reviewed before anyone gets paid or sets foot on a job site.
A typical packet has:
- A certificate of insurance (COI)
- A W-9
- A signed subcontractor agreement
- A lien waiver
- Sometimes a safety acknowledgement form
She was getting 20 to 30 of these a week. Each one took about 10 to 15 minutes if nothing was wrong. Longer if something was off and she had to email the sub to fix it.
That is roughly 5 hours of pure document review every week. Not negotiation. Not relationships. Not strategy. Just checking that boxes were filled in correctly.
The annoying part is that the rules almost never changed. She was looking for the same things every time.
Why this work is harder than it looks
Document QA sounds simple from the outside. Read a PDF. Check a few fields. Move on.
The reason it eats time is that each packet hides a few small jobs:
- Extract the data from the PDF or image.
- Compare the data against your rules.
- Note exactly what is wrong, if anything.
- Decide whether to approve, reject, or ask for a fix.
- Track the result somewhere your team can see it.
If the document is a clean digital PDF, step 1 is easy. If it is a scanned image, a phone photo of a printed form, or a screenshot pasted into an email, it is not.
Most QA workflows fall apart on step 1 first, then on step 2 because nobody wrote the rules down clearly.
Renee had a checklist in her head. She had never written it out. So when her assistant tried to help, she would still end up re-checking everything.
The mental shift: a bot with a soul, plus one rule per task
Before any of this gets automated, you need to do the boring part. Write the rules down.
Renee sat down for an hour and listed every reason she had ever rejected a packet. She ended up with about a dozen rules. Some examples:
- The insured name on the COI must match the legal entity on the W-9.
- General liability coverage must be at least $1M per occurrence and $2M aggregate.
- The certificate holder must be listed as our company, not blank.
- The COI expiration date must be at least 30 days in the future.
- The W-9 must have a TIN filled in and be signed.
- The subcontractor agreement must be signed by both parties on the last page.
- The lien waiver must reference the correct project name.
This list is the actual product. The bot is just the thing that runs it.
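To see how mechanical these rules are, here is the 30-day expiration rule written out as a plain function. This is a sketch for illustration only, not how the bot runs it (her rules are prompts, not code), and it assumes the expiration date has already been extracted:

from datetime import date, timedelta

def check_coi_expiration(expiration: date, min_days: int = 30) -> dict:
    # Rule: the COI expiration date must be at least `min_days` in the future.
    deadline = date.today() + timedelta(days=min_days)
    if expiration >= deadline:
        return {"status": "pass", "evidence": f"expires {expiration.isoformat()}"}
    return {
        "status": "fail",
        "evidence": f"expires {expiration.isoformat()}",
        "notes": f"needs at least {min_days} days of runway (cutoff {deadline.isoformat()})",
    }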
The way she structured it on BotHound was simple:
- The bot soul is an inspector. Its job is to be skeptical, catch issues, and never approve something that is missing evidence.
- Each task is one rule from the checklist.
- The last task writes an HTML pass/fail report that summarizes the whole packet.
That is the whole structure. One rule per task. No clever orchestration. No mega-prompt trying to do everything at once.
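If it helps to picture it, the whole thing has roughly this shape. This is a schematic in Python for illustration only, not BotHound's actual config format, and the task names are made up:

bot = {
    # The soul is shared context. Every task inherits it.
    "soul": "You are a document QA inspector for a general contractor...",
    # One task per rule, in order, plus a final report task.
    "tasks": [
        {"name": "coi_gl_coverage",  "prompt": "Rule: the COI must list GL coverage of at least..."},
        {"name": "coi_expiration",   "prompt": "Rule: the COI expiration date must be at least 30 days out..."},
        {"name": "w9_tin_signature", "prompt": "Rule: the W-9 must have a TIN filled in and be signed..."},
        # ...one task for each remaining rule...
        {"name": "html_report",      "prompt": "Generate a self-contained HTML report from the results..."},
    ],
}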
The bot soul
The soul sets the tone for every task in the bot. For QA work, you want the soul to lean cautious.
Here is roughly what Renee used:
You are a document QA inspector for a general contractor.
Your job is to review subcontractor packets and catch issues before they reach accounting or the project manager.
You are skeptical by default. If a field is unclear, missing, or hard to read, you flag it. You do not guess.
You never approve a rule unless the evidence in the document clearly supports it. "Probably fine" is a fail.
When you flag an issue, you point to the exact field, page, or line where the problem is. You quote the document when possible.
You are not rude. You are precise. You write findings the way an auditor would: short, factual, and easy to act on.
That is the whole soul. It is not long. It does not need to be.
The point of the soul is to set the standard for evidence and tone. Every task inherits it.
One rule per task
This is the part that makes the bot actually trustworthy.
Each task is a single rule, written as a prompt. The bot reads the document, checks the rule, and returns a structured result.
A task prompt looks like this:
Rule: The certificate of insurance (COI) must list general liability
coverage of at least $1,000,000 per occurrence and $2,000,000 aggregate.
Look at the COI in the attached packet. Find the General Liability section.
Return:
- status: "pass", "fail", or "unreadable"
- per_occurrence_found: the exact number you see, or "not found"
- aggregate_found: the exact number you see, or "not found"
- evidence: a short quote from the document showing where you found these numbers
- notes: if it fails, explain why in one sentence
If the section is missing, status is "fail" and notes should say so. If the
document itself cannot be read, status is "unreadable".
Notice what the prompt does not do.
It does not ask the bot to also check the W-9. It does not ask it to summarize the packet. It does not ask it to make a judgment call about whether the sub is “good.” It does one thing.
That is why this works. The smaller the rule, the easier it is to test, fix, and trust.
Renee has about a dozen of these. Each one is short. Each one returns structured output. If a rule starts misfiring, she edits that one task and leaves the rest alone.
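If it helps to see that contract written down, here is the shape every rule task returns, expressed as a Python type purely for illustration (the bot returns structured text, not Python):

from dataclasses import dataclass

@dataclass
class RuleResult:
    rule: str        # which rule this result belongs to
    status: str      # "pass", "fail", or "unreadable"
    evidence: str    # short quote or field reference from the document
    notes: str = ""  # one-sentence explanation when status is not "pass"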
Handling the extraction problem
A lot of document QA setups break before they ever get to the rules, because the document is a scanned image or a low-quality PDF.
There are two ways to handle this:
1. Let a vision-capable model read the document directly.
2. Run a dedicated extraction step first that turns the document into clean text or structured fields, then run rules against that.
For Renee’s packets, option 1 was fine. The COIs and W-9s are usually clean PDFs from the sub’s broker. When something is a phone photo of a printed page, the bot flags it as unreadable and she handles that one by hand.
If you are dealing with worse inputs (warehouse receipts, handwritten notes, faxed forms), option 2 tends to be more reliable. You add a first task whose only job is extraction, and every rule task reads from its output instead of the raw file.
The general rule: do not ask one prompt to both read a messy document and apply complex logic to it. Split the work.
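If you go with option 2, the split looks something like this. A minimal sketch using pypdf for the clean-digital-PDF case; a scanned image would need an OCR tool in step 1 instead, and the rule functions here stand in for whatever runs your rule prompts:

from pypdf import PdfReader  # handles digital PDFs; scans need OCR instead

def extract_packet_text(pdf_path: str) -> str:
    # Step 1: extraction only. No rules live in this step.
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def run_rules(pdf_path: str, rules) -> list[dict]:
    # Step 2: every rule reads the extracted text, never the raw file.
    text = extract_packet_text(pdf_path)
    if not text.strip():
        # Nothing extractable: flag the whole packet instead of letting rules guess.
        return [{"rule": name, "status": "unreadable",
                 "notes": "no text could be extracted"} for name, _ in rules]
    return [check(text) for _, check in rules]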
The HTML pass/fail report
The last task in the bot is the one Renee actually opens.
It takes the structured results from every rule task and turns them into a single HTML report. One row per rule. Green if pass. Red if fail. A short evidence quote next to each one. A summary at the top showing the overall verdict and the sub’s name.
The prompt is roughly:
You are given the structured results from each rule check on this packet.
Generate a self-contained HTML document that summarizes the QA results.
Requirements:
- A header with the subcontractor name, project name, and date reviewed.
- An overall verdict at the top: "PASS" if every rule passed, "FAIL" if
any rule failed, "NEEDS REVIEW" if any rule returned "unreadable".
- A table with one row per rule. Columns: rule name, status, evidence,
notes.
- Failed rows highlighted in red. Passed rows in green.
- Inline CSS only. No external assets.
- No JavaScript.
- Plain, printable layout. Should look fine if saved as a PDF.
Do not invent data. Only use what was returned by the rule tasks.
The output is a single HTML file. She can open it, forward it to the sub, or save it to the project folder. If every rule passes, she approves the packet and moves on. If anything fails, the report already tells the sub exactly what to fix and where.
That last part is the unlock. Before, she had to write the rejection email herself. Now the report is the rejection email.
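One detail worth pinning down is the precedence between the three verdicts when a packet has both a failed rule and an unreadable one. The report prompt above leaves it implicit; written out as a sketch, using the same statuses as the rule tasks, a reasonable choice is to let FAIL win, since a confirmed problem is actionable either way:

def overall_verdict(results: list[dict]) -> str:
    statuses = {r["status"] for r in results}
    if "fail" in statuses:
        return "FAIL"           # a confirmed problem outranks an unreadable page
    if "unreadable" in statuses:
        return "NEEDS REVIEW"   # nothing failed, but something could not be checked
    return "PASS"               # every rule passed with evidence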
Why this saves real time
The rough math:
- Before: 25 packets a week, 12 minutes each, about 5 hours.
- After: 25 packets a week, 1 to 2 minutes each on the ones that pass, 5 minutes on the ones that fail.
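Call it 20 passes at 2 minutes and 5 bounces at 5 minutes: 20 × 2 + 5 × 5 = 65 minutes. Roughly an hour of skimming instead of five hours of reading.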
Most packets pass. The bot does the reading. Renee skims the report, clicks approve, and moves on.
The hours saved are not the only win. The second is consistency. The bot checks every rule on every packet. A human reviewer in a hurry will sometimes skip a step. The bot does not get tired at 4pm on a Friday.
The third is the paper trail. Every run produces an HTML report that gets saved with the packet. If something goes wrong later, there is a record of what was checked and what the evidence was.
Where this pattern fits beyond construction
The same shape works for almost any rules-based document review. Some examples:
- Accounts payable. Match invoice totals to PO line items. Flag missing tax IDs. Catch duplicate invoice numbers.
- Real estate closings. Verify names match across the deed, title commitment, and closing disclosure. Check that signatures and dates are on the right pages.
- Loan packets. Confirm income documents are recent. Check that disclosures are signed. Compare addresses across forms.
- Vendor onboarding. Make sure W-9s, COIs, NDAs, and banking forms are all present, current, and consistent.
- Insurance claims intake. Check that police reports, photos, repair estimates, and proof of ownership are attached and legible.
- Compliance review. Verify that required clauses appear in contracts. Flag missing arbitration language or out-of-date templates.
If the job sounds like “I read the same pile of forms every week and check the same things,” it is probably a fit.
How to build something like this without overdoing it
A few habits make this kind of bot much easier to get right.
Write the rules in plain English first. Before you touch any tool, list every check you do today. If you cannot write it as a sentence, the bot will not be able to enforce it either.
Keep one rule per task. It is tempting to combine related checks. Resist it. Small tasks are easier to debug and easier to trust. When a rule misfires, you want to fix one thing, not unwind a 20-line prompt.
Demand evidence in every output. Every rule task should return a quote, a number, or a field reference. If the bot cannot show its work, treat it as a fail.
Make the report opinionated. A pass/fail verdict at the top of the report saves more time than the rule checks themselves. The whole point is that you should not have to read the details unless something failed.
Start with one document type. Do not try to handle every flavor of paperwork at once. Pick the one that eats the most of your week and build for that. Add more later.
Where BotHound fits
This is the kind of workflow BotHound was built for. A bot with a clear soul. One task per rule. Tools for reading PDFs and images. A final task that writes the report. A history of every run so you can see what was checked, what was found, and what the verdict was.
You do not need a custom platform to do document QA. You need somewhere to keep your rules, run them in order, and see the results. BotHound gives you that without making you stitch tools together yourself.
The bot is not the product. The rules are. The bot is just what runs them every time so you do not have to.
The takeaway
Document QA is not glamorous work. That is exactly why it is worth automating.
It happens on a schedule. It follows the same checklist every time. It produces the same kind of output. And it usually ends with someone reading a stack of PDFs at the end of the week wondering where their afternoon went.
Write the rules down. Give each one its own task. Let the bot read the documents, check the rules, and write the report.
You are not replacing judgment. You are replacing the part of the job that did not need judgment in the first place.