Case study · Construction

A support copilot built on a fragmented knowledge base.

A $200M construction firm consolidated six SharePoint sites, three wikis (two of them retired), and a Slack archive into a retrieval system its helpdesk uses every day. We built it in three months and have run it on retainer since.

Hours returned
22h/wk
on the helpdesk team after 60 days
Tier-1 deflection
52%
measured against pre-rollout baseline
CSAT delta
+4 pts
copilot answers vs. human baseline, month four
Engagement
3 mo + retainer
in production for 11 months and counting

The situation

A general contractor with eleven offices across the southwest. Roughly 800 employees in the field, another 240 in operations, finance, and project management. Eighteen years of accumulated institutional knowledge spread across six SharePoint sites, three different wiki tools (two of them retired but still indexed by Google), and somewhere north of a million Slack messages.

The helpdesk that supported it was three people. Their day was, in their own words, "ninety percent finding the right document and ten percent answering the question." The ten percent was the interesting part. The ninety percent was where the firm was losing real time.

What we were asked to do

Initially: "build us a chatbot for our knowledge base." We declined that framing and proposed a discovery embed first.

The diagnosis, two weeks later, was that the knowledge was not actually broken. It was just unreachable. Roughly 70% of helpdesk tickets had a correct answer somewhere in the existing documentation; the helpdesk just could not find it fast enough, and field employees had given up trying. The right intervention was not "a chatbot." It was a retrieval system the helpdesk used as a tool, with a confident enough surface to also serve field employees directly.

What we built

  • A unified retrieval layer. Connectors into all six SharePoint sites, the active wiki, and an archive snapshot of the two retired ones. Incremental sync. Re-indexing on document change, not on a schedule. Source attribution preserved end to end.
  • A Slack-archive ingestion pipeline with thread-aware chunking and aggressive de-duplication. We learned quickly that 60% of the historical Slack content was noise; the eval suite ended up doing most of the work of separating signal from noise.
  • An answering layer with confidence gating. Three-tier response: confident answer with citations, partial answer with "you should verify this with…", and explicit escalation to the human helpdesk with the relevant documents pre-attached.
  • Identity-aware permissions. A field foreman sees field documentation. A finance manager sees finance. Sensitive HR documents are never retrievable through the copilot regardless of permissions.
  • An operator dashboard for the helpdesk lead. Top failure modes, document coverage gaps, sources used most often. The first thing she looks at in the morning.
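
For readers who want to see the shape of the answering layer, here is a minimal sketch of three-tier confidence gating combined with the permission rules described above. It is an illustration under stated assumptions, not the production code: the names (Doc, gate, visible), the audience labels, the 0-to-1 score scale, and both thresholds are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    CONFIDENT = "confident"  # answer with citations
    PARTIAL = "partial"      # answer plus "you should verify this with..."
    ESCALATE = "escalate"    # hand off to the human helpdesk


@dataclass(frozen=True)
class Doc:
    doc_id: str
    audience: str  # hypothetical labels such as "field", "finance", "hr"
    score: float   # assumed retrieval confidence in [0, 1]


@dataclass
class Response:
    tier: Tier
    citations: list[str] = field(default_factory=list)


# Hypothetical thresholds; in practice these would be tuned against evals.
CONFIDENT_AT = 0.80
PARTIAL_AT = 0.50

# Audiences that are never retrievable, whatever roles the user holds.
BLOCKED_AUDIENCES = {"hr"}


def visible(docs, user_audiences):
    """Keep only documents the user may see; blocked audiences always drop."""
    return [
        d for d in docs
        if d.audience not in BLOCKED_AUDIENCES and d.audience in user_audiences
    ]


def gate(docs, user_audiences):
    """Pick a response tier from the best-scoring permitted document."""
    allowed = sorted(
        visible(docs, user_audiences), key=lambda d: d.score, reverse=True
    )
    citations = [d.doc_id for d in allowed]
    best = allowed[0].score if allowed else 0.0
    if best >= CONFIDENT_AT:
        return Response(Tier.CONFIDENT, citations)
    if best >= PARTIAL_AT:
        return Response(Tier.PARTIAL, citations)
    # Escalations still carry whatever permitted documents were found,
    # so the helpdesk receives them pre-attached.
    return Response(Tier.ESCALATE, citations)
```

In this sketch the permission filter runs before the confidence check, so a high-scoring document the user is not allowed to see can never lift a response into the confident tier.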

What did not work the first time

Two things, both worth saying out loud.

First, our initial retrieval was tuned for recall over precision and the responses were too long. The helpdesk reported the copilot as "technically correct and operationally useless" in week three. We rebuilt the answering layer to prefer a short, sourced answer with a "show more" surface for the supporting documents. Satisfaction jumped immediately.

Second, we underestimated how much old documentation was simply wrong: procedures that had been updated in a Slack thread two years ago and never made it back to SharePoint. We had to add a contradiction-detection eval. When two sources said different things about the same procedure, the system surfaced the conflict and asked the helpdesk to resolve it. This turned out to be the most valuable feature in the system: in six months it forced a corrected single source of truth on 340 distinct procedures.
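
The core idea of the contradiction check can be sketched in a few lines. This is a simplified illustration, not the eval we run: it assumes each source has already been reduced to a (procedure, source, value) claim, and it compares values by normalized exact match, whereas a real check would need some form of semantic comparison. The Claim type and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    procedure: str  # hypothetical key identifying the procedure
    source: str     # where the claim was found, e.g. "sharepoint"
    value: str      # what that source says the procedure is


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def find_conflicts(claims):
    """Return procedures where sources disagree, with who says what."""
    by_procedure = defaultdict(lambda: defaultdict(list))
    for claim in claims:
        by_procedure[claim.procedure][normalize(claim.value)].append(claim.source)
    # A procedure is in conflict when it has more than one distinct value.
    return {
        procedure: {value: sorted(sources) for value, sources in values.items()}
        for procedure, values in by_procedure.items()
        if len(values) > 1
    }
```

Each entry in the result maps a procedure to the competing versions and the sources behind them, which is the shape a reviewer needs in order to pick the correct one.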

The system did not just answer the helpdesk's questions. It found 340 places where the company was telling itself two different things.

Where it is now

Eleven months in production. Currently running on a monthly retainer with our team. The helpdesk lead owns the operator dashboard and runs a weekly fifteen-minute review with us. We run the eval suite nightly and bring regressions to that review.

Field adoption has grown from 18% of employees in month one to 71% by month nine. The most common usage pattern is not the one we expected: foremen using it on jobsite tablets to look up safety procedures and equipment specs in the moment, not the helpdesk-mediated workflow we initially designed for.

What the monthly business review looks like

One slide. The hours returned chart. The deflection chart. The top three regressions this month and what we did about them. The top three opportunities to expand. The CFO gets it. The CIO uses it as the talking point for the AI program with the board.

Next step

A case study like this starts with discovery.

Two weeks. One senior engineer, one analyst, your team, a written diagnosis at the end. Most of our long-term engagements began here.