# AI Factory V2 WP3 — P2P Verifier Metrics and Quality Report

Generated: 2026-05-26T15:28:35.483Z

## Scope

Source DB: `/Users/makinja/system/databases/company-mesh.db`

Included MC tasks:

- #101987 — LumisCare notification-service migration pilot context
- #102081 — AI Factory V2 WP1 runner MVP
- #102083 — AI Factory V2 WP4 writeback reliability

## Metrics Summary

- Threads analyzed: 24
- Acceptable thread responses (status `answered` with class PASS, PARTIAL, or ANSWERED): 5
- Attempt-level acceptable rate: 20.8% (5 of 24 threads)
- Response classes: `{"ANSWERED":3,"NO_RESPONSE":3,"BLOCKED":16,"PASS":1,"PARTIAL":1}`
- Failure patterns: `{"none":4,"stale_delivered_or_no_response":3,"timeout_or_worker_no_response":7,"agent_runner_or_ollama_failure":3,"blocked_unspecified_or_claim_gate":5,"partial_due_summary_only_evidence":2}`
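
The acceptable rate follows directly from the response-class counts above. A minimal sketch of that arithmetic, assuming the counts are available as a plain dictionary; the names `ACCEPTABLE_CLASSES` and `acceptable_rate` are illustrative and not taken from the report generator:

```python
import json

# Response-class counts copied from the Metrics Summary.
RESPONSE_CLASSES = json.loads(
    '{"ANSWERED":3,"NO_RESPONSE":3,"BLOCKED":16,"PASS":1,"PARTIAL":1}'
)

# Classes this report counts as acceptable (assumed constant name).
ACCEPTABLE_CLASSES = {"ANSWERED", "PASS", "PARTIAL"}


def acceptable_rate(counts: dict[str, int]) -> tuple[int, int, float]:
    """Return (acceptable, total, rate in percent rounded to one decimal)."""
    total = sum(counts.values())
    acceptable = sum(n for cls, n in counts.items() if cls in ACCEPTABLE_CLASSES)
    rate = round(100 * acceptable / total, 1) if total else 0.0
    return acceptable, total, rate


print(acceptable_rate(RESPONSE_CLASSES))  # (5, 24, 20.8)
```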

## By Task

- #101987: total=6, acceptable=2, blocked=1, no_response=3, cost_cap_sum=$6.00
- #102081: total=6, acceptable=1, blocked=5, no_response=0, cost_cap_sum=$2.00
- #102083: total=12, acceptable=2, blocked=10, no_response=0, cost_cap_sum=$7.15
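
The per-task counts can be cross-checked and turned into per-task acceptable rates. The sketch below is a hypothetical helper (`per_task_rates` is not part of the runner) that assumes every thread falls into exactly one of the three buckets listed above:

```python
# Per-task counts copied from the By Task list.
BY_TASK = {
    "#101987": {"total": 6, "acceptable": 2, "blocked": 1, "no_response": 3},
    "#102081": {"total": 6, "acceptable": 1, "blocked": 5, "no_response": 0},
    "#102083": {"total": 12, "acceptable": 2, "blocked": 10, "no_response": 0},
}


def per_task_rates(by_task: dict[str, dict[str, int]]) -> dict[str, float]:
    """Validate bucket sums and return the acceptable rate (%) per task."""
    rates = {}
    for task, c in by_task.items():
        # Assumption: acceptable, blocked, and no_response partition the threads.
        if c["acceptable"] + c["blocked"] + c["no_response"] != c["total"]:
            raise ValueError(f"bucket counts do not sum to total for {task}")
        rates[task] = round(100 * c["acceptable"] / c["total"], 1)
    return rates


print(per_task_rates(BY_TASK))
# {'#101987': 33.3, '#102081': 16.7, '#102083': 16.7}
```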

## Thread Detail

| Task | Thread | Status/class | Acceptable | Pattern | Prompt chars | Latency s | Evidence |
|---|---|---|---|---|---:|---:|---|
| #101987 | mesh-thr-8b3552e3-4f58-4f9f-a4b2-82b6ec8dbfc4 | answered/ANSWERED | yes | none | 416 | 1554 | `/Users/makinja/system/rules/p2p-pair-migration.md` |
| #101987 | mesh-thr-2170a2ba-3019-4c82-9bde-af102d38dd8f | answered/ANSWERED | yes | none | 507 | 253 |  |
| #101987 | mesh-thr-9392faa2-2d7a-40ad-9017-4ada9190bbd2 | open/NO_RESPONSE | no | stale_delivered_or_no_response | 447 |  |  |
| #101987 | mesh-thr-bf0d9685-c54a-44e1-acb9-55d22590fe8d | blocked/BLOCKED | no | timeout_or_worker_no_response | 753 | 64 | `/tmp/alai/company-mesh-timeouts/mesh-msg-a5b6f8fb-16e3-4519-a382-6a8b181e3b28.json` |
| #101987 | mesh-thr-61154c1b-4b74-4b93-a92e-2d1beb295c65 | open/NO_RESPONSE | no | stale_delivered_or_no_response | 506 |  |  |
| #101987 | mesh-thr-9ab9ece8-f33a-4fdb-9d29-ef1bb681667f | open/NO_RESPONSE | no | stale_delivered_or_no_response | 518 |  |  |
| #102083 | mesh-thr-b5873415-a389-4f26-a810-1d3cdf13a2c4 | blocked/BLOCKED | no | agent_runner_or_ollama_failure | 718 | 92 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T13-29-09-784Z-mesh-msg-4b045b56-b9e9-421b-9336-d51e6c1166da.json` |
| #102083 | mesh-thr-b3f219e7-7dbf-41ac-b2a2-9d1e501126dc | blocked/BLOCKED | no | timeout_or_worker_no_response | 719 | 122 | `/tmp/alai/company-mesh-timeouts/mesh-msg-8f5314b3-426d-4de6-a0d7-c8964b85e358.json` |
| #102083 | mesh-thr-792068a5-74ec-40d8-988a-0d6d297339ba | blocked/BLOCKED | no | timeout_or_worker_no_response | 484 | 123 | `/tmp/alai/company-mesh-timeouts/mesh-msg-5ae99557-5984-4b5f-a37c-1586c89a6af3.json` |
| #102081 | mesh-thr-9cbebdf3-79f5-4201-80af-2bbd64d35ec4 | blocked/BLOCKED | no | timeout_or_worker_no_response | 1205 | 123 | `/tmp/alai/company-mesh-timeouts/mesh-msg-355ee365-5af6-4fb3-ba7a-59cdb3673483.json` |
| #102081 | mesh-thr-f07042ae-b529-4907-b844-e25f1b21a12b | blocked/BLOCKED | no | agent_runner_or_ollama_failure | 869 | 78 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-01-02-501Z-mesh-msg-7a217112-2969-453d-8225-86d25e8fb23a.json` |
| #102083 | mesh-thr-6a5c9d97-df2e-4352-9b74-cf5db7c7bb40 | blocked/BLOCKED | no | blocked_unspecified_or_claim_gate | 266 | 16 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-01-42-724Z-mesh-msg-2bf0c206-b599-4cda-990f-258ded567271.json` |
| #102083 | mesh-thr-57b70489-5ebb-4e91-a7a0-9d2a7e868497 | answered/ANSWERED | yes | none | 289 | 93 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-03-31-501Z-mesh-msg-ed34a16c-5b49-4beb-ad46-db59696b948b.json` |
| #102083 | mesh-thr-dc65ed91-e027-4cf8-931c-ff5f55b43a49 | blocked/BLOCKED | no | blocked_unspecified_or_claim_gate | 1255 | 120 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-06-46-587Z-mesh-msg-d9bfaf85-5817-49cb-bbe4-3f6c5c7802de.json` |
| #102081 | mesh-thr-5929968f-3eb5-41d6-8a79-643dc544ed05 | blocked/BLOCKED | no | timeout_or_worker_no_response | 957 | 123 | `/tmp/alai/company-mesh-timeouts/mesh-msg-34032090-9fb5-4b3e-b169-a945d1468848.json` |
| #102081 | mesh-thr-ef7498c1-c7b8-46c3-b533-d711a3616274 | blocked/BLOCKED | no | timeout_or_worker_no_response | 440 | 154 | `/tmp/alai/company-mesh-timeouts/mesh-msg-fd5a837d-c8c3-46ad-b2bb-6fc38c16d58d.json` |
| #102083 | mesh-thr-ecac2a6d-92ac-480e-b66e-d809aa0e6e04 | blocked/BLOCKED | no | agent_runner_or_ollama_failure | 1780 | 75 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-16-50-228Z-mesh-msg-d90e62e3-bf6d-43da-825e-0e18abaf8d13.json` |
| #102081 | mesh-thr-526b7560-9278-4722-93ca-985d70e7a590 | blocked/BLOCKED | no | blocked_unspecified_or_claim_gate | 641 | 124 | `/tmp/alai/company-mesh-responder/2026-05-26T14-22-08-866Z-mesh-msg-c370552b-9c14-4737-bc9a-b36ccbcdb01a.json` |
| #102083 | mesh-thr-c99828fd-f6d8-447f-99dc-f779cd412bb3 | blocked/BLOCKED | no | timeout_or_worker_no_response | 1568 | 223 | `/tmp/alai/company-mesh-timeouts/mesh-msg-7a537962-f6f0-418a-93b8-32a317dd882a.json` |
| #102081 | mesh-thr-5cbbadc8-e238-4017-9b54-800c5088a0e9 | answered/PASS | yes | none | 38779 | 151 | `/tmp/alai/company-mesh-responder/2026-05-26T14-27-57-032Z-mesh-msg-431fd915-c305-4336-99be-0f1ca3e1ac8e.json` |
| #102083 | mesh-thr-4ec294f5-d1c2-43fe-98d9-2e7aaeb0953f | blocked/BLOCKED | no | blocked_unspecified_or_claim_gate | 1204 | 139 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-28-23-453Z-mesh-msg-5e69f9b7-0b5a-4186-a8a6-866a3f612c18.json` |
| #102083 | mesh-thr-33334359-3e83-4343-bbda-342f7304bdee | blocked/BLOCKED | no | blocked_unspecified_or_claim_gate | 655 | 85 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-31-00-220Z-mesh-msg-e1fc9798-e0eb-482e-978b-b97d086be757.json` |
| #102083 | mesh-thr-84961884-24e9-406b-bc36-bda72f807441 | blocked/BLOCKED | no | partial_due_summary_only_evidence | 563 | 44 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-34-51-053Z-mesh-msg-43d28653-f4b4-47a5-9229-9338be4c30d1.json` |
| #102083 | mesh-thr-f759f9d2-a62d-491d-9ecb-677fcfd808fd | answered/PARTIAL | yes | partial_due_summary_only_evidence | 622 | 184 | `/tmp/alai/company-mesh-auto-responder/2026-05-26T14-38-26-267Z-mesh-msg-766b4c5e-cae6-444c-a09d-cf42398dc903.json` |

## Quality Findings

1. **Path-only prompts are weak verifier inputs.** Several early Claude/agent-runner attempts were blocked or timed out when the verifier had neither enough pasted evidence nor reliable read access to the referenced paths.
2. **Pasted-artifact prompts improved outcome quality.** MC #102081 passed only after a sanitized pasted-artifact prompt that included implementation evidence and code excerpts (38,779 prompt characters, versus 440 to 1,205 for the task's blocked attempts).
3. **Responder mode matters.** Proveo/eval with Claude review produced usable ANSWERED/PARTIAL outcomes once the routing and max-turn/read-only fixes were in place, whereas the agent-runner/Ollama path produced blocked failures (3 threads classified `agent_runner_or_ollama_failure`).
4. **Timeouts are the dominant reliability issue.** `timeout_or_worker_no_response` is the largest failure pattern in this sample: 7 of 24 threads, ahead of `blocked_unspecified_or_claim_gate` with 5.
5. **PARTIAL is useful and honest.** MC #102083 returned PARTIAL because the artifact summaries were read but the commands were not re-run; that is preferable to a false PASS.
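
The ranking behind finding 4 can be reproduced from the failure-pattern counts in the Metrics Summary. A minimal sketch, assuming the `none` pattern is excluded and shares are computed over the remaining 20 threads; `rank_failures` is an illustrative name:

```python
# Failure-pattern counts copied from the Metrics Summary.
FAILURE_PATTERNS = {
    "none": 4,
    "stale_delivered_or_no_response": 3,
    "timeout_or_worker_no_response": 7,
    "agent_runner_or_ollama_failure": 3,
    "blocked_unspecified_or_claim_gate": 5,
    "partial_due_summary_only_evidence": 2,
}


def rank_failures(patterns: dict[str, int]) -> list[tuple[str, int, float]]:
    """Rank non-'none' patterns by count, with share (%) of flagged threads."""
    failures = {name: n for name, n in patterns.items() if name != "none"}
    flagged = sum(failures.values())
    ranked = sorted(failures.items(), key=lambda item: item[1], reverse=True)
    return [(name, n, round(100 * n / flagged, 1)) for name, n in ranked]


for name, count, share in rank_failures(FAILURE_PATTERNS):
    print(f"{name}: {count} ({share}%)")
```

Under these assumptions the top entry is `timeout_or_worker_no_response` with 7 of the 20 flagged threads.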

## Recommendation

Hold at the current controlled rollout. Keep P2P mandatory for H/risky tasks, but do not auto-send verifier requests at dispatch until responder reliability and evidence-pack prompts have improved. Require pasted or readable evidence bundles for Claude-review verifiers.
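
One way to produce a pasted evidence bundle is to inline each readable artifact, capped in length, and to flag unreadable paths explicitly so the verifier is never handed a bare path. This is a sketch under stated assumptions: `build_evidence_bundle`, its section markers, and the 4000-character cap are hypothetical, and sanitization of secrets is out of scope here.

```python
from pathlib import Path


def build_evidence_bundle(paths: list[str], max_chars_per_file: int = 4000) -> str:
    """Inline readable files into one prompt-ready string; flag unreadable ones."""
    sections = []
    for raw in paths:
        path = Path(raw)
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError as exc:
            # Keep the path visible so the gap is explicit to the verifier.
            sections.append(f"=== {path} ===\n[unreadable: {exc.__class__.__name__}]")
            continue
        body = text[:max_chars_per_file]
        if len(text) > max_chars_per_file:
            body += "\n[truncated]"
        sections.append(f"=== {path} ===\n{body}")
    return "\n\n".join(sections)
```

The bundle text would then be pasted into the verifier prompt in place of, or alongside, the evidence paths.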

## Proposed Rollout Rules

- Keep current controlled rollout for H/backend/core/security/user-facing/deploy-impacting tasks.
- Do not enable automatic Company Mesh verifier send at dispatch yet.
- For required P2P, generate a compact evidence bundle before sending the verifier prompt.
- Prefer Claude-review verifier mode for Proveo on evidence-heavy reviews; keep agent-runner as fallback only when local model health is known.
- Treat PASS/PARTIAL/ANSWERED with evidence paths as acceptable pre-verifier states; BLOCKED/timeout must not satisfy MC ready/done.
- Track retry count and first-success attempt in future runner evidence.
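
The last two rules can be sketched as a small gate check plus an attempt summary. This is an illustration rather than the runner's implementation: the `Attempt` type and the functions `satisfies_gate` and `summarize_attempts` are hypothetical, and the sketch reads "with evidence paths" as requiring a non-empty evidence path.

```python
from dataclasses import dataclass
from typing import Optional

# Response classes the rollout rules treat as acceptable.
ACCEPTABLE = {"PASS", "PARTIAL", "ANSWERED"}


@dataclass
class Attempt:
    response_class: str                  # e.g. "PASS", "BLOCKED", "NO_RESPONSE"
    evidence_path: Optional[str] = None  # path to the responder artifact, if any


def satisfies_gate(attempt: Attempt) -> bool:
    """Acceptable class AND an evidence path; BLOCKED/timeouts never pass."""
    return attempt.response_class in ACCEPTABLE and bool(attempt.evidence_path)


def summarize_attempts(attempts: list[Attempt]) -> dict:
    """Report retry count and the 1-based index of the first passing attempt."""
    first_success = next(
        (i for i, a in enumerate(attempts, start=1) if satisfies_gate(a)), None
    )
    return {
        "retry_count": max(len(attempts) - 1, 0),
        "first_success_attempt": first_success,
        "gate_satisfied": first_success is not None,
    }


history = [
    Attempt("BLOCKED", "/tmp/example-timeout.json"),   # placeholder path
    Attempt("ANSWERED"),                               # no evidence path: rejected
    Attempt("PASS", "/tmp/example-responder.json"),    # placeholder path
]
print(summarize_attempts(history))
# {'retry_count': 2, 'first_success_attempt': 3, 'gate_satisfied': True}
```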

## Evidence Artifacts

- Metrics JSON: `/Users/makinja/system/evidence/102080/p2p-verifier-metrics.json`
- This report: `/Users/makinja/system/evidence/102080/p2p-verifier-metrics-report.md`