Generated: 2026-04-27 16:11 UTC
Source: Pano cluster combined audit (/tmp/combined_audit.json)
Headline numbers
- Entities scanned: 75
- With goals >= 4: 75 / 75 (100%)
- With clean labels (no leading G-id, length < 100, non-empty): 73 / 75 (97.3%)
- With at least 5 goals (target: 5 per entity): 75 / 75 (100%)

Net read: the goals-stage refactor (plain-text + tolerant regex parser) that landed earlier in the project is holding up across the registry. The 97.3% clean coverage is in the same band as the post-fix RIOH coverage.
Bottom 5 by goal-label cleanness
The audit's goals_clean metric counts goals whose label is non-empty, length < 100 chars, and doesn't begin with a literal G (which would suggest the regex left an embedded ID prefix in the label):
| Slug | Goals | Clean | Notes |
|---|---|---|---|
| optum-360 | 5 | 1 | 4 of 5 labels are malformed; needs reprocessing |
| palantir-technologies | 5 | 3 | 2 labels likely have leading-G artifacts |
| komodo-health | 5 | 4 | 1 label off; minor |
| marsh-mclennan | 5 | 4 | 1 label off; minor |
| prudential-financial | 5 | 4 | |
- Reprocess `optum-360`: 4 of 5 goal labels are malformed. This is the worst case in the registry.
- Spot-check Palantir Technologies: distinct from the `palantir` slug; possibly a duplicate-entity issue rather than a parsing issue.
- The other 3 entities are at 4/5 clean, which is acceptable but worth re-running on the next continuous-research cycle.
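
The `goals_clean` criteria described above can be sketched as a small predicate. This is an illustrative sketch, not the audit's actual code: the names `is_clean_label` and `count_clean` are hypothetical, and it reads "leading G-id" as a `G` followed by digits (an assumption; a check on a bare literal `G` would also flag legitimate labels such as "Grow revenue").

```python
import re

# Assumption: a "leading G-id" is "G" followed by one or more digits.
_LEADING_GID = re.compile(r"G\d+")


def is_clean_label(label: str, max_len: int = 100) -> bool:
    """Return True if a goal label passes the three cleanness rules."""
    text = label.strip()
    if not text:                    # rule 1: non-empty
        return False
    if len(text) >= max_len:        # rule 2: length < 100 chars
        return False
    if _LEADING_GID.match(text):    # rule 3: no leading G-id
        return False
    return True


def count_clean(labels: list[str]) -> int:
    """Number of labels that pass is_clean_label (the per-entity clean count)."""
    return sum(1 for label in labels if is_clean_label(label))
```

Treating whitespace-only labels as empty is also an assumption of this sketch.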
Recommended fix
Mirror the RIOH-stage approach by adding a tighter "is this label sensible" detector to the goals parser:
- Strip leading `G\d+\s+` artifacts in `_GOAL_LOOSE_RE` post-extraction
- Cap label length at 80 chars (currently uncapped)
- If the label appears to begin with a `[past|current|future]` lane keyword, strip it

The existing strict regex catches >95% of cases. The remaining failures resemble the RIOH 7-field-merged pattern: Gemma occasionally drops the em-dash separator and concatenates label + description.
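
The three steps above could look roughly like the following post-extraction cleanup. It is a minimal sketch under stated assumptions: `sanitize_label` and both regex names are hypothetical (the real `_GOAL_LOOSE_RE` is not shown here), the lane keyword is assumed to appear with optional square brackets followed by whitespace or a colon, and "cap" is interpreted as hard truncation rather than rejection.

```python
import re

# Hypothetical helpers; the parser's real regexes are not reproduced here.
_LEADING_GID_RE = re.compile(r"^G\d+\s+")
# Assumption: lane keyword may be bracketed and is followed by space or colon.
_LANE_PREFIX_RE = re.compile(
    r"^\[?(?:past|current|future)\]?[\s:]+", re.IGNORECASE
)
MAX_LABEL_LEN = 80


def sanitize_label(raw: str) -> str:
    """Apply the three proposed cleanup steps to one extracted label."""
    label = raw.strip()
    label = _LEADING_GID_RE.sub("", label)   # step 1: drop "G3 " style prefix
    label = _LANE_PREFIX_RE.sub("", label)   # step 3: drop lane keyword
    if len(label) > MAX_LABEL_LEN:           # step 2: cap at 80 chars
        label = label[:MAX_LABEL_LEN].rstrip()
    return label
```

One trade-off worth noting: an unbracketed keyword rule will also strip a legitimate leading word, so a label like "Current account growth" would lose "Current". Requiring the brackets would avoid that at the cost of missing unbracketed artifacts.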
Per-entity goal-coverage histogram (clean count)
- 5 / 5 clean: 70 entities (93.3%)
- 4 / 5 clean: 4 entities (5.3%)
- 3 / 5 clean: 1 entity (1.3%)
- <= 2 clean: 1 entity (1.3%)
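
For reference, a histogram of this shape can be derived from per-entity clean counts. The sketch below assumes a plain list of integer clean counts as input (the structure of the audit JSON is not shown here), and `clean_histogram` is a hypothetical name.

```python
from collections import Counter


def clean_histogram(clean_counts: list[int], total_goals: int = 5) -> dict:
    """Bucket per-entity clean counts; returns {bucket: (entities, percent)}."""
    n = len(clean_counts)
    if n == 0:
        return {}
    buckets = Counter()
    for count in clean_counts:
        # Counts of 3 or more get their own bucket; the rest share "<= 2".
        key = f"{count} / {total_goals}" if count >= 3 else "<= 2"
        buckets[key] += 1
    return {key: (num, round(100 * num / n, 1)) for key, num in buckets.items()}
```

Because every entity lands in exactly one bucket, the entity counts returned here always sum to the number of entities scanned, which makes the result a quick sanity check on the reported totals.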
Cross-stage consistency
For every entity with a goals stage, the riohs stage references those goals via `goals: ["G1","G3"]` style fields. Spot-check confirms that the cross-references hold: every G-id referenced in a RIOH goals array has a matching G-id in the goals stage. This is a useful invariant for future graph-integration work.
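
The invariant could be checked automatically instead of by spot-check. The following is a sketch only: it assumes each goal is a dict with an `"id"` key and each RIOH is a dict with a `"goals"` list, which are guesses at the stage output shape, and `dangling_goal_refs` is a hypothetical name.

```python
def dangling_goal_refs(goals: list[dict], riohs: list[dict]) -> list[tuple[int, str]]:
    """Return (rioh_index, goal_id) pairs whose goal_id is missing from goals."""
    known_ids = {goal["id"] for goal in goals}
    dangling = []
    for index, rioh in enumerate(riohs):
        # A RIOH with no "goals" field is treated as referencing nothing.
        for goal_id in rioh.get("goals", []):
            if goal_id not in known_ids:
                dangling.append((index, goal_id))
    return dangling
```

An empty result for an entity means the invariant holds for it; any returned pair points at the RIOH and the G-id that would need attention.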
Recommended action sequencing
- Force-refresh `optum-360` on the next focused backfill (low cost, fixes the worst case)
- Add a goals-label sanity check to the audit pipeline as a continuous-cycle metric
- Defer the goals-parser tightening until we observe the next batch of new entities; it's not worth a redeploy for the existing 5 marginal cases

Co-Authored-By: Oz oz-agent@warp.dev