All 64 authenticated screens, seven lenses, 76 AI agents, and one adversarial pass whose only job was to kill false positives. 346 findings. The bones are real engineering taste, not template slop. But two systemic defects each break a promise the product makes about itself: accessible in both themes, and built for any industry. This is exactly what we found, and the plan to clear it.
One audit agent walked each of the 64 authenticated project screens against the product spec and a senior craft standard, scoring seven lenses: accessibility, performance, responsive behaviour, theming, visual craft, information architecture and copy, and a primary one this round, industry-neutrality. A separate analysis traced how hard-wired the app is to its six seeded engineering domains.
We did not eyeball 64 screens. Each screen's findings were then re-checked against the actual source by a different, skeptical agent, which discarded roughly forty percent of the first-pass high-severity findings as false positives or already-handled. So the blockers below are verified against real code, not first-pass guesses.
Structurally sound and built with real taste: one cyan accent, monospaced tabular numbers, hairline rules, a proper per-theme token system, density without clutter. This does not read as a generic template under the hood. The damage is concentrated, and the weakest lens by far, accessibility, is lifted across dozens of screens by just three shared fixes.
Every one of the 346 first-pass findings was then re-checked against source across two adversarial passes. 328 confirmed (9 P0, 53 P1, 145 P2, 121 P3); 18 dropped as false-positive or already-handled. Nothing here is directional. Anti-slop verdict: fail as shipped, but from a small set of fixable tells (repeated eyebrow labels on about eighteen screens, one glassmorphism legend, a few generic empty states, run-on copy, and the hard-coded domain data below), not from pervasive slop.
These are not 64 separate problems. Both are centralised, and both have centralised fixes. Each one breaks a promise SyntheraOS makes about itself.
The finding is encouraging: the core ontology is neutral. A thin shell couples the app to its six seeded domains (water/desalination, energy, marine, aerospace, infrastructure, "other"). The fix is a preset registry, a custom path, and replacing hard-coded branches with data-driven ones. The six stay as presets; no one is locked out of them.
Turn the domain type from a fixed six-member union into a string id carrying its own vocabulary, standards and sample seeds. Ship the six as presets over a neutral core, and add a first-class "describe your industry" card in the new-project wizard with sensible neutral defaults.
EFFORT · L
The assistant brief reads from the project's own risk gaps. Risk prediction reads structural signals in the graph (single-source components, missing failover links, unlinked compliance, long-lead items, unverified requirements) with sector libraries as optional packs. Never silently fall back to desalination.
EFFORT · L
Make verification stages project-driven, so a software project shows unit / integration / UAT and a pharma project shows IQ / OQ / PQ with no code change. Drive readiness-gate titles, digital-twin lifecycle copy and role names from per-industry label maps with neutral fallbacks.
EFFORT · L
Widen the standards catalogue, add a software seed and a pharma seed to prove that adding an industry equals adding one more preset, and add a regression check that an unrecognised industry never renders desalination data anywhere.
EFFORT · M
Sequenced on purpose. The shared roots land first, so the per-screen work does not fight itself, and the largest accessibility and trust gains come from the cheapest, lowest-risk changes.
The three P0 roots
The -ink token swap across the colour helpers (about 28 screens), one global :focus-visible rule (no keyboard focus renders anywhere today, so this fixes about 20 screens at once), and stripping the hard-coded desalination data from the AI paths. Mostly mechanical, low risk.
The DomainPreset registry
The industry-neutrality root from the section above. More invasive, but it is the single change that makes the product genuinely sector-agnostic.
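As a rough illustration of what such a registry could look like, here is a minimal TypeScript sketch. The report names the DomainPreset registry but not its fields, so the interface shape, the `resolvePreset` and `termFor` helpers, the `"neutral"` id and the sample vocabulary entries are all assumptions made for the example. Only the stage names (unit / integration / UAT, IQ / OQ / PQ) come from the plan above.

```typescript
// Hypothetical preset shape. The field names are assumptions for this sketch,
// not the product's actual types.
interface DomainPreset {
  id: string; // open string id instead of a closed six-member union
  label: string;
  vocabulary: Record<string, string>; // neutral term -> industry term
  verificationStages: string[];
}

// Neutral core returned for any unrecognised id. Deliberately not desalination.
const NEUTRAL_PRESET: DomainPreset = {
  id: "neutral",
  label: "Custom industry",
  vocabulary: {},
  verificationStages: ["Verification"],
};

const registry = new Map<string, DomainPreset>();

function registerPreset(preset: DomainPreset): void {
  registry.set(preset.id, preset);
}

// Lookup never falls back to a seeded sector, only to the neutral core.
function resolvePreset(id: string): DomainPreset {
  return registry.get(id) ?? NEUTRAL_PRESET;
}

// Label lookup with a neutral fallback: unknown terms are shown unchanged.
function termFor(preset: DomainPreset, neutralTerm: string): string {
  return preset.vocabulary[neutralTerm] ?? neutralTerm;
}

// Adding an industry is adding one more preset (vocabulary entries are illustrative).
registerPreset({
  id: "software",
  label: "Software",
  vocabulary: { component: "module" },
  verificationStages: ["Unit", "Integration", "UAT"],
});
registerPreset({
  id: "pharma",
  label: "Pharma",
  vocabulary: { component: "equipment" },
  verificationStages: ["IQ", "OQ", "PQ"],
});

console.log(resolvePreset("software").verificationStages.join(" / "));
console.log(resolvePreset("brewing").id);
```

Under these assumptions, the regression check from the plan reduces to asserting that resolving an unknown id returns the neutral preset and that no rendered label comes from a seeded sector.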
Remaining accessibility and trust
Pair colour-only state with an icon or label plus ARIA, add accessible names to form controls, recompute a risk's derived score in the same action that edits it, make the graph, tree and Gantt operable by assistive tech, add the scroll pattern to clipping tables, fill in the missing empty states, and fix the open-issues count.
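One of the items above, recomputing a risk's derived score in the same action that edits it, can be sketched as a single pure update function. The `Risk` shape and the likelihood × impact formula are assumptions for illustration; the report does not say how the score is derived.

```typescript
// Hypothetical risk record. The real model may carry more fields.
interface Risk {
  id: string;
  likelihood: number;
  impact: number;
  score: number; // derived value, must never be stored stale
}

type RiskEdit = Partial<Pick<Risk, "likelihood" | "impact">>;

// Assumed formula for the sketch: score = likelihood * impact.
function deriveScore(likelihood: number, impact: number): number {
  return likelihood * impact;
}

// The edit and the recompute happen in one step, so no caller can
// observe an edited risk that still carries the old score.
function editRisk(risk: Risk, edit: RiskEdit): Risk {
  const likelihood = edit.likelihood ?? risk.likelihood;
  const impact = edit.impact ?? risk.impact;
  return { ...risk, likelihood, impact, score: deriveScore(likelihood, impact) };
}

const before: Risk = { id: "R-1", likelihood: 2, impact: 3, score: 6 };
const after = editRisk(before, { impact: 5 });
console.log(after.score); // 10 under the assumed formula
```

Returning a new object rather than mutating in place keeps the original record intact, which makes the edit easy to test and to undo.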
Polish
Migrate the hard-pixel font sizes to the rem scale for proper 200% zoom, and remove the eyebrow labels and the graph-legend glass effect, the visual tells that read as machine-made.
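The font-size migration is mostly arithmetic. A minimal sketch of the conversion, assuming a 16px root font size (the common browser default, not something the audit confirms for this app) and a hypothetical `pxToRem` helper:

```typescript
// Assumed root font size. Adjust if the app sets a different one on <html>.
const ROOT_FONT_SIZE_PX = 16;

// Converts a hard-coded pixel size to a rem string, trimming trailing zeros.
function pxToRem(px: number, rootPx: number = ROOT_FONT_SIZE_PX): string {
  const rem = px / rootPx;
  return `${parseFloat(rem.toFixed(4))}rem`;
}

console.log(pxToRem(14)); // "0.875rem"
console.log(pxToRem(16)); // "1rem"
```

Because rem values are relative to the root font size, text sized this way scales with it instead of staying pinned to a fixed pixel value.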
A deterministic, verifiable operation, not a long manual read. Seventy-six agents across the run, every claim checked by a different agent than the one that made it.
Each of the 64 screens got a dedicated reviewer scoring all seven lenses against the spec and a senior craft bar, returning structured findings with exact file and line locations.
A second skeptical agent re-opened the real file for every finding, P0 through P3, and tried to refute each one. 18 of 346 were discarded as false positive or already-handled; the remaining 328 are confirmed against the code.
A dedicated analysis read the intake, generator, intelligence and seed layers to map exactly where the app is hard-wired to a sector, and how to neutralise each point.
Scores, counts and the systemic patterns were aggregated in code, not guessed, then a high-effort lead reviewer wrote the cross-screen verdict and the remediation order.
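To show what aggregating in code can mean in practice, here is a small TypeScript sketch of a severity tally. The `Finding` shape and the `tally` function are hypothetical. The only real data used is the set of counts published in this report, which the last lines check for internal consistency.

```typescript
type Severity = "P0" | "P1" | "P2" | "P3";

// Hypothetical structured finding. The real record also carries file and line.
interface Finding {
  screen: string;
  severity: Severity;
  confirmed: boolean; // false = dropped by the adversarial pass
}

interface Tally {
  bySeverity: Record<Severity, number>;
  confirmed: number;
  dropped: number;
}

// Counts confirmed findings per severity and dropped findings overall.
function tally(findings: Finding[]): Tally {
  const bySeverity: Record<Severity, number> = { P0: 0, P1: 0, P2: 0, P3: 0 };
  let dropped = 0;
  for (const f of findings) {
    if (f.confirmed) bySeverity[f.severity] += 1;
    else dropped += 1;
  }
  const confirmed = bySeverity.P0 + bySeverity.P1 + bySeverity.P2 + bySeverity.P3;
  return { bySeverity, confirmed, dropped };
}

// Consistency check on the counts stated in this report.
const reported: Record<Severity, number> = { P0: 9, P1: 53, P2: 145, P3: 121 };
const reportedConfirmed = reported.P0 + reported.P1 + reported.P2 + reported.P3;
console.log(reportedConfirmed); // 328
console.log(346 - 18 === reportedConfirmed); // true
```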
An engineering OS should tell you the truth, so here is the one caveat about this report itself.
Every finding, all 346, was verified against the actual code, so nothing here is directional. The one honest caveat is the method itself: this is a static read of the source, not a runtime test. Contrast figures like ~1.9-2.5:1 are computed from the design-token values, not measured live in a browser, and the keyboard and screen-reader findings are read from the markup rather than driven through assistive tech. The next rung, if you want measured proof, is a Playwright and axe pass on the live build with real per-theme contrast and a keyboard walk of the graph, tree and timeline. The complete per-screen record, every finding with its location, impact and fix, is saved alongside this report.
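For readers who want to reproduce the token-based contrast figures, the calculation can be sketched as below. It follows the WCAG 2.x definitions of relative luminance and contrast ratio. The function names are hypothetical and the hex values are placeholders, not the product's actual design tokens.

```typescript
// Converts an 8-bit sRGB channel to linear light, per the WCAG 2.x definition.
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex colour such as "#1a2b3c".
function relativeLuminance(hex: string): number {
  const clean = hex.replace(/^#/, "");
  if (!/^[0-9a-fA-F]{6}$/.test(clean)) {
    throw new Error(`Expected a 6-digit hex colour, got "${hex}"`);
  }
  const n = parseInt(clean, 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// Placeholder colours, not real tokens: black on white is the maximum, 21:1.
console.log(contrastRatio("#000000", "#ffffff").toFixed(2));
```

WCAG AA asks for at least 4.5:1 for normal-size text, so figures in the ~1.9-2.5:1 range fall well short. A measured pass in a browser would still be needed to account for opacity, overlays and anything else the token values alone do not capture.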