How the Bureau scores tools and classifies news
This page covers the complete methodology in two parts. Jump to the section you need.
StackScore Tools™
Every AI tool is scored across 4 intelligence layers by a team of specialist agents. No single agent decides. Rank applies a fixed formula and Pulse audits every run before a score goes live.
Rank is the only agent that writes the final stackscore. The weights below are fixed — no agent can override them.
stackscore = ROUND(
    operational_score      × 0.40   // 40%: Can it improve real workflows?
  + trust_score            × 0.25   // 25%: Can it be trusted operationally?
  + market_score           × 0.20   // 20%: Does it matter in the ecosystem?
  + infrastructure_score   × 0.15   // 15%: Can it anchor a durable AI stack?
)
Operational: Does it work in real workflows?
Trust: Can you trust it with real data?
Market: Does it matter in the ecosystem?
Infrastructure: Can it anchor a durable stack?
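As an illustration only, the weighted formula above could be sketched in Python as follows. The weights and field names come from the formula; the 0 to 100 scale for each layer score and the round-half-up reading of ROUND are assumptions, since this page does not state them. The function name and the example values are hypothetical.

```python
import math

# Fixed weights from the formula above; they sum to 1.0.
WEIGHTS = {
    "operational_score": 0.40,
    "trust_score": 0.25,
    "market_score": 0.20,
    "infrastructure_score": 0.15,
}


def stackscore(scores: dict[str, float]) -> int:
    """Weighted sum of the four layer scores, rounded to a whole number.

    Assumptions (not stated on this page): each layer score is on a
    0-100 scale, and ROUND means round-half-up.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    weighted = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return math.floor(weighted + 0.5)


# Hypothetical layer scores for a single tool.
example = {
    "operational_score": 80,
    "trust_score": 70,
    "market_score": 60,
    "infrastructure_score": 90,
}
print(stackscore(example))  # 32 + 17.5 + 12 + 13.5 = 75
```

Because the weights are held in a single constant and applied in one place, the sketch mirrors the rule that only one agent writes the final score and that no caller can substitute different weights.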
Every score carries a confidence value between 0 and 1. Low confidence indicates fewer sources, high variance, or missing data. Rank cannot inflate confidence; it can only cap it.
base = average(operational_conf, trust_conf, market_conf, infra_conf)

penalties:
  −0.10 if ANY dimension evidence_count < 3
  −0.15 if score spread (max_dim − min_dim) > 35
  −0.05 if ANY dimension confidence < 0.60
  −0.08 if total evidence_count < 10

bonuses:
  +0.05 if ALL dimension evidence_count >= 8
  +0.03 if ALL dimension confidence >= 0.75

floor: 0.40
ceiling: 0.97 (cannot exceed 0.90 unless evidence_count ≥ 12)
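A minimal sketch of how these rules might combine, assuming the penalties and bonuses are all added to the base before the floor and ceiling are applied. The page does not say whether the evidence_count in the ceiling rule is the total or a per-dimension count; this sketch assumes the total. The Dimension class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dimension:
    """One scored layer (hypothetical container for this sketch)."""
    score: float        # layer score, assumed to be on a 0-100 scale
    confidence: float   # per-dimension confidence, 0-1
    evidence_count: int


def overall_confidence(dims: list[Dimension]) -> float:
    """Apply the base, penalty, bonus, floor, and ceiling rules listed above."""
    base = sum(d.confidence for d in dims) / len(dims)
    total_evidence = sum(d.evidence_count for d in dims)
    scores = [d.score for d in dims]

    adjustment = 0.0
    # Penalties
    if any(d.evidence_count < 3 for d in dims):
        adjustment -= 0.10
    if max(scores) - min(scores) > 35:
        adjustment -= 0.15
    if any(d.confidence < 0.60 for d in dims):
        adjustment -= 0.05
    if total_evidence < 10:
        adjustment -= 0.08
    # Bonuses
    if all(d.evidence_count >= 8 for d in dims):
        adjustment += 0.05
    if all(d.confidence >= 0.75 for d in dims):
        adjustment += 0.03

    # Assumption: the 0.90 cap is lifted once TOTAL evidence reaches 12.
    ceiling = 0.97 if total_evidence >= 12 else 0.90
    return min(max(base + adjustment, 0.40), ceiling)


# Hypothetical well-evidenced tool: base 0.80, both bonuses, no penalties.
strong = [Dimension(s, 0.80, 10) for s in (70, 75, 80, 72)]
print(round(overall_confidence(strong), 2))  # 0.88
```

Clamping is done last so that no combination of bonuses can push a value past the ceiling, which matches the statement that confidence can be capped but not inflated.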
Rank sets these badges on each tool after every evaluation. They appear on tool pages.
9 agents run in order for every tool evaluation — Insta directs, 8 specialists execute. Each step writes to agent_runs.
StackScore News™
Every AI news article is classified by Flash before it appears on the site. Flash must fetch the full article, find corroborating sources, and assign a credibility label. Flash cannot score from a headline alone.
Flash assigns exactly one label to every article. The label appears as a badge on the news feed and on each article page.
Flash scores every article across 7 dimensions. Credibility is the primary driver of the narrative label; Signal-to-Hype Ratio captures the ratio of operational evidence to promotional language. All 7 scores are stored and displayed on article pages.