AI audit vs rubric — Shopify Collective
An independent Workers AI LLM scored Shopify Collective against the same published rubric. The deterministic rubric result is our canonical score. The LLM's result is shown here as a sanity check — never mixed into the scoring formula.
| Dimension | Rubric | LLM | Δ (LLM − Rubric) |
|---|---|---|---|
| Pricing transparency | 84 | 45 | -39 |
| Business transparency | 85 | 80 | -5 |
| Shipping clarity | 80 | 65 | -15 |
| Public reviews | 85 | 0 | -85 |
| Product range | 75 | 70 | -5 |
| Access & onboarding | 95 | 90 | -5 |
| Support track record | 80 | 0 | -80 |
| Store integrations | 55 | 0 | -55 |
| Overall | 82 | 46 | -36 |
What this means: Large disagreement — investigate. The LLM read the published signals very differently from the deterministic rules.
Median per-dimension |Δ| = 27.
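The Δ column and the median figure can be reproduced from the table with a few lines of Python. This is a minimal sketch, not SupplierSpy's actual code: the names `scores`, `deltas`, and `median_abs_delta` are hypothetical, and it assumes the median is taken over the eight per-dimension rows, excluding Overall.

```python
from statistics import median

# (rubric, llm) pairs copied from the table above; the Overall row is excluded.
scores = {
    "Pricing transparency": (84, 45),
    "Business transparency": (85, 80),
    "Shipping clarity": (80, 65),
    "Public reviews": (85, 0),
    "Product range": (75, 70),
    "Access & onboarding": (95, 90),
    "Support track record": (80, 0),
    "Store integrations": (55, 0),
}

# Delta is LLM minus rubric, matching the table's last column.
deltas = {name: llm - rubric for name, (rubric, llm) in scores.items()}

# Median of the absolute per-dimension deltas.
median_abs_delta = median(abs(d) for d in deltas.values())

for name, delta in deltas.items():
    print(f"{name}: {delta:+d}")
print(f"Median |delta| = {median_abs_delta}")
```

With eight rows the median is the mean of the two middle absolute deltas (15 and 39), which gives 27. The three dimensions the LLM scored 0 account for the largest gaps.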
LLM explanation: "reviewScore 0 due to < stars or no reviews, integration 0 due to Manual / API only, support 0 due to Consistently poor support feedback"
This is the LLM's own explanation, not editorial commentary from SupplierSpy. The LLM result is a sanity check on the rubric and is never mixed into the scoring formula.