Why a 94% Citation Hallucination in Grok-3 Forced a Rethink of Factuality Benchmarks
Grok-3 hit a 94% citation hallucination rate while the FACTS benchmark reported a score of 68.8: hard numbers that changed production risk estimates. The data suggests the situation was worse than the vendor materials implied.