Digital Event Horizon
A new study from Columbia Journalism Review's Tow Center for Digital Journalism reveals that generative AI models used for news searches are rife with inaccuracies, answering more than 60 percent of queries about news sources incorrectly. The study raises concerns about reliability and highlights the need for greater scrutiny of these models.
Key findings from the study:
- More than 60 percent of queries about news sources were answered incorrectly.
- Rather than declining when they lacked reliable information, the models frequently offered confabulations: plausible-sounding but incorrect or speculative answers.
- Premium paid versions of these AI search tools fared worse than their free counterparts, posting higher overall error rates because of their reluctance to decline uncertain responses.
- Some AI tools ignored Robots Exclusion Protocol settings, undermining publisher control over crawler access.
- Citations often directed users to syndicated versions of content rather than original publisher sites, even where formal licensing agreements were in place.
- URL fabrication was a significant problem, with more than half of the citations from some tools leading to fabricated or broken URLs.
In a recent study published by Columbia Journalism Review's Tow Center for Digital Journalism, researchers have uncovered alarming accuracy issues with generative AI models used for news searches. The research tested eight AI-driven search tools equipped with live search functionality and discovered that the AI models incorrectly answered more than 60 percent of queries about news sources.
The study, which ran 1,600 queries across the eight generative search tools, revealed a common trend: rather than declining to respond when they lacked reliable information, the models frequently provided confabulations, plausible-sounding but incorrect or speculative answers. This behavior was consistent across all tested models, not limited to just one tool.
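To make the scoring concrete, here is a minimal sketch of how such results might be tallied, assuming the three outcome categories the study's framing implies (correct, incorrect, declined). The function and sample numbers are illustrative, not the Tow Center's actual grading code.

```python
from collections import Counter

def summarize(gradings: list[str]) -> dict[str, float]:
    """Tally graded responses into correct / incorrect / declined rates."""
    counts = Counter(gradings)
    total = len(gradings)
    return {
        "correct_rate": counts["correct"] / total,
        "error_rate": counts["incorrect"] / total,
        "decline_rate": counts["declined"] / total,
    }

# A model that rarely declines turns every uncertain query into a
# potential confabulation, which inflates its error rate.
example = ["correct"] * 70 + ["incorrect"] * 120 + ["declined"] * 10
print(summarize(example))  # {'correct_rate': 0.35, 'error_rate': 0.6, 'decline_rate': 0.05}
```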
The researchers found that premium paid versions of these AI search tools fared even worse in certain respects. Perplexity Pro ($20/month) and Grok 3's premium service ($40/month) confidently delivered incorrect responses more often than their free counterparts. Although these premium models correctly answered a higher number of prompts, their reluctance to decline uncertain responses drove higher overall error rates.
Furthermore, the study highlighted issues with citations and publisher control. Some AI tools ignored Robots Exclusion Protocol settings (robots.txt), which publishers use to signal which content crawlers may not access. For example, Perplexity's free version correctly identified all 10 excerpts from paywalled National Geographic content, despite National Geographic explicitly disallowing Perplexity's web crawlers.
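For context, the Robots Exclusion Protocol is advisory: publishers post directives in robots.txt that well-behaved crawlers are expected to check before fetching a page, but nothing technically enforces them. Below is a minimal sketch of that check using Python's standard urllib.robotparser; the domain and user-agent string are illustrative, not National Geographic's actual directives or any vendor's real bot name.

```python
from urllib import robotparser

# A compliant crawler consults robots.txt before fetching a page.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example-publisher.com/robots.txt")
rp.read()  # fetch and parse the publisher's robots.txt

if rp.can_fetch("ExampleAIBot", "https://www.example-publisher.com/premium/article"):
    print("robots.txt permits fetching this page")
else:
    print("Disallowed: a compliant crawler stops here")
```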
Moreover, the study revealed that even when these AI search tools cited sources, they often directed users to syndicated versions of content on platforms like Yahoo News rather than original publisher sites. This occurred even in cases where publishers had formal licensing agreements with AI companies.
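A crude way to spot this pattern is to compare the cited URL's host against the original publisher's domain. The sketch below is a simplified heuristic with illustrative domains, not the study's methodology.

```python
from urllib.parse import urlparse

def cites_original(citation_url: str, publisher_domain: str) -> bool:
    """True if the citation's host is the publisher's domain or a subdomain of it."""
    host = urlparse(citation_url).netloc.lower()
    return host == publisher_domain or host.endswith("." + publisher_domain)

print(cites_original("https://www.example-publisher.com/story", "example-publisher.com"))  # True
print(cites_original("https://news.yahoo.com/story", "example-publisher.com"))             # False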
URL fabrication emerged as another significant problem. More than half of the citations from Google's Gemini and Grok 3 led users to fabricated or broken URLs that resolved to error pages. Of 200 citations tested from Grok 3, 154 resulted in broken links.
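Checking for this class of failure is mechanically straightforward. The sketch below spot-checks citation URLs with HEAD requests using Python's standard library; it is an illustration in the spirit of the study's citation test, not the Tow Center's actual tooling, and the example URLs are made up.

```python
import urllib.error
import urllib.request

def is_broken(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL fails to resolve to a successful response."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        urllib.request.urlopen(req, timeout=timeout)
        return False
    except urllib.error.HTTPError:
        return True  # the server answered with an error page (e.g. 404)
    except (urllib.error.URLError, ValueError):
        return True  # unreachable or malformed, possibly a fabricated URL

citations = [
    "https://www.example.com/real-article",
    "https://www.example.com/fabricated-slug",
]
broken = [u for u in citations if is_broken(u)]
print(f"{len(broken)} of {len(citations)} citations failed to resolve")
```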
These issues create significant tension for publishers, who face difficult choices: blocking AI crawlers may mean losing attribution entirely, while permitting them allows widespread reuse of content without driving traffic back to publishers' own websites.
Mark Howard, chief operating officer at Time magazine, expressed concern to Columbia Journalism Review about ensuring transparency and control over how Time's content appears via AI-generated searches. Despite these issues, Howard sees room for improvement in future iterations, stating, "Today is the worst that the product will ever be," citing substantial investments and engineering efforts aimed at improving these tools.
However, Howard also shifted some of the blame onto users, suggesting it's their own fault if they aren't skeptical of free AI tools' accuracy: "If anybody as a consumer is right now believing that any of these free products are going to be 100 percent accurate, then shame on them."
OpenAI and Microsoft provided statements to Columbia Journalism Review acknowledging the findings but did not directly address the specific issues. OpenAI reiterated its commitment to supporting publishers by driving traffic through summaries, quotes, clear links, and attribution, while Microsoft stated that it adheres to the Robots Exclusion Protocol and publisher directives.
The latest report builds on previous findings published by the Tow Center in November 2024, which identified similar accuracy problems in how ChatGPT handled news-related content.
In conclusion, the Tow Center's findings underscore the need for greater scrutiny of generative AI models used for news searches, and for publishers to take a proactive approach to ensuring transparency and control over their content in the face of rapid technological advancement.
Related Information:
https://www.digitaleventhorizon.com/articles/A-New-Wave-of-Concern-Generative-AI-Models-Rampant-Inaccuracies-Exposed-deh.shtml
https://arstechnica.com/ai/2025/03/ai-search-engines-give-incorrect-answers-at-an-alarming-60-rate-study-says/
Published: Fri Mar 14 02:42:47 2025 by llama3.2 3B Q4_K_M