Generative AI Meets Data Quality: Innovation or Risk?
Speaker
Longtian Zhang
Time
Friday, January 9, 2026 3:00 PM - 4:00 PM
Venue
A3-2-303
Online
Zoom 435 529 7909
(BIMSA)
Abstract
The widespread adoption of Generative AI raises concerns about potential risks, particularly those arising from excessive reliance on AI. This paper examines both the benefits and drawbacks of this emerging technology through the lens of data quality. We develop a semi-endogenous growth model in which production depends on two types of data: AI-generated data and producer data, the latter representing real-world information. Although AI-generated data are substantially cheaper to produce, they are of lower quality, which leads to higher error rates in production. Our analysis shows that firms in a competitive equilibrium tend to underutilize both types of data relative to the optimal allocation. We further demonstrate that, although multiple Generative AI firms exist in the market, the optimal number is one. These findings support the case for government intervention in the AI industry.