Should AI agents scrape or integrate external data?
My latest on InfoWorld explores the upsides and caveats of both approaches.
To scrape or integrate? It's an age-old question resurfacing for AI agent builders.
Excited to share my analysis today for InfoWorld, where I break down when it makes sense to scrape public web sources and when official API integrations are the better choice for external data.
The takeaway: agents need data. New interactive browser tools and scraping techniques help pull in real-time, supplementary signals. But scraping comes with fragility and legal downsides. As Deepak Singh puts it, "It's building on quicksand."
Scraping is no substitute for the predictable, validated, and governed integrations agents need to execute auditable workflows and real-world actions reliably.
This article features, in order of appearance:
- Or Lenchner, CEO, Bright Data
- Deepak Singh, CEO and co-founder, AvairAI Inc.
- Neeraj Abhyankar, VP, Data and AI, R Systems
- Gaurav Pathak, VP of AI and metadata, Informatica
- Keith Pijanowski, AI and ML solutions engineer, MinIO
- Krishna Subramanian, co-founder and COO, Komprise
Also, shout-outs to reports from PwC (2025 AI Agents Survey), Tray.ai (2024 Enterprise Survey), Salt Security (2025 AI Agents Report), and McKinsey & Company (2025 State of AI Study), plus links to reporting from AI21 Labs, The Register, and WIRED.











