Blog Post

A large language model is fluent the moment it finishes training and frozen in the same instant. What it knows is fixed at a cutoff, while the markets, prices, documents, and events it might be asked about keep moving. Much of the recent progress in agents has been less about larger models than about closing that gap: connecting capable reasoning to the world as it currently is. The obstacle is mundane and stubborn. The public web was not built to be read by machines, and reliable access to it at scale is a real engineering problem. AGI House and Bright Data are setting aside an evening to build at that edge.

Why this, why now

Large language models are trained on a fixed corpus and reach a cutoff after which they have no direct knowledge of the world. A substantial line of work has developed to connect them to information beyond that snapshot. Retrieval-augmented generation, or RAG (Lewis et al., NeurIPS 2020), grounded generation in an external corpus retrieved at inference time. Reasoning-and-acting frameworks such as ReAct (Yao et al., 2022) and tool-use methods such as Toolformer (Schick et al., 2023) extended models to call external systems and fold the results back into their reasoning. The agentic framing that has since become standard, in which a model plans, calls tools, observes outcomes, and iterates, turned retrieval from a lookup over a static index into a loop of interaction with external systems. The Model Context Protocol (MCP), introduced by Anthropic in late 2024, standardized the interface between models and those external tools and data sources.
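The plan, call, observe, iterate loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any framework's actual API: `run_agent`, `stub_policy`, and `stub_search` are hypothetical names, the policy stands in for a model call, and the search tool returns canned text rather than live data.

```python
from typing import Callable, Dict, List, Tuple

# A tool maps a string argument to a string observation.
Tool = Callable[[str], str]
# A policy reads the history and returns ("call", tool_name, argument)
# or ("finish", answer, ""). In a real agent this would be a model call.
Policy = Callable[[List[str]], Tuple[str, str, str]]


def run_agent(question: str, policy: Policy, tools: Dict[str, Tool],
              max_steps: int = 5) -> str:
    """Ask the policy for an action, run the tool, record the observation, repeat."""
    history = [f"question: {question}"]
    for _ in range(max_steps):
        action, name, argument = policy(history)
        if action == "finish":
            return name  # for "finish", the second slot carries the answer
        if name not in tools:
            # Feed the failure back so the policy can react to it.
            history.append(f"observation: unknown tool {name!r}")
            continue
        observation = tools[name](argument)
        history.append(f"action: {name}({argument!r})")
        history.append(f"observation: {observation}")
    return "no answer within step budget"


def stub_search(query: str) -> str:
    """Hypothetical search tool returning canned text, not live data."""
    return "Example Corp closed at 101.50 today"


def stub_policy(history: List[str]) -> Tuple[str, str, str]:
    """Hypothetical policy: search once, then answer with the last observation."""
    if not any(line.startswith("observation:") for line in history):
        return ("call", "search", "Example Corp share price")
    return ("finish", history[-1].split(": ", 1)[1], "")


if __name__ == "__main__":
    print(run_agent("What did Example Corp close at?", stub_policy,
                    {"search": stub_search}))
```

The step budget matters in practice: a loop that talks to external systems needs a bound, because a tool that keeps failing would otherwise keep the agent iterating indefinitely.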

As these systems have come to treat the open web as their primary source of current information, a second problem has become salient: the public web is largely adversarial to automated access. Anti-bot systems, CAPTCHAs, geo-restrictions, rate limiting, and JavaScript-heavy rendering all sit between an agent and the content it is trying to read. Connecting an agent to the live web is therefore as much an infrastructure problem as a modeling one, and it is the problem that web data platforms can address by maintaining the proxy networks, unblocking techniques, and structured-extraction pipelines that make reliable access possible at scale.

A related shift concerns how web data is consumed. Historically it was collected for offline analysis: scraped, stored, and queried later. Increasingly it is a live input to inference happening in real time, under latency budgets measured in seconds or less. Freshness of retrieved information then becomes a property of the system rather than an afterthought, and the cost of stale or incorrect data is borne directly in the agent's output.
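One way to treat freshness as a system property is to carry a fetch timestamp with every retrieved document and enforce a maximum age before anything reaches the model. The sketch below assumes that design; `Retrieved` and `fresh_only` are hypothetical names, and the five-minute budget is only an example value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Retrieved:
    url: str
    text: str
    fetched_at: datetime  # timezone-aware time at which the page was fetched


def fresh_only(docs: List[Retrieved], max_age: timedelta,
               now: Optional[datetime] = None) -> List[Retrieved]:
    """Keep only documents fetched within max_age of now."""
    if now is None:
        now = datetime.now(timezone.utc)
    return [doc for doc in docs if now - doc.fetched_at <= max_age]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    docs = [
        Retrieved("https://example.com/a", "recent page", now - timedelta(seconds=30)),
        Retrieved("https://example.com/b", "stale page", now - timedelta(hours=2)),
    ]
    for doc in fresh_only(docs, max_age=timedelta(minutes=5), now=now):
        print(doc.url)
```

Passing `now` explicitly keeps the check deterministic in tests. A stale document can then trigger a re-fetch instead of silently shaping the answer.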

The technical landscape

The tooling for agentic web access falls into three categories:

Search: a query returns ranked, current results, with URLs, titles, and snippets, optionally targeted to a geography. This is the starting point when an agent does not yet know which pages it needs.
Fetch and extraction: given a known URL, an unlocker returns the page content as clean Markdown or JSON, handling anti-bot measures, CAPTCHA solving, and geo-restrictions so that the model receives usable text rather than raw or blocked HTML.
Browser automation: programmatic navigation, clicking, scrolling, and form interaction, used when the target content sits behind interaction or is rendered dynamically.

The appropriate choice follows from the task. Fresh results call for search; a known URL calls for the unlocker; structured data from a platform for which a scraper already exists calls for that prebuilt scraper; content behind interaction calls for the browser. When the decision is best left to the model, these capabilities can be exposed through MCP, which presents them as tools the model selects among and returns model-ready output by default. Bright Data offers this full surface (search, unlocker, prebuilt scrapers, browser, and pre-collected datasets) through a single MCP server, with a free tier of 5,000 requests per month.
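When the routing is done in code rather than left to the model, the decision rule above reduces to a short function. The sketch below is illustrative only: `Task`, `choose_tool`, the returned tool labels, and the `PREBUILT_SCRAPERS` set are hypothetical and do not correspond to actual MCP tool names.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of platforms for which a prebuilt scraper is assumed to exist.
PREBUILT_SCRAPERS = frozenset({"example-shop"})


@dataclass(frozen=True)
class Task:
    query: Optional[str] = None      # free-text question, if pages are not yet known
    url: Optional[str] = None        # known target page, if any
    platform: Optional[str] = None   # platform the data lives on, if known
    needs_interaction: bool = False  # content sits behind clicks, forms, or scrolling


def choose_tool(task: Task) -> str:
    """Map a task to a tool category, checking the most specific condition first."""
    if task.needs_interaction:
        return "browser"
    if task.platform in PREBUILT_SCRAPERS:
        return "prebuilt_scraper"
    if task.url:
        return "unlocker"
    if task.query:
        return "search"
    raise ValueError("task needs a query, a URL, or a known platform")


if __name__ == "__main__":
    print(choose_tool(Task(query="latest GPU prices")))
    print(choose_tool(Task(url="https://example.com/pricing")))
    print(choose_tool(Task(platform="example-shop", url="https://example.com/item/1")))
    print(choose_tool(Task(url="https://example.com/cart", needs_interaction=True)))
```

The order of the checks encodes a preference: interaction forces the browser, a prebuilt scraper is used where one exists, and search is the fallback when no page is known yet.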

One recent addition worth noting is Scraper Studio, which Bright Data introduced this year. It generates a scraper from a natural-language description of a target site and the fields to extract, runs it as a managed endpoint, and revises the extraction logic automatically when the site's structure changes. Conventional scrapers break when a site alters its markup, and keeping them running is a recurring maintenance cost; the managed, self-revising approach is aimed at reducing that cost and shortening the path from a target site to a usable data feed.

Where the interesting work is

Several areas of active work sit at the intersection of agents and live web data, and Bright Data's recent output is a useful indication of where practitioners are concentrating.

Continuous monitoring: Agents that maintain a current picture of a market, a competitor set, or a regulatory environment and respond when it changes. Early condition-triggered systems of this kind were narrow and hand-specified; capable models generalize them into agents that hold and update such a picture with less manual specification. Bright Data's Deep Lookup, a natural-language interface over large-scale web data aimed at enumerative "find all" questions, addresses the harder retrieval problems this entails.
Grounding and verifiability: Producing outputs that are not only fluent but correct and attributable remains an open problem. It is substantially a retrieval and access problem: it depends on current information, citations to sources, and re-verification before an agent commits to a claim.
The move from reading to acting: The frontier has extended from agents that summarize a page to agents that operate on the live web, comparing options, completing transactions, and acting against real prices and inventory.
Measurement of the AI-mediated web itself: A single query now returns different answers across Google's AI Overviews, Perplexity, and ChatGPT, varying by geography and language. Measuring how a topic, a source, or a brand is represented across these answer engines is a new and largely unstandardized problem.
Evaluation: Static benchmarks are prone to contamination once they enter training data, which has motivated interest in evaluation on fresh, live tasks that resist memorization. A related idea surfaced in a recent panel Bright Data hosted on robotics and world models: using a learned model of the world to evaluate candidate policies against the most informative real situations, analogous to how autonomous-driving systems are assessed against recorded long-tail encounters rather than live miles alone.
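The re-verification step named under grounding and verifiability can be approximated very simply: before committing to an answer, re-fetch each cited source and confirm the quoted passage still appears there. The sketch below assumes that approach. `Claim`, `unsupported_claims`, and the `fetch` callable are hypothetical, and a substring match is only a weak proxy, since it shows that the quote exists but not that it supports the statement.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Claim:
    statement: str   # what the agent intends to assert
    source_url: str  # where the supporting text was found
    quote: str       # the passage cited as support


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_claims(claims: List[Claim],
                       fetch: Callable[[str], str]) -> List[Claim]:
    """Return the claims whose quote is empty or no longer found in the source."""
    failed = []
    for claim in claims:
        quote = _normalize(claim.quote)
        page = _normalize(fetch(claim.source_url))
        if not quote or quote not in page:
            failed.append(claim)
    return failed


if __name__ == "__main__":
    # Canned page text standing in for a live fetch.
    pages = {"https://example.com/report": "Revenue   rose 12% in\nthe third quarter."}
    claims = [
        Claim("Revenue grew 12% in Q3", "https://example.com/report",
              "revenue rose 12% in the third quarter"),
        Claim("Margins doubled", "https://example.com/report", "margins doubled"),
    ]
    for claim in unsupported_claims(claims, lambda url: pages.get(url, "")):
        print("unsupported:", claim.statement)
```

An agent would drop or re-research whatever this returns rather than emit it with a citation that no longer holds.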

Project ideas

The following project ideas are scoped to the build window and runnable on the free tier. They vary in difficulty, and most are stronger when narrowed to a single concrete case.

A change-monitoring agent. Maintains a current snapshot of a set of products or competitors and surfaces changes, such as a price movement, a new listing, or an altered term of service. This is the simplest demonstrable form of an agent that holds a live model of some part of the world. Stack: SERP and Unlocker, or a prebuilt e-commerce scraper, through MCP; optionally a scheduled Scraper Studio endpoint.
A site-to-API exercise with Scraper Studio. Point Scraper Studio at a site of interest, specify the fields, and stand up a managed endpoint, then build a small application or analysis on top. This exercises Bright Data's most recent product directly, and the self-revising behavior is straightforward to demonstrate. Stack: Scraper Studio; a thin frontend or notebook.
A grounded question-answering agent. Searches, retrieves the most relevant sources as clean Markdown, and answers with inline citations and a re-verification pass over the cited material. This addresses the fluent-versus-correct problem directly. Stack: MCP search and Markdown extraction; optional orchestration with LangGraph or LlamaIndex.
An answer-engine visibility tracker. Given a topic or brand and a set of queries, compares how it is represented across Google AI Overviews, Perplexity, and ChatGPT by geography and language, and tracks change over time. This engages a new and underexplored measurement problem. Stack: SERP with geo parameters; extraction of the answer surfaces; a comparison step.
A browser-based task agent. Completes a multi-step task behind interaction, such as filtering, paginating, adding to a cart, or completing a form, and captures the result. This represents the acting-not-reading frontier and the widest gap between demonstration and production. Stack: MCP browser tools. Browser automation consumes more credit and is more failure-prone than search or extraction, so a single site and a single flow is the appropriate scope.
A contamination-resistant micro-evaluation. Constructs a small set of tasks from information that postdates common training cutoffs and compares a web-grounded agent against an unaided model. This is a compact way to study the effect of live access on correctness. Stack: SERP and Unlocker to construct the tasks; a scoring harness; an A/B comparison.
An enumerative research agent. Takes a single hard "find all" question and runs it end to end, through search, iterative retrieval, aggregation, and deduplication, mirroring the problem Deep Lookup is designed for. Stack: SERP with iterative Unlocker calls; an aggregation step.
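For the change-monitoring idea at the top of this list, the core is a diff between two snapshots. The sketch below assumes each snapshot is a mapping from item ID to a dictionary of string fields; `Snapshot`, `diff_snapshots`, and the sample SKUs are hypothetical, and collecting the snapshots themselves is left to whichever search or extraction tool is in use.

```python
from typing import Dict, List

# item ID -> field name -> value, for example {"sku-1": {"price": "19.99"}}
Snapshot = Dict[str, Dict[str, str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[str]:
    """Describe new listings, removed items, and changed fields between snapshots."""
    changes = []
    for item_id in sorted(new.keys() - old.keys()):
        changes.append(f"new listing: {item_id}")
    for item_id in sorted(old.keys() - new.keys()):
        changes.append(f"removed: {item_id}")
    for item_id in sorted(old.keys() & new.keys()):
        before, after = old[item_id], new[item_id]
        for field in sorted(before.keys() | after.keys()):
            if before.get(field) != after.get(field):
                changes.append(
                    f"{item_id}.{field}: {before.get(field)!r} -> {after.get(field)!r}"
                )
    return changes


if __name__ == "__main__":
    yesterday = {"sku-1": {"price": "19.99"}, "sku-2": {"price": "5.00"}}
    today = {"sku-1": {"price": "17.99"}, "sku-3": {"price": "9.00"}}
    for change in diff_snapshots(yesterday, today):
        print(change)
```

Sorting the keys keeps the output stable from run to run, which makes it easier to alert only on changes that are actually new.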

What these projects share is a single move: connecting a capable model to current information and letting it act on what it finds. We look forward to seeing what comes of it. See you soon.

"The Web Wasn't Built for Agents" - A Primer for the Real-Time Agents Build Evening
