In our previous article following our 2026 Agent Skills Build Day, we explored the emergence of physical world models as the foundation for robotic and embodied AI. However, a parallel and equally significant frontier is the development of social world models. These systems move beyond predicting the physical properties of a falling object to predicting the complex, often nonlinear behaviors of human beings within social systems. We define social agents as generative AI entities that do not merely respond to prompts but inhabit a persistent behavioral state, allowing them to mirror, predict, and prototype human interactions at scale. The promise of social simulation lies in its potential to serve as a digital social sandbox, where the consequences of a new policy, a product launch, or a social norm can be observed in a synthetic environment before being tested in the real world.

The technical baseline for this memo (and arguably the field) is established by Joon Sung Park’s dissertation, "Generative Agent Simulations of Human Behavior." Park’s work addressed the fundamental flaw in traditional agent-based modeling: the reliance on oversimplified, rule-based logic that fails to capture the idiosyncratic nature of human life. He introduced a generative agent architecture that equips LLMs with a memory stream, a reflection mechanism, and recursive planning. This allows an agent to observe its environment, store those observations, periodically synthesize them into higher-level goals, and then execute long-term plans that adapt to new information. In a landmark study of 1,052 U.S. adults, Park demonstrated that agents grounded in two-hour personal interviews achieved a normalized accuracy of 0.85, meaning they represented an individual's attitudes and economic behaviors 85 percent as consistently as the individual replicated their own answers when retested two weeks later. This result showed that social agents could move beyond demographic stereotypes to become high-fidelity simulacra of specific human lives.
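The observe/store/reflect/plan loop described above can be sketched in miniature. The following is illustrative only: the class names, the importance-threshold trigger for reflection, and the retrieval scoring are invented simplifications standing in for the LLM-driven reflection and the recency/importance/relevance retrieval described in the dissertation.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    """One entry in the agent's memory stream."""
    timestamp: float
    text: str
    importance: float  # 0..1, how salient the observation is

@dataclass
class GenerativeAgent:
    """Minimal sketch of a Park-style architecture: a memory stream,
    periodic reflection, and retrieval over stored observations."""
    name: str
    memories: list[Memory] = field(default_factory=list)
    reflections: list[str] = field(default_factory=list)
    reflection_threshold: float = 2.0  # accumulated importance that triggers reflection
    _importance_since_reflection: float = 0.0

    def observe(self, text: str, importance: float) -> None:
        self.memories.append(Memory(time.time(), text, importance))
        self._importance_since_reflection += importance
        if self._importance_since_reflection >= self.reflection_threshold:
            self.reflect()

    def reflect(self) -> None:
        # In the real architecture an LLM synthesizes recent memories into
        # higher-level insights; here we just collect the salient ones.
        salient = [m.text for m in self.memories[-5:] if m.importance > 0.5]
        if salient:
            self.reflections.append("noticed pattern in: " + "; ".join(salient))
        self._importance_since_reflection = 0.0

    def retrieve(self, query: str, k: int = 3) -> list[Memory]:
        # Real retrieval scores recency, importance, and embedding relevance
        # against the query; this stand-in ranks by importance with a small
        # recency bonus and ignores the query text.
        ranked = sorted(enumerate(self.memories),
                        key=lambda im: im[1].importance + 0.01 * im[0],
                        reverse=True)
        return [m for _, m in ranked[:k]]

agent = GenerativeAgent("maria")
agent.observe("the cafe on 5th street closed", importance=0.9)
agent.observe("it is raining", importance=0.2)
agent.observe("a friend recommended a new cafe", importance=0.8)
agent.observe("keys are on the table", importance=0.3)
print(len(agent.reflections))  # reflection fires once enough importance accumulates
```

The key design point the sketch preserves is that reflection is event-driven (triggered by accumulated salience) rather than scheduled, which is what lets long-term goals emerge from raw observations.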

Building on this academic foundation, Simile AI emerged from stealth in early 2026 with a $100 million raise led by Index Ventures. Co-founded by Park alongside Stanford professors Michael Bernstein and Percy Liang, Simile represents the commercialization of the “Behavioral Foundation Model.” Their approach is distinct in its use of what they call digital twins for enterprise decision-making. Rather than querying a generic model about consumer behavior, Simile populates simulations with agents derived from thousands of real-world interviews and transaction logs. This methodology has already yielded striking real-world results, such as a case where Simile correctly predicted eight out of ten questions asked by analysts during a simulated corporate earnings call. Major firms like CVS Health and Telstra are now using these simulations to replace traditional focus groups, moving from retrospective polling to proactive, real-time behavioral forecasting.

Case Study: Roblox’s Social Data Flywheel

David Baszucki and Erik Cassel launched Roblox in 2006 with an explicit vision: not a game, but an entirely new category of human co-experience, built on the belief that the more accurately a platform could simulate the real world, the more utility it could provide. The initial product was a 3D physics sandbox, inspired by their prior educational simulation software. What emerged over the next two decades was something considerably stranger and more interesting: the largest unintentional social laboratory in history.

The social evolution of Roblox unfolded in stages. Through the 2010s, the platform was primarily a game creation and distribution layer—users made experiences, other users played them. Then the pandemic created a qualitative shift: Roblox introduced "party place" spaces for virtual meetups and concerts, and Baszucki articulated a 30-year vision for users to have fully-fledged digital identities within the platform—a vision he formalized in the 2021 S-1 as the concept of "human co-experience."

By 2024, the platform was processing over six billion chat messages per day across 85 million daily active users, the majority of them under 16. Every avatar movement, object interaction, physics-based collision, chat message, and virtual transaction became a data point—the largest and most unique training dataset for interactive AI in the world. This was not passive observational data, but rather action-labeled, causally annotated, and timestamped—precisely the format that makes it valuable for training agents that need to learn from consequences, not just appearances.

The pivot to explicit social and agent infrastructure began accelerating around 2022-2023. Roblox shipped voice chat with age verification, then facial animation via real-world motion tracking. At RDC 2025, the company announced Text-to-Speech and Speech-to-Text APIs enabling dynamic NPC conversation, real-time voice chat translation across English, Spanish, French, and German, and first-of-their-kind AI capabilities, including the generation of fully functional 4D objects. The declared ambition was to support 100,000 simultaneous users on a single server with photorealism and imperceptible latency—a technical specification that is also a social architecture specification.

The social dimension here is underappreciated by analysts focused on Roblox's AI content creation story. What Roblox is actually building is an environment where agents and humans coexist and interact, and where the emergent social dynamics of that coexistence generate the training data for better agents. A meaningful NPC in a Roblox experience isn't just a game feature—it's a social participant that affects how players behave, what relationships they form, and what content they create. The deeper bet, which Baszucki has gestured toward without fully articulating, is that platforms optimized for human co-experience are the natural substrate for social agent deployment. Roblox's AI-driven discovery engine steers users toward social experiences, reinforcing the platform's core identity. When agents populate those experiences, the social signals that users generate in response—how long they stay, who they talk to, what they build—become the feedback loop that trains future agents. This is a compounding dynamic with no analog in enterprise software.

Governance, Business, and Individual Utility

One of the most exciting and broadly beneficial applications of social simulation is in governance. In the traditional policy lifecycle, decision-makers often rely on retrospective data, such as census reports or polls that are months old, to guess how a population will react to a new regulation. Simulation shifts this into a proactive, high-fidelity environment. A primary example is the Simile-Gallup partnership, which by early 2026 had begun to offer synthetic national panels. Instead of waiting weeks for a 1,000-person survey to clear, a policymaker can query a representative agent bank to see, for example, how different tax credit structures might affect the spending habits of families across socioeconomic backgrounds, allowing governments to anticipate and minimize otherwise unforeseen repercussions of policy decisions.
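The synthetic-panel query pattern reduces to a loop over policy scenarios and agent profiles. The sketch below is entirely hypothetical: `simulate_spending_response` is a stand-in for prompting an interview-grounded digital twin, and the propensity figures and panel data are invented for illustration.

```python
import random
from statistics import mean

def simulate_spending_response(agent: dict, policy: dict, rng: random.Random) -> float:
    """Hypothetical stand-in for querying one grounded digital twin.
    Returns the agent's projected change in monthly discretionary spending."""
    credit = policy["credit_per_child"] * agent["children"]
    # Toy assumption: lower-income households spend a larger share of a marginal dollar.
    propensity = 0.9 if agent["income"] < 50_000 else 0.4
    return credit * propensity * rng.uniform(0.8, 1.2)

def run_panel(agents: list[dict], policies: list[dict], seed: int = 0) -> dict:
    """Query every agent in the synthetic panel under each policy scenario
    and aggregate, the way a policymaker would query an agent bank."""
    rng = random.Random(seed)
    results = {}
    for policy in policies:
        responses = [simulate_spending_response(a, policy, rng) for a in agents]
        results[policy["name"]] = mean(responses)
    return results

panel = [
    {"income": 35_000, "children": 2},
    {"income": 80_000, "children": 1},
    {"income": 42_000, "children": 3},
]
policies = [
    {"name": "flat_credit", "credit_per_child": 100},
    {"name": "expanded_credit", "credit_per_child": 250},
]
print(run_panel(panel, policies))
```

The value of the pattern is not the arithmetic but the turnaround: the same query structure that takes weeks with a human panel runs in minutes, and can be re-run across arbitrarily many policy variants.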

Simulation is also being used to model complex social stressors. The United Nations University (UNU) has pioneered the use of these agents to simulate planned relocation due to climate change. By populating a virtual environment with agents grounded in the cultural values and economic constraints of specific vulnerable communities, they can observe where social friction (such as land disputes or cultural erasure) is likely to occur years before the actual move. Similarly, researchers use social simulacra to stress-test platform moderation online. For instance, before launching a new subreddit or community rule, moderators can run a 48-hour simulation in a sandbox environment, rather than on real users, to see whether the rule effectively curbs toxicity or unintentionally stifles productive debate.

The most immediately profitable application is in the commercial space. The multi-billion-dollar global insights industry faces disruption as AI-powered alternatives deliver results in hours versus weeks at a fraction of the $5,000–$20,000 per-session cost of traditional focus groups. CVS Health is already using Simile to test product placement decisions across simulated populations before committing to physical shelf changes at scale. In addition, there is a democratization aspect that follows the enterprise story: the same technology that allows CVS to test product placement can allow a ten-person startup to test messaging, a nonprofit to test community programs, or a solo creator to test content.

The Individual Utility: Social Rehearsal and Life Design

While policy applications focus on the macro-scale, the utility for individuals lies in asymmetric social preparation. We call this "social rehearsal"—the ability to use a digital twin as a low-stakes training ground for high-stakes human interactions. For example:

  • Social skills development and rehearsal. Research from CHI 2025 demonstrated that social simulation—including LLM-powered environments—can effectively serve as rehearsal spaces for individuals experiencing social anxiety, allowing them to practice encounters and develop coping strategies in a safe, controllable environment. The exposure therapy application is well-established in clinical settings; what generative agents enable is scaling that application outside of clinical settings and into daily life. Someone preparing for a difficult conversation with a manager, a first date, or a family confrontation could rehearse with a simulated counterpart calibrated to that specific relationship dynamic. This is not companionship—it's preparation. The distinction matters for both the product design and the ethical framing.
  • Organizational and team dynamics. Companies running strategic planning exercises, conflict resolution processes, or scenario analysis already spend significant resources on human-facilitated simulations. Agent-based simulations could run these at higher speed, larger scale, and with more controllable variables. A corporate team preparing for a merger integration, a school designing a new discipline policy, a neighborhood association debating a zoning change—all of these involve the same fundamental question: given this population and these rules, what happens?
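The "given this population and these rules, what happens?" question is precisely what even a minimal agent-based simulation answers. A toy Granovetter-style threshold model (an invented illustration, not any vendor's implementation) shows the shape: agents adopt a new behavior once peer adoption crosses their personal threshold, and a single rule parameter can flip the outcome from full cascade to stall.

```python
def simulate(population: list[float], rule_threshold: float, rounds: int = 20) -> float:
    """Run a threshold-adoption simulation and return the final adoption rate.
    population: each agent's personal threshold (share of adopters they need
    to see before joining). rule_threshold: a policy knob that scales how
    much peer adoption each agent requires under the rule being tested."""
    adopted = [t <= 0.0 for t in population]  # zero-threshold agents start it
    for _ in range(rounds):
        share = sum(adopted) / len(adopted)   # adoption visible this round
        for i, t in enumerate(population):
            if not adopted[i] and share >= t * rule_threshold:
                adopted[i] = True
    return sum(adopted) / len(adopted)

# Invented population of personal thresholds, evenly spread.
thresholds = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(simulate(thresholds, rule_threshold=1.0))  # baseline rule: full cascade
print(simulate(thresholds, rule_threshold=2.0))  # stricter rule: the cascade stalls
```

The point of the toy is the sensitivity: a small change to the rules produces a qualitative change in the population outcome, which is exactly why organizations would want to probe such dynamics in simulation before committing to a policy.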

Current Roadblocks

Despite the technical promise, a significant countermovement of legal and ethical scholars warns that we are entering a plausibility trap. They argue that because these agents are powered by LLMs, they are prone to a statistical collapse in which agents drift toward a generic, polite mean, erasing the edge cases and idiosyncratic irrationality that define true human behavior. Critics suggest that while a simulation might look human, it lacks the internal causal motivation necessary to predict behavior in unprecedented crises. If a simulation is used to justify a high-stakes policy and fails because it could not account for genuine human rage or despair, the error could have catastrophic real-world consequences.

This leads to the most complex frontier: the legal status of the digital twin. Legal theorists have begun championing the data dominion model, arguing that a high-fidelity AI replica is not a corporate asset but an extension of the natural person’s identity. Current laws are ill-equipped for this; for example, in the U.S., health data is protected by HIPAA, but once that data is used to train an agent, the resulting model is often treated as the proprietary intellectual property of the company.

The data dominion movement advocates for a new social contract where individuals have an “inalienable right to likeness,” meaning they can revoke consent for their digital twin at any time, forcing the company to untrain or delete the specific behavioral weights associated with them. As we move toward 2027, the friction between corporate IP law and individual identity rights will likely lead to landmark litigation—specifically around whether an unauthorized simulation of a person constitutes a new form of identity extraction or even digital colonialism when used to profit from their behavioral patterns without compensation.

Investment Hypotheses

The transition from text-based chatbots to persistent social agents creates several high-conviction investment opportunities. We believe the following four segments will define the next phase of this market:

  • The Behavioral Grounding Infrastructure: As the value of social simulation rises, platforms will begin to collect and verify the high-fidelity human data required to ground agents. This includes startups building longitudinal interview pipelines, privacy-preserving ways to turn personal histories into agentic memories, and companies that can provide certified representative populations for specific use cases and domains.
  • Cognitive Continuity and Persistence Layers: There is a massive opportunity for middleware that solves the persona drift problem. This involves developing sophisticated memory architectures that can maintain an agent’s unique identity, goals, and relationship history over months of simulated time. These "identity engines" will be essential for any platform, from gaming to enterprise market research, that requires long-term behavioral consistency.
  • Action-Labeled Social Repositories: Just as OpenAI reportedly sought to acquire gaming clip platforms for their physical data, we expect a race to acquire large repositories of action-labeled social interaction data. Platforms like Roblox, Discord, or niche multiplayer environments are sitting on the training data necessary to teach AI the unspoken rules of human coordination. Investing in companies that control these social trajectories provides a durable moat against generic model providers.
  • Evaluation and Benchmarking for Social Realism: There is currently no generally accepted test for comparing and validating different social simulations. We believe there is a significant opening for firms that develop rigorous benchmarking standards to evaluate how well a simulation mirrors ground-truth social science. This includes frameworks for detecting bias, measuring emergent social phenomena, and verifying the causal accuracy of agent decisions, which will be the prerequisite for using simulations in high-stakes governance and corporate strategy.
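Two building blocks of such a benchmark can be sketched directly: a drift metric for behavioral consistency over simulated time, and a normalized-accuracy score in the spirit of the evaluation in Park's study. The probe battery and answers below are invented toy data; a real benchmark would use validated survey instruments and far larger batteries.

```python
def consistency_score(responses_t0: list[str], responses_t1: list[str]) -> float:
    """Fraction of probe questions answered identically at two points in
    simulated time; a crude persona-drift metric (1.0 means no drift)."""
    assert len(responses_t0) == len(responses_t1)
    same = sum(a == b for a, b in zip(responses_t0, responses_t1))
    return same / len(responses_t0)

def normalized_accuracy(agent_answers: list[str],
                        human_answers: list[str],
                        human_retest_answers: list[str]) -> float:
    """Agent-vs-human accuracy divided by the human's own test-retest
    consistency, so a score of 1.0 means the agent matches the person
    as well as the person matches themselves."""
    n = len(human_answers)
    agent_acc = sum(a == h for a, h in zip(agent_answers, human_answers)) / n
    human_acc = sum(h == r for h, r in zip(human_answers, human_retest_answers)) / n
    return agent_acc / human_acc

# Invented probe battery: categorical answers to five attitude questions.
human  = ["agree", "disagree", "agree", "neutral", "agree"]
retest = ["agree", "disagree", "agree", "agree", "agree"]      # human matches self on 4/5
agent  = ["agree", "disagree", "agree", "neutral", "disagree"]  # agent matches human on 4/5
print(normalized_accuracy(agent, human, retest))
```

Normalizing against the subject's own test-retest consistency matters because humans are not perfectly consistent with themselves; without it, a benchmark would penalize agents for noise no simulation could eliminate.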