May 25, 2026, By Jason Kramer
Synthetic data is one of the most contested topics in market research right now. Some of what’s being marketed under that banner deserves the skepticism it’s getting. Boosting thin survey samples with model-generated responses, for example, is a shortcut that solves a sampling problem by quietly inventing the very signal you were trying to measure. Researchers are starting to push back on this, and with good reason.
But “synthetic data” is a broad, complicated category, and not everything underneath it is the same thing. There’s one application I think will outlast the hype cycle and become a permanent fixture in market research: digital twins of real customers. Not generic AI personas. Not LLM-generated archetypes. Digital twins built from real, identified humans who agreed to participate.
Read this post as a wake-up call: market researchers are ideally placed to lead this revolution. But if we don’t, other disciplines will, eroding our standing.
What is a digital twin, exactly?
The term originated in engineering. NASA used early versions in the 1960s to mirror physical spacecraft on the ground, an approach that let engineers troubleshoot Apollo 13 from a console in Houston in 1970. The modern industrial version is everywhere now: Formula One teams run thousands of pit-stop simulations on digital twins of their cars before a single tire hits the track. Manufacturers run digital twins of factories and refineries to test changes before making them. The pattern is the same in every case: a high-fidelity virtual replica of something real, kept in sync with that real thing, that you can interrogate and experiment on without disturbing the original.
In market research, a digital twin is a virtual replica of an individual human respondent, built from that person’s own words, behaviors, decision history, and preferences. Accessible on demand. Capable of responding to new questions in a way that reflects how that specific person would actually think, not how an “average” version of their demographic would.
That last distinction is the entire ballgame. As Hadley Edwards, Head of Market Research & Insights at Google Cloud Marketing, put it to me:
“Typical AI personas represent an average, ironing out strong opinions and regressing to the mean, which removes the opportunity to truly understand variance, polarization, and niche appeal. This is particularly troublesome in the B2B space, where individual decision-makers can have an outsize impact on business outcomes. Digital twins circumvent the tyranny of averages. This emerging method harnesses the power of synthetic data while ensuring the synthetic self represents a real human whose perspectives, opinions, etc. are unique and dynamic.”
The “tyranny of averages” framing is exactly right. A generic AI persona is a composite, smoothed toward the middle. A digital twin is a specific person. Why would you talk to a generic segment when you can talk with a dozen members of it, the way any trained market researcher would?
Why this matters for market research: it’s about time, not money
The easy story is that digital twins are cheaper. That’s true, but it’s not the actual reason they’re going to stick.
Global market research spending is roughly $140 billion a year. The budgets are there. For enterprise clients, cost isn’t the main obstacle. Time is.
Survey research takes weeks. Recruiting a qualitative sample takes weeks. Concept testing, brand tracking, segmentation: the rhythm of traditional research is measured in weeks at best, and often in months. Meanwhile, the rhythm of corporate decision-making is now measured in days. A head of marketing gets a question from the CEO on Monday and needs an answer by Wednesday. A product launch gets bumped up a quarter. A competitor moves and the response window closes by Friday.
So what happens? Decisions get made anyway. In The Economist Intelligence Unit’s Guts & Gigabytes global survey of 1,135 executives, 58% said they went with their gut on their last big decision. And these aren’t small calls: nearly one in three of those decisions was valued at $1 billion or more. We aren’t losing to bad research. We’re losing to no research.
This is where Colin Powell’s 40-70 rule comes in. Powell’s formula was P = 40 to 70, where P is the probability of being right, and the numbers are the percentage of information acquired. Less than 40%, you’re shooting from the hip. More than 70%, the moment has passed and someone else has made the call for you. The sweet spot is somewhere between informed enough to be useful and fast enough to still matter.
That puts digital twins in the right comparison set. The relevant question isn’t “is a digital twin as good as a 4-week survey?” The relevant question is “is a digital twin better than the gut-only decision the team was going to make on Friday?” When the alternative is zero data, a 70-85% confidence answer delivered in 15 minutes is a breakthrough.
Andreessen Horowitz partner Olivia Moore, in her piece Faster, Smarter, Cheaper: AI Is Reinventing Market Research, reported that many CMOs they’ve spoken with are comfortable with AI research outputs that are roughly 70% as accurate as traditional consulting firm work, given that the data is cheaper, faster, and continuously refreshable. The 70% threshold isn’t a hypothetical. It’s already where the buyer is.
This is a much bigger phenomenon than market research
This is not only a market research story. Digital twins of humans are showing up across the economy, and market research is one application among many.
- Salesforce launched eVerse, an enterprise digital twin platform from Salesforce AI Research. UCSF Health is piloting it to train AI billing agents in a simulated environment populated by digital twins of patients and callers. Early results: trained agents handle up to 88% of cases.
- Delphi, backed by Sequoia, lets creators, coaches, CEOs, and experts build digital clones of themselves so they can have 1-on-1 conversations with their audience at scale. Authors, podcasters, influencers, and even politicians have built them.
- Meta’s AI Studio, launched in 2024, lets Instagram creators build AI chatbot versions of themselves trained on their voice and style, with around 50 creators in the initial partner cohort.
- Memorial digital twins are an entire emerging category. Companies like HereAfter AI and StoryFile build conversational replicas of departed loved ones from interviews recorded while they were alive, so families can continue to “talk” to them.
These examples vary widely in quality and ethics. I mention them for one reason: digital versions of real people are already a commercial category, not a sci-fi premise. And our market research slice of it is one of the better-formed and most defensible use cases, because the consent and the value exchange are built into how we operate as an industry. If we don’t move to own it, plenty of other corporate functions will.
Where digital twins shine in research: anywhere time is the obstacle
Two recent examples from our clients make this concrete.
The naming sprint. A major tech company had three days to come up with a name for a product about to launch. Senior leadership had rejected every previously developed option. The research team was starting from scratch: no time to recruit, no time to field, no time to even brief a vendor. Using a digital twin panel, they went from open question to ranked, defended results in roughly 15 minutes.
The campaign concept test. A top-5 employee benefits company’s research team was handed a major marketing campaign by the head of global marketing and given a matter of days to test it. Traditional concept testing would have missed the window. Digital twins let them prioritize concepts, identify blind spots, and improve the work before launch – all inside the timeline the marketing team actually had.
Neither of these displaces a traditional study where time and budget allow. They add a layer for moments when the alternative is no research at all.
How you build the twin matters
This is where the field gets messy, and where I think most of the legitimate skepticism lives.
We believe purpose-built digital twins – twins developed from scratch with the contributor, designed specifically to capture how that person makes decisions – are the highest-fidelity option available. Research from Stanford and Columbia points in the same direction. When you build a twin from a substantive narrative interview, you get connective tissue: the stories, contradictions, decision logic, and category-specific behaviors that a model needs to respond as that person, not as a stereotype of that person.
By contrast, twins built from existing survey data are extremely limited, and in practice they’re closer to pure LLM-generated personas than to actual digital twins. Loading your brand tracker into a platform doesn’t give the model the deeper layer required: the needs, motivations, pain points, and category behavior that drive decisions. The data wasn’t designed for that purpose, so the resulting “twin” is mostly the LLM filling in gaps from its training data, which is exactly the failure mode researchers are right to worry about.
The shorthand: if the source data wasn’t built to capture how a human makes decisions, the twin isn’t going to be able to model how that human makes decisions.
How you access the twin matters, too
Researchers are rightly wary of black-box AI products. A lot of digital twin platforms today use a ChatGPT-style interface: you ask a question, you get an answer, and you have no idea what’s under the hood or where the answer came from. That’s not research-grade. That’s a magic eight ball with a nicer UI.
A more responsible approach treats the digital twin session like a qualitative read: you see the transcript, you see the citations, you see which contributor said what, and you can audit the chain from contributor narrative to model output. This is the approach taken by the start-up Focus (currently in closed beta): a focus-group-style interface where the researcher reads the conversation, not just the summary.
Purpose-built twins with transparent, cited output are the version of this technology that belongs in our toolkit. The black-box version is the version that gives the whole category a bad name.
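For readers who want to picture what "auditing the chain" could look like in practice, here is a minimal Python sketch. It is an illustration only, not a description of how Focus or any other platform works. The names `Excerpt`, `TwinStatement`, and `audit_session`, the field layout, and the sample quotes are all hypothetical. The idea is simply that every statement a twin makes should cite at least one passage from that same contributor's source interview, and that a researcher can check this mechanically.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Excerpt:
    """A passage from a contributor's source interview (hypothetical schema)."""
    contributor_id: str
    excerpt_id: str
    text: str


@dataclass
class TwinStatement:
    """One statement made by a twin in a session, with the excerpts it cites."""
    contributor_id: str
    text: str
    cited_excerpt_ids: List[str] = field(default_factory=list)


def audit_session(statements: List[TwinStatement], excerpts: List[Excerpt]) -> List[str]:
    """Return a list of problems; an empty list means every statement is traceable."""
    by_id = {e.excerpt_id: e for e in excerpts}
    problems = []
    for i, s in enumerate(statements):
        if not s.cited_excerpt_ids:
            problems.append(f"statement {i}: no citation")
            continue
        for ex_id in s.cited_excerpt_ids:
            source = by_id.get(ex_id)
            if source is None:
                problems.append(f"statement {i}: unknown excerpt {ex_id}")
            elif source.contributor_id != s.contributor_id:
                problems.append(
                    f"statement {i}: excerpt {ex_id} belongs to another contributor"
                )
    return problems


# Made-up sample data, for illustration only.
excerpts = [
    Excerpt("c01", "e1", "I switch vendors when onboarding takes more than a week."),
    Excerpt("c02", "e2", "Price matters less to me than support quality."),
]
session = [
    TwinStatement("c01", "Slow onboarding would make me drop this product.", ["e1"]),
    TwinStatement("c02", "I would pay more for better support.", ["e1"]),
    TwinStatement("c02", "The name sounds too technical."),
]

for problem in audit_session(session, excerpts):
    print(problem)
```

In this made-up session, the first statement passes, the second is flagged because it cites another contributor's words, and the third is flagged because it cites nothing. A check like this only proves a citation exists. Judging whether the cited passage really supports the statement is still the researcher's job.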
Where this goes in 3 years
Here’s my prediction: within 3 years, every enterprise company will have an always-on layer of virtual customers they can query for on-demand insights. Not as a replacement for traditional research, but as a permanent operating layer on top of it. The way dashboards became permanent in the early 2010s, on-demand customer intelligence will become permanent in the late 2020s.
The question for us as market researchers isn’t whether this happens. It’s who gets to define what good looks like when it does.
If we don’t lead this, other parts of the organization will pre-empt us. Look at the Salesforce eVerse example again: that initiative is coming out of the sales and service org, but the use cases shade directly into territory that has historically been ours. Marketing, strategic planning, UX, and product all have their own AI initiatives now, and many of them touch customer simulation in some form. None of those functions have what we have: a discipline built around understanding when data is trustworthy, how to scope a research question, how to read variance, how to distinguish signal from artifact, and how to put findings in the context of established methods.
That’s the unique value we bring. Not the methods themselves, but the judgment about when and how to apply them. If we sit out the digital twin moment and wait for it to come to us, we’ll find that adjacent functions have built their own versions without that judgment baked in. Research teams should get involved now. Partner with sales, marketing, strategic planning, UX. Set the standards. Define what trustworthy customer simulation looks like, because that question is going to get answered with or without us.
This category will not wait for researchers to approve it. The useful question is how much of our discipline we build into it now.