We ran 18 studies through a synthetic-respondent platform and ran the same studies through a traditional panel. Agreement varied by question type, and the failure modes are predictable enough to act on.
Where synthetic panels agreed with the real panel within five percentage points: brand awareness, category usage, demographic-stable preferences, and copy-test top-box scores on familiar categories. These are safe applications.
Where they diverged badly: anything emotional or identity-laden, anything in a new category the model has weak priors for, and any question with a socially desirable answer. The synthetic panel will confidently agree with whatever the question implies.
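If you want to run the same comparison on your own studies, the check is simple to script. The sketch below is one way to do it, not the method used for the 18 studies: it assumes each question yields one headline percentage from each panel, reads "within five points" as five percentage points, and uses made-up rows purely for illustration. The function name, the tuple layout, and the question-type labels are all hypothetical.

```python
from collections import defaultdict

# Assumption: "within five points" means five percentage points.
THRESHOLD_POINTS = 5.0


def agreement_by_type(results, threshold=THRESHOLD_POINTS):
    """Share of questions per type where the two panels land within `threshold` points.

    `results` is an iterable of (question_type, synthetic_pct, real_pct) tuples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for question_type, synthetic_pct, real_pct in results:
        totals[question_type] += 1
        if abs(synthetic_pct - real_pct) <= threshold:
            hits[question_type] += 1
    return {qt: hits[qt] / totals[qt] for qt in totals}


# Made-up illustrative rows, NOT results from the studies described here.
example = [
    ("brand_awareness", 62.0, 60.0),
    ("brand_awareness", 41.0, 44.5),
    ("socially_desirable", 78.0, 55.0),
    ("socially_desirable", 70.0, 66.0),
]

rates = agreement_by_type(example)
for question_type, rate in sorted(rates.items()):
    print(f"{question_type}: {rate:.0%} of questions within {THRESHOLD_POINTS} points")
```

Grouping by question type is the point: a single overall agreement rate would average the safe categories together with the unsafe ones and hide exactly the pattern that matters.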
The recommendation is not 'never use them.' It's 'use them where the failure mode is acceptable.' Concept screening at the funnel top, where false positives cost a meeting and false negatives cost an idea, yes. Final go/no-go calls on a $2M campaign, no.
"Synthetic panels are fast, cheap, and dangerously confident liars in exactly the categories you most want to ask about."
