DATA QUALITY

When “Good Data” Isn’t Actually Good

Rethinking Data Quality in the Age of Bots, Click Farms, and AI Respondents

The Challenge

For decades, the insights industry has treated data quality as a filtering problem: identify bad respondents and remove them.

Speeder checks. Straightlining detection. Logic flags. Sense checks.

Traditional quality checks focus on one dimension: Is the data good or bad?

Today’s research environment is fundamentally different.

Fraud rings, click farms, identity recycling, and AI-generated responses have dramatically changed the nature of survey fraud. Fraudulent respondents can now generate coherent answers, mimic subject-matter expertise, and pass traditional screening logic with ease.

In some cases, a respondent thousands of miles away can convincingly impersonate a specialized professional, complete with accurate terminology and plausible answers.

Historical checks:

Rigorous / masked screening
Engaging question formats
Embedded logic flags
Speeder and sense checks

What these checks miss:

AI-generated open-end responses
Bots & agents that respond like humans
People taking surveys from the wrong country

As the research context changes, a new model of trustworthy data is critical.

KADRO Perspective
3 Key Dimensions for Trustworthy Data

We treat data quality as a multi-dimensional trust problem rather than a simple response validation exercise.

A more resilient framework focuses on three distinct questions.

WHO

Is this a real person?

Device fingerprinting
Network anomaly detection
Keystroke rhythm and typing behavior analysis
Detection of mass account creation patterns

WHERE

Are they where they say they are?

VPN and proxy detection
Network analysis to identify masked connections
Location verification signals

WHAT

Are they actually a good respondent?

Speeding and straightlining detection
Open-end analysis and AI-generated text detection
Question-level time scoring
Logical consistency checks across responses
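
As a rough illustration of the WHAT checks listed above, here is a minimal Python sketch of speeding, straightlining, and question-level time scoring. It is not KADRO's implementation: the function names, the 40% speed ratio, the five-row grid minimum, and the two-second reading floor are all hypothetical values chosen for the example and would need tuning for each study.

```python
from statistics import median

# Hypothetical thresholds (assumptions for illustration, not from the article).
SPEED_RATIO = 0.4      # flag completes faster than 40% of the median duration
MIN_GRID_ITEMS = 5     # only test grids with at least this many rows
FLOOR_SEC = 2.0        # assumed minimum plausible time to read one question


def is_speeder(duration_sec, all_durations, ratio=SPEED_RATIO):
    """Flag a complete whose total time is far below the sample median."""
    return duration_sec < ratio * median(all_durations)


def is_straightliner(grid_answers, min_items=MIN_GRID_ITEMS):
    """Flag a grid where every row received the same answer."""
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1


def fast_question_share(question_times, floor_sec=FLOOR_SEC):
    """Share of questions answered faster than the assumed reading floor."""
    if not question_times:
        return 0.0
    fast = sum(1 for t in question_times if t < floor_sec)
    return fast / len(question_times)


# Example with made-up numbers: durations in seconds for five completes.
durations = [300, 320, 280, 310, 90]
print(is_speeder(90, durations))               # 90 is below 0.4 * median
print(is_straightliner([3, 3, 3, 3, 3]))       # identical answers on a 5-row grid
print(fast_question_share([1.0, 5.0, 0.5, 8.0]))
```

Comparing each respondent to the sample median, rather than to a fixed number of seconds, keeps the speeding rule usable across surveys of different lengths. None of these flags is conclusive on its own, which is why they sit alongside the WHO and WHERE signals.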

Together, these three dimensions create a more complete view of respondent integrity.

Rather than asking only whether the answers look reasonable, researchers can evaluate whether the entire participation context is trustworthy.
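
One way to picture this is a per-respondent record that keeps the three dimensions separate instead of collapsing them into a single good/bad flag. The Python sketch below is an assumption-laden illustration, not a description of any production system: the `RespondentSignals` class, the flag names, and the rule that two flagged dimensions mean removal are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RespondentSignals:
    """Flags raised for one respondent, grouped by dimension (hypothetical)."""
    who_flags: list = field(default_factory=list)    # e.g. "duplicate_device"
    where_flags: list = field(default_factory=list)  # e.g. "vpn_detected"
    what_flags: list = field(default_factory=list)   # e.g. "straightlining"


def flagged_dimensions(signals):
    """Return the names of the dimensions that have at least one flag."""
    groups = (
        ("who", signals.who_flags),
        ("where", signals.where_flags),
        ("what", signals.what_flags),
    )
    return [name for name, flags in groups if flags]


def review_status(signals):
    """Illustrative rule: two or more flagged dimensions means removal,
    exactly one means manual review, none means keep."""
    count = len(flagged_dimensions(signals))
    if count >= 2:
        return "remove"
    if count == 1:
        return "review"
    return "keep"


# Example with made-up flags: plausible answers, but a masked connection.
respondent = RespondentSignals(where_flags=["vpn_detected"])
print(flagged_dimensions(respondent))
print(review_status(respondent))
```

The point of the structure is that a respondent whose answers look reasonable can still be surfaced for review when the WHO or WHERE context does not hold up.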

The Real Risk Isn’t Bad Data. It’s Bad Decisions.

Marketing plans, product bets, and investments are increasingly being shaped by data from bots, click farms, recycled identities, and real people who have checked out.

If the next big bet or investment is misplaced because of bots and bad actors, the downstream consequences will be significant.

Decisions built on fraudulent data also risk eroding confidence in the validity of insights.

What this means

KADRO’s 3 Key Dimensions for Trustworthy Data isn’t a static set of steps; it’s a framework for staying curious and rigorous. It’s about making insights more reliable and reducing the time spent manually cleaning datasets.

Implementing advanced detection systems can mitigate risk and free up time for more strategic work.

AI tools can streamline and simplify research flows, but they have also made fraud easier
The future of insights will depend on treating data quality as a foundational element of credibility
Pairing this mindset with these capabilities today can rebuild trust and lead to better decisions