DATA QUALITY

When “Good Data” Isn’t Actually Good

Rethinking Data Quality in the Age of Bots, Click Farms, and AI Respondents

The Challenge

For decades, the insights industry has treated data quality as a filtering problem: identify bad respondents and remove them.

Speeder checks. Straightlining detection. Logic flags. Sense checks.

Traditional quality checks focus on one dimension: Is the data good or bad?

Today’s research environment is fundamentally different.

Fraud rings, click farms, identity recycling, and AI-generated responses have dramatically changed the nature of survey fraud. Respondents can now generate coherent answers, mimic subject-matter expertise, and pass traditional screening logic with ease.

In some cases, a respondent thousands of miles away can convincingly impersonate a specialized professional, complete with accurate terminology and plausible answers.

Historical checks:
  • Rigorous / masked screening 

  • Engaging question formats 

  • Embedded logic flags  

  • Speeder and sense checks

What these checks miss:
  • AI-generated open-end responses 

  • Bots & agents that respond like humans  

  • People taking surveys from the wrong country 

As the research context changes, a new model of trustworthy data becomes critical.

KADRO Perspective
3 Key Dimensions for Trustworthy Data

We approach data quality as a multi-dimensional trust problem rather than a simple response-validation exercise.

A more resilient framework focuses on three distinct questions.

WHO

Is this a real person?

  • Device fingerprinting 

  • Network anomaly detection 

  • Keystroke rhythm and typing behavior analysis 

  • Detection of mass account creation patterns 
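
As one illustration of the last item, a minimal sketch of how mass account creation might be spotted: count how many accounts a single device fingerprint creates within a short time window. The function name, the one-hour window, and the three-account limit are assumptions chosen for the example, not KADRO's actual rules.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_mass_signups(signups, window=timedelta(hours=1), max_accounts=3):
    """Return device fingerprints that created more than `max_accounts`
    accounts inside any span of length `window`.

    `signups` is an iterable of (account_id, fingerprint, created_at) tuples.
    The window and limit are illustrative defaults, not recommended values.
    """
    times_by_device = defaultdict(list)
    for _account_id, fingerprint, created_at in signups:
        times_by_device[fingerprint].append(created_at)

    flagged = set()
    for fingerprint, times in times_by_device.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Move the left edge forward until the span fits in the window.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 > max_accounts:
                flagged.add(fingerprint)
                break
    return flagged
```

A flagged fingerprint is a prompt for closer review rather than proof of fraud, since shared devices (for example in a library or household) can produce the same pattern.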

WHERE

Are they where they say they are?

  • VPN and proxy detection 

  • Network analysis to identify masked connections 

  • Location verification signals 

WHAT

Are they actually a good respondent?

  • Speeding and straightlining detection 

  • Open-end analysis and AI-generated text detection 

  • Question-level time scoring 

  • Logical consistency checks across responses 
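
Two of these checks are simple enough to sketch in a few lines. The example below is illustrative only: the function names, the one-third-of-median speeding cutoff, and the five-item minimum for a grid are assumptions, and real thresholds would need tuning per survey.

```python
from statistics import median


def is_speeder(duration_s, all_durations_s, fraction=0.33):
    """Flag a completion time below a fraction of the sample median.

    The 0.33 cutoff is an assumed starting point, not an industry standard.
    """
    return duration_s < fraction * median(all_durations_s)


def is_straightliner(grid_answers, min_items=5):
    """Flag a rating grid in which every item received the same answer.

    Short grids are ignored because identical answers there are plausible.
    """
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1


# Hypothetical completion times in seconds for five respondents.
durations = [600, 540, 660, 120, 580]
speeders = [d for d in durations if is_speeder(d, durations)]
```

Using the sample median rather than a fixed number of seconds keeps the check relative to how long the survey actually takes real respondents.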

Together, these three dimensions create a more complete view of respondent integrity.

Rather than asking only whether the answers look reasonable, researchers can evaluate whether the entire participation context is trustworthy.
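
One way to picture the three dimensions working together is as a single record per respondent, with flags collected under WHO, WHERE, and WHAT. The sketch below is hypothetical: the class, the flag names, and the accept / review / reject rule are assumptions made for illustration, not a description of any specific product logic.

```python
from dataclasses import dataclass, field


@dataclass
class RespondentSignals:
    """Flags raised per dimension; an empty list means no concerns."""
    who: list[str] = field(default_factory=list)    # e.g. "duplicate_device"
    where: list[str] = field(default_factory=list)  # e.g. "vpn_detected"
    what: list[str] = field(default_factory=list)   # e.g. "speeder"


def trust_decision(signals: RespondentSignals) -> str:
    """Assumed rule: no flagged dimension -> accept, one -> review, more -> reject."""
    flagged = [name for name in ("who", "where", "what") if getattr(signals, name)]
    if not flagged:
        return "accept"
    if len(flagged) == 1:
        return "review"
    return "reject"
```

The point of the design is that a respondent whose answers look fine (no WHAT flags) can still be held back when the WHO or WHERE context raises concerns.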

The Real Risk Isn’t Bad Data. It’s Bad Decisions.

Marketing plans, product bets, and investments are increasingly shaped by data from bots, click farms, recycled identities, or real people who are checked out.

If the next big bet or investment is misplaced because of bots and bad actors, the downstream consequences will be significant.

Such misplaced decisions also risk eroding confidence in the validity of insights.

What this means

KADRO’s 3 Key Dimensions for Trustworthy Data isn’t a static set of steps; it’s a framework for staying curious and rigorous. It’s about making insights more reliable and reducing the time spent manually cleaning datasets.

Implementing advanced detection systems can mitigate risk and reclaim time for more strategic work.

  • AI tools can streamline and simplify research flows, but they have also made fraud easier

  • The future of insights will depend on treating data quality as a foundational element
    of credibility

  • Applying this mindset alongside these capabilities today can rebuild trust and lead to better decisions