Data quality in research. Many are talking about it. “Yes, we need data quality in research.” That’s easy to say but harder to implement in practice.
“Consider the additional time and effort required for data cleaning,” said Karine Pepin, consumer insights executive and self-proclaimed data fairy. “Opt for a higher Cost Per Interview (CPI) upfront if it ensures better sample quality and saves valuable time later.”
Amen to that. Data quality in research is often stated as important – which it is – until the pressures to get everything done now and as cheaply as possible hit that statement like a dump truck smashing into a small passenger car. But, all metaphors aside here, ensuring you’re collecting good-quality data in your survey research is about instilling confidence in the insights and decisions made as a result. Let’s swerve around that dump truck and look at how to stay on the road to good data.
“People do focus on the issue; however, it’s usually everyone’s ‘second job,’” said Vignesh Krishnan, CEO and Founder of Research Defender.
Very true. And since data quality is our first job, let’s discuss these topics in this article:
- What is data quality in research?
- Why data quality is so important
- What are some data quality metrics?
- Approaches for ensuring quality data
What is data quality in research?
At the most basic level, data quality in research means that you’ve collected useful data from the right target audience.
“To me, great data is when the story of each participant is cohesive,” said Karine. “Participants who cheat in surveys typically provide random answers, leading to incoherent results. While we talk a lot about quality controls in place pre-survey, in-survey, and post-survey, we’re not focusing enough on the macro issues: We need more sample sources that can validate the identity of the participants.”
So, what is data quality?
It’s the right respondents for a study. This means qualified participants who have agreed to participate in the research study and are engaged in the process. Responses from the wrong people are a big data quality issue and can sink your decision-making. That includes people trying to game the system, either individually or as an organized group, and looking to gain survey rewards without legitimately taking a survey. Some respondents mean well but may not fully engage with a survey while taking it, and their responses may reflect that they have not read or understood some questions.
We find that cheating, or simply not paying attention, is much more common nowadays because most studies are self-completion surveys conducted online. A respondent sitting in front of an interviewer faces far more accountability than someone answering a survey on their phone amid distractions such as watching TV or texting with friends.
A quality data set not only includes respondents who are qualified and engaged in the study but is also clear of bad responses. Getting there takes a combination of technology on the front end and human review on the back end.
Why data quality is so important
At the end of the day, the point of any research isn’t just to check a box: “Oh yes, we did research.” Any research needs to be about getting actionable insights that can help you make the right business decisions moving forward.
Long story short: Data quality is essential to get research results that can help you gain insights to make decisions.
What are some data quality metrics?
Data quality metrics can be grouped into a few important dimensions as they relate to respondent answers, including:
- Accuracy
- Completeness
- Consistency
- Validity
- Uniqueness
Good data collection agencies use a mix of technology oversight and human validation to flag people before they ever start participating in a study, during the study and also afterward.
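Two of the metrics listed above, completeness and uniqueness, lend themselves to a quick calculation. The sketch below is only an illustration under stated assumptions: it treats completeness as the share of required answers that are filled in and uniqueness as the share of records with a distinct respondent ID. Those definitions, and the field names `respondent_id`, `q1` and `q2`, are assumptions made for the example rather than anything prescribed here.

```python
def completeness(responses, required_fields):
    """Share of required answers that are filled in (0.0 to 1.0).

    Assumed definition: an answer counts as missing if it is None or "".
    """
    total = len(responses) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in responses
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def uniqueness(responses, id_field="respondent_id"):
    """Share of records whose respondent ID is distinct (0.0 to 1.0)."""
    if not responses:
        return 0.0
    ids = [record.get(id_field) for record in responses]
    return len(set(ids)) / len(ids)


# Hypothetical survey records, invented for illustration only.
sample = [
    {"respondent_id": "r1", "q1": 4, "q2": "Yes"},
    {"respondent_id": "r2", "q1": None, "q2": "No"},
    {"respondent_id": "r2", "q1": 5, "q2": ""},
    {"respondent_id": "r3", "q1": 2, "q2": "Yes"},
]

print(completeness(sample, ["q1", "q2"]))  # 6 of 8 required answers present
print(uniqueness(sample))                  # 3 distinct IDs across 4 records
```

A low uniqueness score points to duplicate entries worth a closer look, while a low completeness score may point to disengaged respondents or to a questionnaire problem.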
Approaches for ensuring quality data
Several approaches are important to consider to ensure quality data:
Choosing the right sample sources
Choosing the wrong sample source can lead to data quality issues since the respondents won’t be relevant to the study.
Use red herrings
Add some questions or answer choices that will help to determine if respondents are not paying attention. This could be as simple as adding some brands that don’t exist in a brand awareness question or asking a question where all respondents are asked to select a pre-designated answer.
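Both kinds of red herring described above can be checked automatically once the data is in. The following is a minimal sketch, not a definitive implementation: the decoy brand names, the expected attention-check answer and the field names `brands_aware` and `attention_check` are all hypothetical placeholders.

```python
# Hypothetical decoy brands: placeholder names for brands that do not exist.
FAKE_BRANDS = {"Decoy Brand A", "Decoy Brand B"}

# Hypothetical instructed item: "Please select 'Somewhat agree' for this row."
ATTENTION_CHECK_EXPECTED = "Somewhat agree"


def red_herring_flags(response):
    """Return the reasons a response failed a red-herring check (may be empty)."""
    flags = []

    # Check 1: did the respondent claim to know a brand that does not exist?
    claimed = set(response.get("brands_aware", []))
    fake_hits = sorted(claimed & FAKE_BRANDS)
    if fake_hits:
        flags.append("claimed awareness of fake brand(s): " + ", ".join(fake_hits))

    # Check 2: did the respondent pick the pre-designated answer?
    if response.get("attention_check") != ATTENTION_CHECK_EXPECTED:
        flags.append("failed instructed-response item")

    return flags


attentive = {"brands_aware": ["Real Brand 1"], "attention_check": "Somewhat agree"}
inattentive = {
    "brands_aware": ["Real Brand 1", "Decoy Brand A"],
    "attention_check": "Strongly agree",
}

print(red_herring_flags(attentive))    # no flags expected
print(red_herring_flags(inattentive))  # both checks should fail
```

Returning the reasons rather than a single yes/no makes it easier for a human reviewer to decide whether one slip is enough to remove a respondent.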
Speed checks
When somebody breezes through a survey, that can indicate they aren’t actually answering the questions correctly because they didn’t really take the time to comprehend them. If you set a minimum amount of time to answer a survey, or better yet, a section of a survey, then you can flag respondents who finish faster than that as speeders.
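A per-section speed check might look something like the sketch below. The section names and minimum times are hypothetical values chosen for illustration; in practice the thresholds would need to be tuned to the actual questionnaire.

```python
# Hypothetical minimum plausible completion times, in seconds, per section.
MIN_SECONDS = {"screener": 20, "brand_ratings": 60, "demographics": 15}


def speeding_sections(section_seconds, minimums=MIN_SECONDS):
    """Return the sections a respondent completed faster than the minimum.

    Sections the respondent never saw (for example, skipped by routing)
    are ignored rather than counted as speeding.
    """
    return [
        section
        for section, floor in minimums.items()
        if section in section_seconds and section_seconds[section] < floor
    ]


# Hypothetical timings for one respondent.
timings = {"screener": 35, "brand_ratings": 12, "demographics": 18}

flagged = speeding_sections(timings)
print(flagged)  # only the brand_ratings section falls below its minimum
```

Checking section by section catches a respondent who paused on one page and then raced through the rest, which a single whole-survey timer can miss.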
Identifying straightliners
A strong sign that a respondent is breezing through a survey is when they provide the same rating for every option in a rating grid.
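Straightlining is simple to detect in a rating grid. Here is a minimal sketch, assuming each respondent's grid answers arrive as a list of ratings; the `min_items` cutoff is an assumption added so that very short grids, where identical answers are plausible, are not flagged.

```python
def is_straightliner(ratings, min_items=4):
    """True if every answered item in a rating grid received the same rating.

    Unanswered items (None) are ignored. Grids with fewer than `min_items`
    answered items are never flagged (an assumed cutoff, not a standard).
    """
    answered = [r for r in ratings if r is not None]
    if len(answered) < min_items:
        return False
    return len(set(answered)) == 1


print(is_straightliner([3, 3, 3, 3, 3]))  # same rating everywhere
print(is_straightliner([3, 4, 3, 5, 2]))  # varied ratings
print(is_straightliner([5, 5]))           # too few items to judge
```

A flag here is a prompt for review rather than proof of inattention, since some respondents may genuinely feel the same about every item.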
Open end review
When answers don’t make sense in open-ends, that’s also a concern.
The use of technology
Use technology – like Research Defender, for example – to help catch fraud and other issues that can impact the quality of the data.
Human quality assurance
While much quality assurance can be done with software, nothing can replace the human touch. By putting some eyes on the data, you may be able to identify discrepancies that may not have been flagged by tech.
Conclusion
Data quality in research can be a differentiator for your business – especially if everyone else talks about its importance, but you actually get high-quality data.
When people ask me how to ensure data quality in research, I tell them it certainly starts with awareness of its importance, but then it needs to be implemented correctly.
Higher quality data often comes at a higher cost, but consider the cost of making poor decisions based on poor quality data. In the long run, saving a few bucks on cheaper data can cost you.