
In 2022, my team and I worked with a company that approached us for what they thought would be a straightforward data deduplication project. They had a CRM database of 250,000 records and needed it cleaned before migrating to a new system.

It sounded simple enough. But once we began profiling the data, it quickly became clear that this wasn’t a one-week fix. It was a crisis.

Over 60,000 records were found to be duplicated, incomplete, or completely unusable.

The findings startled the CEO. Marketing, sales, and the one-person IT team were called in to address the situation. Needless to say, it was not a pretty sight.

What our team discovered:

✅ Data entry was unregulated. Web forms, vendor imports, partner programs, and event registrations were all dumped into the CRM without any clear rules or validation.

✅ Our first audit showed that half the records had missing or invalid values (phone numbers, email addresses, etc.), rendering them unusable; the sketch after this list shows the kind of checks involved.

✅ None of the teams had a consolidated view of the data as they all operated in silos. Sales used its own system, while marketing and IT operated on CSV files. To each their own was the order of the day.

✅ The leadership team, oblivious to these issues, continued to rely on insights and reports built on faulty data.
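
For the technically curious, here is a minimal sketch of the kind of profiling pass that surfaces these findings. The file name and column names (email, phone, first_name, last_name) are illustrative assumptions, not the client's actual schema, and the validity checks are deliberately crude:

```python
import re
import pandas as pd

# Hypothetical CRM export; file and column names are assumptions for illustration.
df = pd.read_csv("crm_export.csv", dtype=str)

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(value):
    # Simple shape check, not full RFC 5322 validation.
    return isinstance(value, str) and bool(EMAIL_RE.match(value.strip()))

def is_valid_phone(value):
    # Crude check: at least 7 digits once formatting is stripped.
    return isinstance(value, str) and len(re.sub(r"\D", "", value)) >= 7

# Normalize emails before looking for duplicates.
emails = df["email"].str.strip().str.lower()

report = {
    "total_records": len(df),
    "duplicate_emails": int(emails.dropna().duplicated(keep=False).sum()),
    "invalid_emails": int((~df["email"].map(is_valid_email)).sum()),
    "invalid_phones": int((~df["phone"].map(is_valid_phone)).sum()),
    "missing_names": int(df["first_name"].isna().sum() + df["last_name"].isna().sum()),
}
print(report)
```

Even checks this simple are enough to turn "we think the data is messy" into a number that leadership can react to.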

A project that should have taken a week ended up lasting three months — not because the work was complex, but because no one had ever taken a step back to ask: who owns this data, and how should it be managed?

And that company wasn’t the exception.

It’s Not Just One Company — It’s 75% of Them

The data crisis we encountered in that project wasn’t unique. 

According to multiple industry reports, somewhere between 70% and 85% of companies are actively struggling with data quality.

And in our own internal data quality report this year — based on over 100 conversations with prospects — we found that 75% of companies are facing serious challenges with data quality.

Some of these challenges included conflicting and duplicate records, messy and inconsistent data, disparate records, and just a general disconnect between teams, processes, and tools.

This raises the question: why?

Why is it that companies are still struggling to get a grip on data quality challenges?

In my experience, it boils down to three key reasons. Let’s look at them in detail.

1) Treating data as a “technical” concern

A common pattern we observe across industries is the tendency to view data management as solely the responsibility of the IT department. Business teams are pushed to be data-driven but, through no fault of their own, are unaware of data quality concepts; yet when projects fail, they are the first in line to take the blame.

So long as we continue to treat data as a purely technical concern, teams will operate in isolation, increasing the risk of disconnected views of the truth. When marketing, sales, and ops each see customer data differently, they build conflicting strategies that keep undermining business progress.

We must admit that business users are the ones using and interpreting data—IT professionals are there to “maintain” data infrastructures, not to derive meaning from records. However, this division between IT and business teams causes unnecessary friction, as the latter always has to rely on the former to transform, analyze, and interpret data.

Priyanka Jain, speaking with MIT Sloan, rightly says, “Everybody needs data literacy because data is everywhere. It’s the new currency, it's the language of the business. We need to be able to speak that.”

2) Prioritizing data quality only when there’s a setback

Another significant issue is the reactive approach many organizations take toward data quality. Companies know they likely have data quality problems, but because there are no immediate consequences, they keep the issue on the back burner – until a crisis hits.

A failed migration project. A disgruntled customer threatening legal action over misplaced documents. A penalty for a missed regulatory requirement. These are some of the many circumstances that finally prompt companies to look deeper.

Our advice: do not wait for a problem to happen. A proactive approach to data quality (regular audits, better data entry and acquisition processes, and a routine of cleaning and deduping after every event that brings in large volumes of contact data) is far more effective than letting bad data pile up for years.
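
To make "cleaning and deduping after every major event" concrete, here is a hedged sketch of what such a pass might look like, assuming a contacts CSV with email and created_at columns and using a normalized email as the match key. Real-world matching usually needs fuzzier keys (name plus phone, for instance), so treat this as an illustration, not a recipe:

```python
import pandas as pd

# Hypothetical post-event contact dump; file and column names are assumptions.
contacts = pd.read_csv("contacts_after_event.csv", dtype=str)
contacts["created_at"] = pd.to_datetime(contacts["created_at"])

# Normalize the match key so "Jane@X.com " and "jane@x.com" collide.
contacts["email_key"] = contacts["email"].str.strip().str.lower()

# Keep the oldest record per email as the survivor; route the rest to review.
contacts = contacts.sort_values("created_at")
survivors = contacts.drop_duplicates(subset="email_key", keep="first")
flagged = contacts[contacts.duplicated(subset="email_key", keep="first")]

survivors.drop(columns="email_key").to_csv("contacts_clean.csv", index=False)
flagged.drop(columns="email_key").to_csv("contacts_needs_review.csv", index=False)
print(f"{len(flagged)} duplicate rows flagged for review")
```

The point is less the code than the habit: a lightweight, repeatable pass after each acquisition event keeps duplicates from compounding into a 60,000-record backlog.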

To move forward, companies need to shift from a reactive to a proactive approach, integrating data quality measures into their regular workflows and planning processes – all while ensuring teams are aligned.

3) Investing in tools instead of people and processes

Veda Bawo, Director of Data Governance at Raymond James, aptly noted, “You can have all of the fancy tools, but if [your] data quality is not good, you're nowhere.”

While investing in advanced data tools can be beneficial, relying solely on technology without addressing the underlying processes and people involved often leads to suboptimal results.

We've seen organizations implement sophisticated data platforms, only to find them underutilized or misconfigured because the necessary processes and training were not in place.

A tool can identify duplicates or inconsistencies, but it cannot resolve departmental misalignments or clarify data ownership. It cannot bring people together, and it cannot resolve the root causes of poor data (such as manual entries, web forms without governance, or multiple data sources that aren’t mapped to the original source).

Success in data quality initiatives often comes from investing in understanding data flows, defining transparent processes, assigning ownership, and ensuring that staff are equipped and empowered to maintain data integrity.

So What’s the Fix? Start by Democratizing Data Quality!

Well, I wish there were a shortcut. I could easily tell you to invest more in systems and tools to fix organizational data quality challenges.

But that is far from the truth.

Because data quality is more than just fixing bad rows of data. It’s ensuring that the same problems don’t happen on repeat. You don’t want to have to dedupe 60,000 records year after year. So how do you ensure these issues don’t keep happening?

You first start by democratizing data quality.

Meaning: Creating visibility, accountability, and flexibility so the people closest to the data can help maintain it. I must note, though, that this doesn’t mean your business users suddenly need to learn SQL or Python to clean their data. Processes still have to be followed, and controlling access to data is still a serious concern.

What you need is more data literacy, so that the people handling data can judge whether it is usable. They must be empowered enough to tell the difference between bad data and usable data.

Data democratization, though, is just one part of the journey.

The second is…

Aligning people with purpose

Data quality cannot improve in a vacuum. Even if you empower people to spot bad data, it won’t matter if they don’t understand why it matters or how their role contributes to the bigger picture.

In many companies, different teams handle data with different assumptions. Marketing might optimize for volume. Sales might prioritize speed. Operations might just want consistency. And without a shared understanding of what “good data” looks like across departments, everyone’s working toward a different goal — often unknowingly creating friction and inconsistencies.

That’s where alignment comes in. Data quality has to be more than an abstract value — it has to be tied to team KPIs, project goals, and customer outcomes. Everyone needs to understand how their input affects downstream processes. Leadership has to actively support this by making data quality part of onboarding, team rituals, and project planning, building a shared commitment so that when someone flags a data issue, it’s not seen as a nuisance but as part of doing the job well.

The third is leadership setting the structure.

In most companies, leadership teams are oblivious to data quality management (DQM) challenges. Unless it’s a CEO running a one-man show, most leaders aren’t really aware of the implications of poor data. Sure, they are involved in discussions about using data for better insights, but when it comes to the reliability of that data, they would rather leave it to the tech teams.

And that’s okay. Leaders don’t have to know about record linkage or golden records. But they do need to define ownership, set expectations, prevent inter-departmental conflicts, and fund the time and resources required to maintain quality. That includes deciding who is accountable for which data sets, how often reviews happen, and what happens when issues are found.

Without this top-down structure, even the most data-literate teams will fall back into reactive habits. No one will feel responsible, and no one will have the time to fix things that aren’t "urgent."

But when leaders actively support data quality as part of operational excellence — not just IT hygiene — it creates space for teams to be proactive. It removes the ambiguity, provides guardrails, and gives data the weight it deserves in decision-making.

And that’s really the point. 

Data quality doesn’t improve with more dashboards or clever automation. 

It improves when the people closest to the data are supported by clear priorities, strong alignment, and leadership that treats it as a strategic asset.