4 Steps to Standardize Your Data and Get Better Insights

The line between a lot of data and too much data is razor-thin. All marketers and salespeople want the former, but it is all too easy to end up with the latter if we don’t use data standards.

Standards make data uniform. They ensure that information enters a database and is recorded there in a consistent, predictable, and homogeneous way. This common format enables “collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies” (OHDSI). With more consistent data, it becomes easier, faster, and more cost-efficient to analyze data, find trends, and target, nurture, and convert leads.

So, have some standards. Here are the four steps you can follow to achieve data standardization:

  1. Conduct a data source audit
  2. Brainstorm standards
  3. Standardize data sources
  4. Standardize the database

Step 1: Conduct a Data Source Audit

Start by pinpointing all the data sources used by your business. A data source is a data supply point from which information flows into your database. At most companies, the sales and marketing teams build, mine, and maintain multiple sources, pulling a wide range of data to drive leads through the funnel. But other teams may also own data sources, so you must collaborate with stakeholders on as many teams as possible. Break those silos. Be transparent. Communicate the goals and benefits of data standards as you loop in decision makers.

The information you gather on your data sources should be exhaustive, because data standards are only as effective as their coverage of how data actually reaches you. In order to build valid, enforceable standards (next step), you must know: each kind of source that supplies data, how often each source supplies data, which teams own each source, which teams use each source (or want to), and whether the source is first-, second-, or third-party. Keep in mind that it may not be possible to standardize or otherwise alter data from a third-party source.
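One way to keep the audit honest is to record every source in the same shape. The sketch below is only an illustration in Python: the `DataSource` record, the `Party` values, and the two sample sources are hypothetical names invented for this example, not part of any particular tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class Party(Enum):
    FIRST = "first-party"
    SECOND = "second-party"
    THIRD = "third-party"


@dataclass
class DataSource:
    """One row of the audit; every field name here is an assumption."""
    name: str          # label for the source
    kind: str          # what kind of source it is, e.g. "web form"
    frequency: str     # how often it supplies data
    owner: str         # team that owns the source
    users: list = field(default_factory=list)  # teams that use it (or want to)
    party: Party = Party.FIRST


def may_resist_standardization(sources):
    """Return the names of third-party sources, which you may not be able to alter."""
    return [s.name for s in sources if s.party is Party.THIRD]


# Hypothetical sample audit with two sources.
audit = [
    DataSource("whitepaper form", "web form", "continuous",
               "Marketing", ["Sales"], Party.FIRST),
    DataSource("purchased list", "vendor file", "quarterly",
               "Sales", ["Marketing"], Party.THIRD),
]

print(may_resist_standardization(audit))
```

Even a list this simple makes it obvious which sources you control and which ones you will have to clean up after the fact.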

Step 2: Brainstorm Standards

To quote the cool kid from every ’80s high school movie, there’s only one rule when it comes to brainstorming standards: “There are no rules!” All companies have their own data needs, goals, and sources, so it’s hard to give universal advice. However, there is one fact we should stress to data-driven marketers who are crafting standards: The bigger the data, the more precise the standards that control it must be.

Big data has three Vs: volume, velocity, and variety. In other words, data is big when there is a lot of it, it comes in fast, and it takes many forms. Data standards should be specific and all-encompassing enough to leave no data behind in your business systems; they should hold up even if there is a flood of rapid, disparate information. For this reason, standards must also be forward-thinking. A startup might not have big data at the moment, but what if it becomes a successful Fortune 1000 brand in a few years? It shouldn’t have to reinvent the data wheel. So, standards should be able to organize data both as it currently enters your database and as it might enter your database in the future.

Step 3: Standardize Data Sources

Time to do the darn thing and standardize. It’s okay to start externally with the sources that feed your database or internally with the data you currently own. This step is for the former: modifying supply points so they give you data in the format you and stakeholders have chosen. The next step will deal with the latter.

Data sources link to online customer behavior. Three of the most popular forms of engagement are email opens, ad clicks, and form fills. Actions like email opens and ad clicks are likely to come to your database with some level of standardization already applied, courtesy of your marketing software. But form fills can be trickier. While forms provide vital detail on leads, their blank text fields can wreak havoc on a database. Enter data standards, stage left.


Let’s suppose the marketing team at a research brand has determined it would be useful to know the highest level of education for users who download reports. So, the team builds a landing page for its newest whitepaper and gates the asset with a form, asking visitors for name, email address, and education level in free-form text fields. This follows best practices. The form is short, the offer provides value, and it helps the team start populating the education field in its data. Not to mention, blank text fields are great for qualitative research.


But imagine that 400 people visit the landing page who have master’s degrees. How would these users input that fact? Unlike [FirstName LastName] for the name field and [first.last@domain.com] for email address, education level has no intuitive format. Fifty users could enter the word “Master’s” with an apostrophe, and another 50 could enter it without one. Fifty more users could type “M.A.” with two periods; another 50, “MA” with no periods. Same for “M.S.” and “MS.” Fifty more visitors could reveal they have a “Master’s in [Specific Field],” while another 50 say, “Master’s from [Specific School].”

These 400 replies were entered in different ways even though they fundamentally mean the same thing: These users all have a master’s degree. That’s a problem. It’s hard for analysts to review these inputs and find trends in them when essentially identical data is entered with inconsistent formatting. In this example, the only solution is Herculean. The data team will have to manually inspect all responses, which likely number in the thousands, given the problem almost certainly occurred at all other educational levels, too.


This is why standardization is often achieved by replacing blank text fields with dropdown menus. This forces data consistency; responses can’t help but flow into the database with formatting preset by the brand. Our example business could use a dropdown menu with an option such as “Master’s Degree” or “Graduate School,” if it wanted to keep things high-level. Or, it could make separate options for “M.A.” and “M.S.” if it wanted to be more detailed. Going the latter route, it could ask its data team on the backend to build a general “Master’s” list that filters between “Arts” and “Science” leads as needed—which brings us to the next (and last) step.
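Behind the form, a dropdown boils down to a fixed set of allowed values. Here is a minimal Python sketch of the more detailed route, assuming separate “M.A.” and “M.S.” options; the `EducationLevel` enum, the other option labels, and the helper names are hypothetical and only illustrate the idea.

```python
from enum import Enum


class EducationLevel(Enum):
    """The dropdown options. Only M.A. and M.S. come from the example;
    the other labels are placeholders."""
    HIGH_SCHOOL = "High School"
    BACHELORS = "Bachelor's Degree"
    MA = "M.A."
    MS = "M.S."
    DOCTORATE = "Doctorate"


# The general "Master's" grouping rolls the two detailed options together.
MASTERS = {EducationLevel.MA, EducationLevel.MS}


def parse_choice(raw):
    """Accept a submitted value only if it exactly matches a dropdown option."""
    try:
        return EducationLevel(raw)
    except ValueError:
        raise ValueError(f"not a dropdown option: {raw!r}") from None


def is_masters(level):
    """True for any option that belongs to the general Master's list."""
    return level in MASTERS


choice = parse_choice("M.S.")
print(choice.value, is_masters(choice))
```

Because anything outside the list is rejected at the door, a free-text variant like “masters” never reaches the database in the first place.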

Step 4: Standardize the Database

What use are great data standards that only apply to new, incoming data? That’s only half the equation. It’s best practice to also enforce standards on data already collected, so all the data is in the same format. This requires a big investment of your data team’s time and skill, but the payoff can be huge. By retrofitting data, you can ensure that different people on different teams will interpret data in the same way. Across the company, you can help everyone have access to the same depth and quality of data for their projects.
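Part of a retrofit can be scripted, though rarely all of it. The Python sketch below assumes the team has settled on “Master’s Degree” as the standard label and has listed a few known variants; the `CANONICAL` table and the function names are hypothetical. Anything the table does not recognize is set aside for a person to review rather than guessed at.

```python
import re

# Hypothetical variant table. A real one would be built from the
# responses actually sitting in the database.
CANONICAL = {
    "masters": "Master's Degree",
    "ma": "Master's Degree",
    "ms": "Master's Degree",
}


def normalize_key(raw):
    """Lowercase the text and keep letters only, so "M.A." becomes "ma"."""
    return re.sub(r"[^a-z]", "", raw.lower())


def retrofit(values):
    """Split stored values into (original, standard label) pairs
    and a list of leftovers that need manual review."""
    standardized, needs_review = [], []
    for raw in values:
        label = CANONICAL.get(normalize_key(raw))
        if label is None:
            needs_review.append(raw)
        else:
            standardized.append((raw, label))
    return standardized, needs_review


stored = ["Master’s", "Masters", "M.A.", "MS", "Master's in Biology"]
standardized, needs_review = retrofit(stored)
print(len(standardized), needs_review)
```

The short variants are handled automatically, while longer answers that mention a field or school still land in the review pile, which is where the data team’s time and skill come in.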

Filters are invaluable. Data filtering lets users refine data sets to include only the data they need for a specific task or campaign, and to exclude “data that can be repetitive, irrelevant, or even sensitive” (Techopedia). This is useful when standardizing, because making data uniform and consistent does not mean deleting the details from some data sets just because they’re not present in all data sets. The goal instead is to identify these details and make it possible to search and organize data sets by them.

Sticking with the academic example, we saw that people gave various levels of detail about their degree. Of the 400 respondents with a master’s, some said what type they received—Arts or Science—others said what school it came from, and others said what field it covered. Our example brand could ask its data team to build filters that make this data pull-apart-able.

A generic or universal Master’s list would be necessary, to show all the contacts with that degree no matter type, field, or school. But by filtering this generic list, the brand could still see these complementary details when it wanted. The data team would just have to ensure these details were entered consistently (i.e., met standards). Would it be “NYU” or “New York University” inside the database? “Politics” or “Poli Sci” or “Political Science”? Should “American History” and “Medieval History” be distinct items, or subsumed into “History”? Interesting judgment calls abound.
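To make this concrete, here is a small Python sketch of a generic Master’s list that can still be filtered by school or field. The alias tables, the contact records, and the choice of “New York University” and “Political Science” as the standard spellings are all assumptions made for illustration; your own judgment calls may differ.

```python
# Hypothetical alias tables: each known spelling maps to one standard value.
SCHOOL_ALIASES = {
    "nyu": "New York University",
    "new york university": "New York University",
}
FIELD_ALIASES = {
    "politics": "Political Science",
    "poli sci": "Political Science",
    "political science": "Political Science",
}


def canonical(value, aliases):
    """Return the standard spelling, or the trimmed value if it is unknown."""
    if value is None:
        return None
    cleaned = value.strip()
    return aliases.get(cleaned.lower(), cleaned)


def standardize(contact):
    """Copy a contact with its school and field brought up to standard."""
    out = dict(contact)
    out["school"] = canonical(contact.get("school"), SCHOOL_ALIASES)
    out["field"] = canonical(contact.get("field"), FIELD_ALIASES)
    return out


def filter_contacts(contacts, **criteria):
    """Keep only the contacts whose fields match every criterion."""
    return [c for c in contacts
            if all(c.get(k) == v for k, v in criteria.items())]


# Made-up contacts with varying levels of detail.
contacts = [
    {"name": "Contact 1", "degree": "Master's", "type": "M.A.",
     "field": "Poli Sci", "school": "NYU"},
    {"name": "Contact 2", "degree": "Master's", "type": "M.S.",
     "field": None, "school": "New York University"},
    {"name": "Contact 3", "degree": "Master's", "type": None,
     "field": "Politics", "school": None},
]

# The generic list keeps everyone; the filters pull out the details.
masters = [standardize(c) for c in contacts if c["degree"] == "Master's"]
nyu = filter_contacts(masters, school="New York University")
print([c["name"] for c in nyu])
```

Nothing is deleted along the way: contacts with no school or field stay on the generic list and simply fall out of the narrower views.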

Gold Standard

This example has shown why it’s important and beneficial for a company to develop standards for the data it uses. Better input means better output; the simpler and more controlled the information you receive, the faster you can glean insights that lead to conversions and a better bottom line.

Remember: Account for all of your data sources, involve a diverse range of stakeholders, standardize external sources (with a keen eye for open-ended forms), and use filters to organize the data you already own. It’s a wild, wide big data world out there, but with good data standards you can come out on top.

