Data literacy: What it is and why it matters


This article was contributed by Bill Schmarzo, the Dean of Big Data

What do you need to do to increase the data literacy of your organization?

In a world where your personal data (and the preferences and biases buried in that data) are used to influence your behaviors, beliefs, and decisions, data literacy is a fundamental and indispensable skill. And it’s not just corporations that need this training. Data Literacy should be taught in universities, in high schools, in middle schools and even in adult education and nursing homes.

Data literacy training needs to educate EVERYONE about how their personal data is being collected, analyzed, and used, so that they are not duped into actions, behaviors, and beliefs by organizations and people who understand how to manipulate that personal data to their own benefit.

But what are the educational requirements that comprise a data literacy curriculum? To facilitate that data literacy education for my students, I’ve created the Data Literacy Education Framework that provides a holistic synopsis on the data and analytic training requirements – subject matter areas – for everyone to become data literate.

Figure 1: Data Literacy Education Framework

Let’s explore each of the Data Literacy Education Framework subject areas as part of a two-part series on Data Literacy. Then using this framework, we can construct an educational curriculum including testing to measure and subsequently raise the Data Literacy IQ of everyone (which I’ll have to cover in a future article…or maybe a future book).

1.  Data awareness

In “The Growing Importance of Data and AI Literacy – Part 1,” I discussed the importance of data awareness and how everyone needs to become aware of how organizations are gathering, analyzing, and using their personal data.

Data Awareness is understanding how organizations are capturing, analyzing, and using your personal data (e.g., demographic, commercial transactions, financial holdings, health and exercise, entertainment, political, and social data) to identify and codify personal behaviors and preferences that can be used to influence your actions, behaviors, and beliefs.

While most of us intrinsically know that organizations are capturing data about us, it’s the “invisible data” (or “obscured data”) that is buried in the fine print of that website or mobile app End-User Licensing Agreement that is most troubling.

Figure 2: Your Personal Data Captured by Third-party Data Aggregators

A case in point is Google, which exploits or “monetizes” your data in the following ways:

  • Google Ads. Allows businesses to target their products online based upon your personal activity and interests. Google uses AI to create a profile of customers’ behaviors and leverages the insights to target the right individual with the right ad.
  • Gmail. Google has also integrated several AI and ML algorithms to enhance the customer experience. One AI feature is the smart reply: Google AI analyzes the entire email and proposes a reply.
  • Google Assistant. Based upon your requests, this voice assistant can learn your interests to search for anything – music playlists, restaurants, best beaches, or hotels – and make product and service recommendations based upon your interests.
  • Google Maps. Google Maps uses AI to track the driver’s route, estimate where they are headed, and guide them to their destination. It recommends nearby restaurants, gas stations, etc. based upon your interests.
  • Google Photos. Google uses AI to mine your photos and suggest images and videos that you can share with your friends and family.

Many organizations, like Google, provide “value” in return for your personal data such as free email, free social media platforms, personalized web experiences, free online games, free navigational services, and product and service discounts (in the case of loyalty programs). It’s just that users need to be aware that there is a “price” for these “free” services, even if the price isn’t as obvious as a monthly subscription fee.

What can you do to protect yourself? This data literacy framework can help you find the answer. The first step is an awareness of where and how organizations are capturing and exploiting your personal data for their own monetization purposes. Be aware of what data you are sharing via the apps on your phone, the customer loyalty programs to which you belong, and your engagement data on websites and social media.  But even then, there will be questionable organizations that will skirt the privacy laws to capture more of your personal data for their own nefarious acts (spam, phishing, identity theft, ransomware, and more).

2. Decision literacy

Whether we are aware of it or not, everyone creates a “model” to guide their decisions. In my blog “Making Informed Decisions in Imperfect Situations,” I discussed how humans naturally create decision models to support their decisions, whether it’s deciding what route to take home from work, what to pick up at the grocery store, or how to pitch to a power baseball hitter like Mike Trout. And the comprehensiveness of the decision model depends upon the importance of the decision and the costs associated with making a wrong decision.

  • With a high-impact decision such as buying a house, buying a car, or deciding where to go on vacation, we build fairly extensive models by gathering and assessing a wide variety of data in order to aid in making an “optimal” decision.
  • Other decisions are less impactful, so we use “rules of thumb” or heuristic decision models to support decisions such as changing the oil in your car every 3,000 miles, seeing a dentist every 6 months, or changing your underwear at least once a week.

Figure 3: Informed Decision-making Framework

Decision Literacy is an awareness of how humans make decision models – some very comprehensive and others using “rules of thumb” depending upon the cost of the decision being wrong – to help us make more informed, more accurate, more profitable, and safer decisions.

When making decisions, how one frames the decision is everything. If you come into this process with your mind already made up (i.e., to prove or validate a decision that you have already made), then you will gravitate towards data that supports your position and fabricate reasons to ignore the data that runs counter to your position.  If you have a vested interest in a certain decision outcome, then your objectivity is threatened, and the results of your analysis are likely to be biased.

Also, the human brain is a poor decision-making tool. Human decision-making evolved from millions of years of survival on the savanna. Humans became very good at pattern recognition and extrapolation: from “That looks just like a harmless log behind that patch of grass,” to “Yum, that looks like an antelope!” to “YIKES, that’s actually a saber-toothed tiger!!” Necessity dictated that we become very good at recognizing patterns and making quick, instinctive survival decisions based upon those patterns.

To make matters worse, humans are lousy number crunchers (guess we didn’t need to crunch many numbers to spot that saber-toothed tiger). Consequently, humans have learned to rely upon heuristics, gut feel, rules of thumb, anecdotal information, and intuition as our decision models. But these decision models are inherently flawed and fail us in a world of very large, widely varied, high-velocity data sources.  

One only needs to visit Las Vegas to see our human decision-making flaws at work. Yeah, casinos don’t build those magnificent monuments to human stupidity because they give away money.
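To put a number on that point, here is a minimal sketch of the arithmetic that gut feel tends to skip. It is my own illustration rather than anything from the framework: it assumes a standard American roulette wheel (38 pockets, a single-number bet paying 35 to 1, a red/black bet paying 1 to 1), and the function name `expected_value_per_dollar` is hypothetical.

```python
from fractions import Fraction


def expected_value_per_dollar(win_probability: Fraction, payout_to_one: int) -> Fraction:
    """Expected net gain on a 1-dollar bet that pays `payout_to_one` to 1."""
    lose_probability = 1 - win_probability
    # Win: gain the payout. Lose: forfeit the 1-dollar stake.
    return win_probability * payout_to_one - lose_probability * 1


# Assumption: American roulette wheel with 38 pockets (1-36, 0, and 00).
POCKETS = 38

# Single-number bet: 1 winning pocket, pays 35 to 1.
single_number_ev = expected_value_per_dollar(Fraction(1, POCKETS), 35)

# Red/black bet: 18 winning pockets, pays 1 to 1.
red_black_ev = expected_value_per_dollar(Fraction(18, POCKETS), 1)

print(f"Single number: {float(single_number_ev):.4f} dollars per dollar bet")
print(f"Red or black:  {float(red_black_ev):.4f} dollars per dollar bet")
```

Under these assumptions both bets come out to the same expected loss, a little over five cents per dollar wagered, however different they feel at the table. That gap between intuition and arithmetic is what the decision-making flaws above describe.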

Figure 4: Human Decision-Making Flaws

What do we need to do to increase the data literacy of our organization? In this article, I introduced the subject matter areas of the Data Literacy Education Framework, a framework that organizations, universities, high schools, and even adult education can use to create a holistic data literacy educational curriculum. Then I took a deep dive into the first two subject areas of the framework:

  • Subject Area #1: Data Awareness, which covered how everyone needs to be aware of how their personal data is being captured and used to influence or manipulate how we think and the decisions that we make.
  • Subject Area #2: Decision Literacy, which discussed how humans make models of varying complexity to make more informed and accurate decisions.

In a world where your personal data, and the preferences or biases buried in that data, are being used to directly influence your behaviors, beliefs, and decisions, we must teach data literacy to EVERYONE. Otherwise, we might be persuaded into believing that the earth is flat…

This article is part one of a two-part series.

Bill Schmarzo is an author, educator, innovator, and influencer with a career that spans more than 30 years.