Infographics have been prevalent for quite a while, but few publishers or users really grasp data visualization. (Check out the New York Times “The Year in Graphics.”) You see pretty charts with colorful graphics, but after a close look at the data, it’s hard to tell what the visualized data convey. (Check out this infographic: http://tinyurl.com/ch29u2j)
Recently my coworker Steven and I picked a U.S. college dataset for a data visualization competition.
The dataset has 35 columns and more than 300 entries, with information ranging from school name, mascot, class enrollment size, and ethnic diversity to starting salary, ROI, and high job meaning. Our task was to visualize the data and transform it into a compelling story (or a functional tool). At first, we wanted to include as much information as possible, analyzing the correlation between variables like acceptance rate and 6-year graduation rate. Our initial attempt ended up as a pile of scatter plots with very low R-squared values (meaning the variables were only weakly correlated), and the message we tried to deliver wasn’t clear at all.
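To make the R-squared check concrete, here is a minimal sketch in Python of how one might score a single scatter plot. The function name and the sample numbers are made up for illustration; they are not values from our dataset, and this is not the exact tooling we used.

```python
from math import sqrt

def r_squared(xs, ys):
    """Return the squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r * r

# Hypothetical values for illustration only, not rows from the dataset.
acceptance_rate = [0.10, 0.35, 0.50, 0.65, 0.80, 0.90]
graduation_rate = [0.95, 0.70, 0.88, 0.60, 0.75, 0.55]

print(round(r_squared(acceptance_rate, graduation_rate), 2))
```

A value close to 1 means one variable tracks the other closely along a straight line; a value close to 0 means the scatter plot has little linear story to tell, which is what we kept running into.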
But after a few tries, we decided on a new approach: we thought of ourselves as high school graduates. We began to ask ourselves questions like, “How do I know what a good return on investment is?” and “Will my career choice have meaning in 10 or 20 years?” We narrowed our scope, rethought our approach, and came up with “The Perfect Fit: Which School is Right for You?” (You can see our data visualization here: http://bit.ly/1554kzH)
The interactive graphic is built around four criteria: distance from home, the applicant’s SAT / ACT score, financial limits, and class size. As you go through each criterion and make your selection, the tool filters out the schools that don’t match and presents you with a list of schools to choose from.
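The filtering idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the code behind our graphic: the field names, the thresholds, the sample schools, and the rule that a school matches when its median SAT is at or below the applicant's score are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class School:
    # Hypothetical fields standing in for columns of the college dataset.
    name: str
    distance_miles: float
    median_sat: int
    annual_cost: int
    avg_class_size: int

def perfect_fit(schools, max_distance, sat_score, budget, max_class_size):
    """Keep only the schools that satisfy all four criteria."""
    return [
        s for s in schools
        if s.distance_miles <= max_distance
        and s.median_sat <= sat_score        # assumed matching rule
        and s.annual_cost <= budget
        and s.avg_class_size <= max_class_size
    ]

# Made-up schools for illustration only.
schools = [
    School("College A", 50, 1200, 30000, 25),
    School("College B", 400, 1100, 20000, 40),
    School("College C", 120, 1450, 45000, 15),
]

matches = perfect_fit(schools, max_distance=200, sat_score=1300,
                      budget=35000, max_class_size=30)
print([s.name for s in matches])
```

Each criterion narrows the list a little further, so the reader only ever sees schools that fit every choice they made.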
It was a great learning process, and we felt pretty good about the final product. Creating the data visualization left me with three takeaways for anyone who wishes to make their own infographics meaningful:
1. Make assumptions. Develop hypotheses. And don’t be afraid to challenge yourself.
No matter how much data you have, it’s still impossible to tell a holistic story because your audiences are different. When there is too much information, you end up with a “garbage in, garbage out” situation. Therefore, it is very important to take a stance and list the data that matters. Then dive into finding interesting correlations or causation.
2. Sometimes, descriptive data is enough.
There are many types of data analysis: descriptive, inferential, predictive, and so on. But most infographics you find online serve a purely descriptive purpose. Rather than making the information too complicated to comprehend, you should be comfortable with the idea that sometimes one piece of data doesn’t have to have a relationship with another.
3. The simpler, the better (the importance of scrapping unrelated information)
It’s hard to get rid of data in the early stages because you are tempted to think you can develop so many insights from the amount of information you have. But think about this: how many points can you actually deliver in one infographic? You could create one infinite webpage and simply let your reader scroll, but this is likely to prevent the reader from getting anything out of the data set.
Have you ever created your own infographics? What tools do you use and how do you make your infographics meaningful?