Episode 17: 3 Ways to Amp-Up Your Scatter Plot! Featured Data Viz by Maarten Lambrechts
Welcome to episode 17 of Data Viz Today. How can you take your scatter plot to the next level? Host Alli Torban dives into 3 ways you can layer onto a scatter plot to enhance your reader’s understanding of the data.
Welcome! I'm Alli Torban.
01:00 - This week’s episode is about 3 things that you can layer onto a scatter plot to enhance your reader’s understanding of the data.
01:15 - Check out artist Heidi Horchler’s Instagram - she’s creating a watercolor version of my cover art!
02:00 - Today’s featured viz is called Lazy MP’s by Maarten Lambrechts. Maarten is a freelance data journalist and visualization consultant from Belgium.
02:30 - He began this project because as a data journalist, he wanted to visualize who the lazy MPs are in the Flemish Parliament.
03:20 - He scraped the data from the Parliament’s website using an R script.
04:00 - He needed to pitch the story to his editors and thought a visual would help sell it, so he decided to use a scatter plot, since he was comparing two variables to see if there were any trends.
04:38 - He used ggplot2 in R to create the initial scatter plot. To help readers quickly see where each MP fell in relation to their colleagues, Maarten added a median line for the x-axis and another for the y-axis. A median line is easy to read in a scatter plot - half the dots fall on one side and half on the other. This is #1 on my list of 3 ways to amp-up your scatter plot - add lines that help orient your reader quickly. This can be median lines like in this case, a trend line so readers can quickly see what kind of correlation there might be, or a line that’s specific to your data - for example, a line representing minimum wage, so you can easily identify the dots below or above it.
05:35 - #2 - After Maarten added the median lines, he labeled each quadrant. The top right quadrant was called the busy bees (lots of documents filed, lots of things said), then there were the silent forces (lots of documents, few things said), the talkers (few documents, lots of talking) and, of course, the lazy MPs (few documents, few things said). Each label basically sums up the implication of being in that position. I feel like it’s such a smart way to assist your reader with interpreting the chart.
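The median split and quadrant names above can be sketched in a few lines. Maarten worked in R with ggplot2, so the Python below is only an illustration, not his code. The data, the helper name `label_quadrants`, and the choice of putting documents on the x-axis and speaking on the y-axis are all assumptions made up for the example.

```python
from statistics import median

# Hypothetical data: (name, documents filed, times spoken). Not real MPs.
mps = [
    ("A", 40, 120),
    ("B", 35, 10),
    ("C", 5, 90),
    ("D", 3, 8),
    ("E", 20, 50),
]

def label_quadrants(rows):
    """Split points at the median of each axis and name each point's quadrant."""
    x_med = median(r[1] for r in rows)  # where the vertical median line goes
    y_med = median(r[2] for r in rows)  # where the horizontal median line goes
    # Keys are (above x median, above y median).
    names = {
        (True, True): "busy bees",
        (True, False): "silent forces",
        (False, True): "talkers",
        (False, False): "lazy MPs",
    }
    # A point sitting exactly on a median line counts as "low" here,
    # which is an arbitrary choice for this sketch.
    labels = {
        name: names[(docs > x_med, spoken > y_med)]
        for name, docs, spoken in rows
    }
    return x_med, y_med, labels

x_med, y_med, labels = label_quadrants(mps)
```

The two medians are the positions where the lines would be drawn in whatever plotting tool you use, and `labels` tells you which quadrant name applies to each point.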
06:25 - Another layer that Maarten added to this scatter plot was color, which introduces a third variable: the party each MP belongs to. This is #3 on ways to amp-up your scatter plot - using color, size or small multiples to represent a third variable. Be careful adding a fourth variable - that could get a little complicated. In the book Data Points, Nathan Yau suggests using color and size to represent the same third variable, so it’s a redundant encoding that really drives home the point. Maarten also used small multiples to show the party in another way.
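To make the redundant encoding idea concrete, here is a rough Python sketch in which one numeric third variable drives both marker size and color. It is an illustration only, not code from the episode or from Data Points. The function name `redundant_encoding`, the size range, and the white-to-black grey scale are all assumptions.

```python
def redundant_encoding(values, min_size=20.0, max_size=200.0):
    """Map each value to a (marker size, hex grey) pair driven by the same variable."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    encoded = []
    for v in values:
        t = (v - lo) / span  # 0.0 for the smallest value, 1.0 for the largest
        size = min_size + t * (max_size - min_size)
        grey = round(255 * (1 - t))  # larger values get darker
        encoded.append((size, f"#{grey:02x}{grey:02x}{grey:02x}"))
    return encoded

# Hypothetical third-variable values for three points.
encoded = redundant_encoding([1, 3, 5])
```

Each pair can be passed to a plotting library as that point's size and color, so bigger dots are also darker dots and the reader gets the same message twice.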
07:45 - The whole process, from scraping to visualisation, was done in R (scraping with rvest, visualising with ggplot2). Final touches for print were done in Illustrator.
08:18 - So what are the three ways you can amp up your scatter plot?
1. Add lines to help orient your reader to your data - this can be median lines, trend lines, or specific values that matter to your story
2. Name the quadrants - this helps your reader interpret the scatter plot quickly and understand the implications of a point being in a certain position.
3. Add a third variable by using color, size or small multiples - this helps your reader parse through the points a bit better to really understand the story. Just be careful not to encode too many variables. Using color, size or small multiples for the same variable can be a good thing, since the redundancy can be helpful to your reader.
09:00 - Maarten’s advice to designers just starting out - Although reading some books about data visualisation obviously helps to learn the basics and get inspired, the most important thing to start with visualisation is just making visualisations. Grab a visualisation you like and try to plug in your own data, or try to remake it with the tool of your choice. What can also be very helpful is getting feedback on your work. Not only from people from the data visualisation community, but also from people with no background in visualisation. It can be really insightful to hear what others see in your work, and what they don't see.
10:10 - Get mapping right away with my free mini-course “Make Your First Custom Map in Under 30 Minutes”.