Welcome to episode 23 of Data Viz Today. How can visualizing streaks in your data help you find patterns and gain perspective? Host Alli Torban dives into specific ways you can see your data in a new way by highlighting the period of time that something is happening - a streak! Featured data visualization by Frank Elavsky perfectly demonstrates how streaks are not only beautiful, but can also reveal patterns and new questions.
Welcome! I'm Alli Torban.
00:55 - Today’s featured data viz project is called “Hot Streaks” by Frank Elavsky
01:03 - Frank’s a data visualization specialist who collaborates with and trains researchers at Northwestern University.
01:37 - He was connected to the researcher Dashun Wang (Professor at Kellogg School of Management at Northwestern) to create a data visualization for his recent research results on creative hot streaks - the idea that winning begets more winnings.
02:00 - They looked at the careers of thousands of movie directors, artists and scientists and studied their hot streaks - the periods when someone’s performance was substantially better than their typical performance. This was determined by the 3 highest IMDb movie scores for directors, the top 3 sales for artists and the 3 most cited publications for scientists.
02:30 - Almost everyone experienced at least one hot streak during their career, and it usually lasted a few years, but there’s no way to determine when it’ll happen.
02:45 - They wanted to create a viz that was beautiful and would captivate their audience, but also convey how ambitious the model and the datasets were.
03:07 - Frank went through a very iterative process, doing about 40 different sketches before sending a few over to the research team for feedback.
03:23 - He started by coding one line to represent each person’s career, and he stacked them on top of each other so they all had the same beginning and end point. He highlighted each hot streak in orange-gold colors (which were inspired by God of War concept art), but leaving it like this made it look like everyone had a hot streak that lasted their entire career.
03:45 - Frank explained that it’s a result of perceptual biases that we have in our visual system: we see brightness as a visual feature much more readily than we see darkness.
04:00 - So he knew that he had to sort the data meaningfully in order to make the design meaningful. He ended up using a sorting algorithm that he called “streak middle,” which finds the first time someone begins a hot streak and the last time someone ends a hot streak. It then finds the point in their career that is midway between these two and sorts based on that position, from people whose streak middle was early-career to those whose streak middle was late-career.
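To make the “streak middle” idea concrete, here is a minimal Python sketch of the sort as described above. It is not Frank’s actual code: the `Career` class, the field names and the sample careers are all hypothetical, and dividing by career length is an assumption made so that careers of different lengths can share the same start and end points.

```python
from dataclasses import dataclass


@dataclass
class Career:
    name: str
    length: int                      # total career length, e.g. in years (hypothetical field)
    streaks: list[tuple[int, int]]   # (start, end) of each hot streak, measured from career start


def streak_middle(career: Career) -> float:
    """Midpoint between the first streak start and the last streak end,
    as a fraction of the career. Assumes at least one streak."""
    first_start = min(start for start, _ in career.streaks)
    last_end = max(end for _, end in career.streaks)
    midpoint = (first_start + last_end) / 2
    # Assumption: normalize by career length so all careers line up.
    return midpoint / career.length


# Made-up careers, for illustration only.
careers = [
    Career("A", 20, [(12, 16)]),
    Career("B", 30, [(3, 6), (9, 12)]),
    Career("C", 10, [(4, 7)]),
]

# Early streak middles first, late streak middles last.
ordered = sorted(careers, key=streak_middle)
print([c.name for c in ordered])  # ['B', 'C', 'A']
```

Each sorted career would then be drawn as one stacked line, with the streak segments highlighted.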
04:45 - This is what led to the final viz where you could start to see a pattern that suggested that directors tend to streak much earlier than artists and scientists, while scientists seem to streak the latest. The viz was not only cool to look at, it actually raised new questions for the research team - why do people in certain careers tend to streak earlier in their careers? Can they build a model to explore this relationship?
05:30 - He said the hardest part though was that they had a very short window for completion, so he had to rapidly iterate and make a lot of decisions about which features to drop.
06:25 - Other applications of streaks:
Sports - stack the football seasons for one team on top of each other and see if there’s a pattern to when your team has a winning or losing streak
Marketing - like streaks of high traffic to your website every week. Are there patterns throughout the year? If you’re looking to launch a product, maybe you want to time it with your longest streaks of heavy traffic.
Or visualize streaks in gambling, financial markets, video gaming, or tracking health related streaks like visualizing the days you go without smoking - is there a pattern in when your streaks occur or are broken?
Example about my podcast release day - I could visualize the hot streaks of downloads every week by stacking every week on top of each other, and highlight the 3 consecutive days with the highest number of downloads in every week, and see if there’s a pattern or general trend… Maybe my download streaks are actually over the weekend so it’d be better to release then to better fit you guys’ actual listening habits. If I felt like I needed to experiment with something like that, then first analyzing the pattern in my download streaks might be a good first step.
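For the podcast download example, one possible way to find each week’s streak is a sliding 3-day window. This is only a sketch: the function name and the download numbers are made up for illustration, and each week is assumed to be a list of 7 daily counts running Monday to Sunday.

```python
def best_three_day_window(week):
    """Return (start_index, total) for the 3 consecutive days
    with the highest combined downloads. Ties go to the earliest window."""
    totals = [sum(week[i:i + 3]) for i in range(len(week) - 2)]
    start = max(range(len(totals)), key=totals.__getitem__)
    return start, totals[start]


# Hypothetical daily downloads, Monday (index 0) to Sunday (index 6).
weeks = [
    [120, 90, 60, 50, 40, 30, 20],
    [30, 40, 50, 60, 100, 110, 90],
]

windows = [best_three_day_window(week) for week in weeks]
print(windows)  # [(0, 270), (4, 300)]
```

Stacking the weeks and highlighting days `start` through `start + 2` in each row would show whether the streak tends to land on the same days every week.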
08:20 - Inspired viz about putting recent rain streaks and rainfall in perspective for July in the D.C. area.
09:10 - I downloaded the daily rainfall from 1948 to 2018 for the Reagan National weather station from the National Oceanic and Atmospheric Administration website
09:24 - After I downloaded the data, I added a new column to represent streaks. I used an Excel formula to count consecutive rain days, and then pulled my CSV into Tableau.
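If you would rather build the streak column in code than in Excel, the same running count could look something like this in Python. This is a sketch and not the formula used in the episode: it assumes a “rain day” is any day with rainfall above zero, and the sample values are made up.

```python
def rain_streaks(daily_rain):
    """For each day, count how many consecutive rain days have occurred
    up to and including that day. The count resets to 0 on a dry day."""
    streaks = []
    run = 0
    for amount in daily_rain:
        # Assumption: any rainfall above zero counts as a rain day.
        run = run + 1 if amount > 0 else 0
        streaks.append(run)
    return streaks


# Hypothetical daily rainfall in inches.
sample = [0.0, 0.2, 1.1, 0.0, 0.5, 0.3, 0.4]
streak_column = rain_streaks(sample)
print(streak_column)  # [0, 1, 2, 0, 1, 2, 3]
```

The resulting list can be written out as a new column next to the daily rainfall before loading the file into a charting tool.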
09:36 - Andy Kriebel’s Visual Vocabulary with a bunch of useful charts that you can download!
10:25 - I struggled with simply showing rain streaks and amount of rainfall. I’m still iterating!
11:20 - You can immediately take away that there’s not really any pattern and that the recent rain streak was pretty long and heavy compared to past years but not the biggest.
11:37 - So my final takeaway is that next time you have data over time, think about whether visualizing a hot streak could reveal any useful patterns or insight or give you a different perspective on your data. Line up your data with the same start and end points and stack ’em up!
11:55 - Frank’s advice to newbie designers: learn to sketch out your data viz, and more importantly, consult with your audience - it’s the best way for you to know how effective your data viz is - you need to know how your intended audience is perceiving and understanding your message in order to make a really powerful data visualization.
12:26 - Follow Frank on Twitter!
Have you ever created a streak data viz? Comment below! :)