To follow this tutorial, you will need the following libraries installed in your current environment: NumPy, pandas, and Plotly.
I do most of my work in Jupyter Notebook and JupyterLab, but feel free to follow along in whichever IDE you prefer. Please refer to the Plotly getting started documentation, especially if you use Jupyter tools as I do. There are a few lines of code you have to run the first time you use Plotly in Jupyter, but I assure you it’s nothing complicated.
When you’re confident you can display Plotly figures, you can start by importing the aforementioned libraries with their usual aliases. Plotly provides a high-level API called Plotly Express, which is what we’ll be using.
import numpy as np
import pandas as pd
import plotly.express as px
If you are using my dataset and have it in your working directory, you can load it as shown below. If you are using another dataset, feel free to use your favorite method to get it into a pandas dataframe.
df = pd.read_csv('path_to_purchase.csv')
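If you don't have the CSV at hand, a small stand-in dataframe is enough to follow along. The sketch below assumes the data has one row per combination of the four categorical columns used later in this tutorial, plus a Customer Count column. The category labels and counts are invented for illustration and are not taken from the real dataset, and the name sample_df is just a placeholder.

```python
import pandas as pd

# Made-up rows for illustration only: labels and counts are NOT from the real dataset.
rows = [
    ("Segment A", "Organic", "Full-price item", "No coupon", 120),
    ("Segment A", "Organic", "Discounted item", "Coupon", 30),
    ("Segment A", "Ad platform", "Discounted item", "Coupon", 45),
    ("Segment B", "Organic", "Full-price item", "No coupon", 80),
    ("Segment B", "Ad platform", "Full-price item", "No coupon", 15),
    ("Segment B", "Ad platform", "Discounted item", "Coupon", 60),
]

# Column names match the ones used in the sunburst calls in this tutorial.
columns = ["Segment", "Customer Origin", "First Click", "Coupon Usage", "Customer Count"]

sample_df = pd.DataFrame(rows, columns=columns)
print(sample_df.head())
```

Assigning sample_df to df should let the rest of the code run unchanged, although the resulting chart will of course only reflect the invented numbers.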
Sunburst charts are fairly straightforward to build using Plotly Express, especially from a pandas dataframe. There are three key things to specify: the dataframe, with the data_frame argument; the columns to use and their hierarchy, from the innermost ring outwards, with the path argument; and the column that determines the size of each part of the chart, with the values argument.
fig = px.sunburst(
    data_frame=df,
    # Hierarchy of the rings, from the center outwards
    path=['Segment', 'Customer Origin', 'First Click', 'Coupon Usage'],
    values='Customer Count'
)
fig.show()
The code above generates the following chart:
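To make the roles of path and values more concrete, here is a rough sketch of where the wedge sizes come from. As far as I understand it, Plotly Express sums the values column over all rows that share the same path prefix, so the same numbers can be reproduced with a pandas groupby. The toy dataframe and the names df_toy and ring_sizes are made up for this illustration.

```python
import pandas as pd

# Toy data, invented for illustration (not the real dataset).
df_toy = pd.DataFrame({
    "Segment": ["A", "A", "A", "B"],
    "Customer Origin": ["Organic", "Organic", "Ad platform", "Organic"],
    "Customer Count": [100, 50, 30, 20],
})

path = ["Segment", "Customer Origin"]

# For each depth of the hierarchy, sum the values over the rows
# that share the same path prefix.
ring_sizes = {
    depth: df_toy.groupby(path[:depth])["Customer Count"].sum()
    for depth in range(1, len(path) + 1)
}

print(ring_sizes[1])  # inner ring: one wedge per Segment
print(ring_sizes[2])  # outer ring: one wedge per (Segment, Customer Origin)
```

Each wedge in an outer ring is therefore a slice of its parent wedge, which is why the order of the columns in path matters.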
There’s room for some formatting improvements. I won’t go into the details of each argument, but the keywords are explicit about what they do, so they shouldn’t pose a problem.
By adding the following lines of code we can generate a better-looking chart:
fig = px.sunburst(
    data_frame=df,
    path=['Segment', 'Customer Origin', 'First Click', 'Coupon Usage'],
    values='Customer Count',
    # Formatting options: chart title, figure height in pixels, and visual theme
    title='Paths to purchase',
    height=700,
    template='ggplot2'
)
fig.show()
One of the best aspects of Plotly is that, unlike Matplotlib, the charts it generates are interactive. This adds a usability layer that can enhance your exploratory analysis and provide that wow factor when presenting. Think Tableau or Power BI but available through Python.
By leveraging this visualization and its interactivity, we can now see the following in our chart:
- Most of the customers arrive at our website organically
- Customers who arrive organically tend to buy more non-discounted items, while customers who come from ad platforms prefer discounted items
- On top of that, customers who come from ad platforms and buy discounted items tend to use coupons with their purchases much more than other types of customers
Of course, this would trigger additional questions that we could then use to drive further analysis, but the sunburst chart enabled us to perform a quick exploration of the data.
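Observations read off a chart like this can be cross-checked with a couple of pandas aggregations. The sketch below computes the share of customers per origin and the coupon usage within each origin. The numbers are invented for illustration and do not come from the real dataset, and df_toy, origin_share and coupon_share are placeholder names.

```python
import pandas as pd

# Toy data, invented for illustration (not the real dataset).
df_toy = pd.DataFrame({
    "Customer Origin": ["Organic", "Organic", "Ad platform", "Ad platform"],
    "Coupon Usage": ["No coupon", "Coupon", "No coupon", "Coupon"],
    "Customer Count": [150, 50, 20, 80],
})

# Share of all customers that arrive through each origin.
origin_share = (
    df_toy.groupby("Customer Origin")["Customer Count"].sum()
    / df_toy["Customer Count"].sum()
)

# Customer counts per origin (rows) and coupon usage (columns).
counts = df_toy.pivot_table(
    index="Customer Origin",
    columns="Coupon Usage",
    values="Customer Count",
    aggfunc="sum",
    fill_value=0,
)

# Divide each row by its total to get coupon usage shares within each origin.
coupon_share = counts.div(counts.sum(axis=1), axis=0)

print(origin_share)
print(coupon_share)
```

With the real dataframe, the same pattern can be applied to any pair of columns from the path list.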