Creating a Year-wise Bar Chart Visualization from CSV Data

Problem:

I'm working on a data visualization project where I want to create a bar chart similar to the one shown in this reference image. The image is from a story available here.

My Effort:

I've written Python code using pandas, seaborn, and matplotlib to visualize the data. Here's my code snippet:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

box = pd.read_csv("box_office_18_23.csv")

# Data Cleaning
box["overall_gross"] = box["overall_gross"].str.replace("$", "").str.replace(",", "").astype(int)

# Data Analysis
sns.barplot(x='year', y='overall_gross', data=box)
plt.show()

Output:

Here's the output of my code: Output Image

Link to Code and Dataset:

I have uploaded my Jupyter Notebook and the relevant dataset (CSV file) to this Google Drive link.

Issue:

While my code runs without errors, the resulting bar chart doesn't match the desired visualization. I'm looking for guidance on how to modify my code to achieve a similar year-wise bar chart as shown in the reference image.

Also, if other libraries or tools can do the job, let me know about those too.

Thank you for your help!

Upvotes: 2

Views: 101

Answers (1)

r-beginners

Reputation: 35230

The following is what you can create with Plotly. To prepare the data, first build a date from the year and week number so that the x-axis becomes a time series. Next, compute the weekly gross totals and a list of the top-release names for each week, then merge the two to form the graph data.

Graph customization:

  • Partial change of marker color
  • Annotation of the names of the TOP 6 films in the total
  • Annotation of source
  • Change the format of the y-axis scale
  • Movement of legend position, etc.
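
Before the full code, the date-construction step can be tried in isolation. This is a minimal sketch on made-up rows: the `demo` frame and the `week_start` column are hypothetical names, and only `year` and `week_no` follow the dataset. It uses `%u` (ISO weekday, 1 = Monday) for the trailing "-1"; the code further down uses `%w`, which should resolve to the same Monday.

```python
import pandas as pd

# Made-up rows for illustration; only the column names follow the dataset.
demo = pd.DataFrame({
    "year": [2023, 2023, 2023],
    "week_no": [1, 14, 52],
})

# "%G" = ISO year, "%V" = ISO week number, "%u" = ISO weekday (1 = Monday).
# Appending "-1" therefore asks for the Monday of each ISO week.
iso_strings = demo["year"].astype(str) + "-" + demo["week_no"].astype(str) + "-1"
demo["week_start"] = pd.to_datetime(iso_strings, format="%G-%V-%u")

print(demo)
```

ISO week 1 of 2023 starts on Monday 2023-01-02, so the first row maps to that date rather than to January 1st.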

import pandas as pd

box = pd.read_csv('./data/box_office_18_23.csv')
# Strip "$" and "," as literal characters (regex=False) before converting to int
box["overall_gross"] = box["overall_gross"].str.replace("$", "", regex=False).str.replace(",", "", regex=False).astype(int)
# Build the Monday of each ISO week from the year and week number
box['yyyy-mm-dd'] = pd.to_datetime(box['year'].astype(str) + '-' + box['week_no'].astype(str) + "-1", format='%G-%V-%w')
# Weekly gross totals and the list of top releases for each week
box_yearWeek = box[['yyyy-mm-dd','overall_gross']].groupby(['yyyy-mm-dd']).sum()
box_topRelease = box[['yyyy-mm-dd','top_release']].groupby(['yyyy-mm-dd'])['top_release'].apply(list)
box_merge = box_yearWeek.merge(box_topRelease, left_index=True, right_index=True)
box_merge.reset_index(inplace=True)
# Rows sorted by gross, used below for the TOP 6 annotations
annotations = box_merge.sort_values('overall_gross', ascending=False)

import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Bar(
    x=box_merge['yyyy-mm-dd'],
    y=box_merge['overall_gross'],
    marker=dict(color='navy'),
    name='overall_gross',
    showlegend=True
))

# Color the bar for "The Super Mario Bros. Movie" red
colors = ['navy']*len(box_merge)
colors[273] = 'red' # 273 is the row index of that week in box_merge
fig.update_traces(marker_color=colors)

# Annotation: Overall_gross TOP 6 
for row in annotations[['overall_gross','top_release']][:6].itertuples():
    fig.add_annotation(
        x=box_merge.loc[row[0], 'yyyy-mm-dd'],
        y=row[1],
        text=row[2][0],
        showarrow=True,
        arrowhead=1,
    )
# Annotation: "Source"
fig.add_annotation(xref='paper',
                   x=-0.03,
                   yref='paper',
                   y=-0.08,
                   text='Source:xxxx',
                   showarrow=False)

# Customize the y-axis tick format
fig.update_yaxes(tickformat='$.2s')

# Move the legend position
fig.update_layout(legend=dict(
    orientation="h",
    yanchor="bottom",
    y=0.9,
    xanchor="right",
    x=0.9
))
# Title font family and size, background color, margins, etc.
fig.update_layout(template='plotly_white',
                  title_text='Creating a Year-wise Bar Chart <br>Visualization from CSV Data',
                  title_font=dict(family='Rockwell', color='lightseagreen',size=48),
                  width=800,
                  height=800,
                  plot_bgcolor='rgba(224,255,255,0.5)',
                  paper_bgcolor='rgba(224,255,255,0.5)',
                  margin=dict(t=150,b=60,l=0,r=0)
                 )
fig.show()
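
The red bar above is picked by a hard-coded row index, which breaks as soon as the CSV gains or loses rows. As a hedged alternative, the colors could be derived from the film title instead. The sketch below runs on made-up data: the `weekly` frame, `target`, and `is_target` are hypothetical names, and the only assumption carried over is that `top_release` holds a list of titles per week, as in `box_merge`.

```python
import pandas as pd

# Made-up weekly rows; the column layout mirrors box_merge,
# where 'top_release' holds a list of titles for each week.
weekly = pd.DataFrame({
    "yyyy-mm-dd": pd.to_datetime(["2023-03-27", "2023-04-03", "2023-04-10"]),
    "overall_gross": [100, 300, 200],
    "top_release": [["Film A"], ["The Super Mario Bros. Movie"], ["Film B"]],
})

target = "The Super Mario Bros. Movie"

# True for every week whose title list contains the target film.
is_target = weekly["top_release"].apply(lambda titles: target in titles)

# One color per bar, in the same order as the rows.
colors = ["red" if hit else "navy" for hit in is_target]
print(colors)
```

The resulting list can be passed to `fig.update_traces(marker_color=colors)` in place of the hand-built one. If the film is the top release in several weeks, every one of those bars turns red, so keep only the first match if a single highlighted bar is wanted.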


Upvotes: 1
