Tobitor

Reputation: 1508

How can I create a plot like this? I do not know what type of plot it is

I have data which looks like this:

file | timestamps
1 | 02/01/1970 
1 | 03/01/1970 
1 | 04/01/1970
1 | 05/01/1970 
2 | 06/01/1970
2 | 07/01/1970
3 | 08/01/1970
3 | 09/01/1970
3 | 10/01/1970

I would like the x-axis to show the number of rows per file and the y-axis to show the timestamps. The result should look similar to the plot below, but I do not know how to create it. Is this a waterfall plot?

(image of the example plot, not preserved)

Upvotes: 0

Views: 71

Answers (1)

Pierre Debaisieux

Reputation: 178

There is not much data to work with, but here is the result for your example:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import pandas as pd

data = [[1, '02/01/1970'],
        [1, '03/01/1970'],
        [1, '04/01/1970'],
        [1, '05/01/1970'],
        [2, '06/01/1970'],
        [2, '07/01/1970'],
        [3, '08/01/1970'],
        [3, '09/01/1970'],
        [3, '10/01/1970']]

df = pd.DataFrame(data, columns = ['file', 'timestamps'])
df['timestamps'] = pd.to_datetime(df['timestamps'], format = '%d/%m/%Y')

tot_delta_d = 0
tot_file = 0

fig, ax = plt.subplots()

for f in df['file'].unique():
  rows = df[df['file'] == f]  # all rows belonging to this file
  n_rows = rows.shape[0]
  delta_d = rows['timestamps'].max() - rows['timestamps'].min()

  # x: rows per file (stacked left to right), y: days covered by the file,
  # so the rectangle matches the axis labels set below
  rect = patches.Rectangle((tot_file, tot_delta_d),
                           n_rows,
                           delta_d.days,
                           color='indigo')

  ax.add_patch(rect)

  tot_delta_d += delta_d.days
  tot_file += n_rows

plt.xlim([0, tot_file])
plt.ylim([0, tot_delta_d])
ax.set_xlabel('Parquets')
ax.set_ylabel('Timestamps')
ax.invert_yaxis()

plt.show()

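As a possible alternative to filtering the frame repeatedly inside the loop, the per-file row count and time span can be computed in one pass with `groupby`. This is only a sketch on the example data. The column names `n_rows`, `start`, `end`, `span_days`, `row_offset` and `day_offset` are made up for illustration and are not part of the code above.

```python
import pandas as pd

data = [[1, '02/01/1970'], [1, '03/01/1970'], [1, '04/01/1970'],
        [1, '05/01/1970'], [2, '06/01/1970'], [2, '07/01/1970'],
        [3, '08/01/1970'], [3, '09/01/1970'], [3, '10/01/1970']]

df = pd.DataFrame(data, columns=['file', 'timestamps'])
df['timestamps'] = pd.to_datetime(df['timestamps'], format='%d/%m/%Y')

# One row per file: number of rows plus first and last timestamp
# (hypothetical column names, chosen for this sketch only).
summary = df.groupby('file')['timestamps'].agg(
    n_rows='count', start='min', end='max')

# Whole days between the first and last timestamp of each file.
summary['span_days'] = (summary['end'] - summary['start']).dt.days

# Running totals before each file, i.e. where its rectangle would begin.
summary['row_offset'] = summary['n_rows'].cumsum() - summary['n_rows']
summary['day_offset'] = summary['span_days'].cumsum() - summary['span_days']

print(summary)
```

Each row of `summary` then holds the size of one rectangle (`n_rows`, `span_days`) and its starting offsets, which can be passed to `patches.Rectangle` in whichever axis order is wanted.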

Upvotes: 1
