srkdb
srkdb

Reputation: 815

Relative data visualization in pandas

I have some data as follows:

+---------+-------+---------+----------------+
| Machine | Event | Outcome | Duration Total |
+---------+-------+---------+----------------+
| a       |     1 | FAIL    |           1127 |
| a       |     2 | FAIL    |          56099 |
| a       |     2 | PASS    |          15213 |
| a       |     3 | FAIL    |          13891 |
| a       |     3 | PASS    |          13934 |
| a       |     4 | FAIL    |           6844 |
| a       |     5 | FAIL    |           6449 |
| b       |     1 | FAIL    |          21331 |
| b       |     2 | FAIL    |          30362 |
| b       |     3 | FAIL    |          12194 |
| b       |     3 | PASS    |           7390 |
| b       |     4 | FAIL    |          35472 |
| b       |     4 | PASS    |           7731 |
| b       |     5 | FAIL    |           7654 |
| c       |     1 | FAIL    |          16833 |
| c       |     1 | PASS    |          21337 |
| c       |     2 | FAIL    |            440 |
| c       |     2 | PASS    |          14320 |
| c       |     3 | FAIL    |           5281 |
+---------+-------+---------+----------------+

I'm trying to make a categorical scatter plot of total duration of each event and each machine. Or any other visualization to analyze them relatively.

What would be a good choice and how to go about it?

Upvotes: 0

Views: 68

Answers (1)

Ben Pap
Ben Pap

Reputation: 2579

import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x = 'Event', y = 'Duration', hue = 'Machine', col = 'Outcome', data = df)

Give this a try, its two scatter plots. X axis is the event, y axis is Duration, color of the dots is based on the machine, and there is two graphs, one for fail and next to it is another for pass. "df" is your dataframe. You can remove col = 'Outcome' to have both Fail and Pass on the same graph.

EDIT:

fig, ax = plt.subplots(figsize = (10,10))
g = sns.scatterplot(x = 'Event', y = 'Duration', hue = 'Machine', data = df[df['Outcome'] == 'PASS'], ax = ax)
g = sns.scatterplot(x = 'Event', y = 'Duration', hue = 'Machine', data = df[df['Outcome'] == 'FAIL'], ax = ax, 
                    style = 'Machine', markers = ['x', 'x', 'x'])

handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, ['Machine - Pass', 'a' ,'b', 'c', 'Machine - Fail', 'a','b','c'])

plt.show()

Upvotes: 1

Related Questions