JD2775

Reputation: 3801

Plotting multiple bar graphs in dynamic Python dataframe

I have a dataframe 'grid' that looks like this:

COLUMN_NM    DISTINCT_COUNT    MAX_COL_VALUE    MIN_COL_VALUE    NULL_COUNT
COL_A        123               456              111              56
COL_B        15678             222              4                3456
COL_C        18994             456              76               43
...

The data in COLUMN_NM is dynamic, as this DataFrame gets loaded with different tables for analysis. What I want to do is graph whatever data currently resides in the DataFrame: one bar graph for DISTINCT_COUNT, another for MAX_COL_VALUE, and so on, each with one bar per column. COLUMN_NM would be represented along the x-axis.

What I have so far is clearly incorrect, but it gives some idea of what I am trying to do.

distinct = grid[('COLUMN_NM', 'DISTINCT_COUNT')].plot(kind=bar)
max_col = grid[('COLUMN_NM', 'MAX_COL_VALUE')].plot(kind=bar)
min_col = grid[('COLUMN_NM', 'MIN_COL_VALUE')].plot(kind=bar)
null_cnt = grid[('COLUMN_NM', 'NULL_COUNT')].plot(kind=bar)

I have all the necessary import statements. I want the output to be 4 graphs, and I can specify more bar chart parameters once I get this working. Also, would it be easier to wrap this in a for loop or a function?

Upvotes: 0

Views: 408

Answers (1)

sacuL

Reputation: 51335

Yes, I'd recommend doing this in a loop:

for col in ['DISTINCT_COUNT', 'MAX_COL_VALUE', 'MIN_COL_VALUE', 'NULL_COUNT']:
    grid[['COLUMN_NM', col]].set_index('COLUMN_NM').plot.bar(title=col)
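
To try the loop end to end, here is a minimal sketch that rebuilds the sample rows from the question and wraps the loop in a function, drawing all the charts on one figure. The `plot_metrics` name, the stacked subplot layout, and the headless `Agg` backend are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question
grid = pd.DataFrame({
    "COLUMN_NM": ["COL_A", "COL_B", "COL_C"],
    "DISTINCT_COUNT": [123, 15678, 18994],
    "MAX_COL_VALUE": [456, 222, 456],
    "MIN_COL_VALUE": [111, 4, 76],
    "NULL_COUNT": [56, 3456, 43],
})

def plot_metrics(df, label_col="COLUMN_NM"):
    """Hypothetical helper: draw one bar chart per metric column."""
    # Every column except the label column is treated as a metric
    metrics = [c for c in df.columns if c != label_col]
    # One subplot per metric, stacked vertically; squeeze=False keeps axes 2-D
    fig, axes = plt.subplots(
        nrows=len(metrics), ncols=1,
        figsize=(6, 3 * len(metrics)), squeeze=False,
    )
    indexed = df.set_index(label_col)  # label_col values become x-axis labels
    for ax, col in zip(axes.ravel(), metrics):
        indexed[col].plot.bar(ax=ax, title=col)
    fig.tight_layout()
    return fig, axes

fig, axes = plot_metrics(grid)
```

Because the metric columns are derived from the DataFrame itself, the same call should work when a different table is loaded, as long as the label column keeps its name. Calling `fig.savefig(...)` or `plt.show()` afterwards writes or displays the result.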

The issues with your code were:

  • grid[('COLUMN_NM', 'DISTINCT_COUNT')] won't work because you are passing a tuple, which pandas looks up as a single column label. To select a subset of columns, pass a list: use [[...]] instead of [(...)]
  • You also need to set the column whose values should label the bars (COLUMN_NM) as the index, so that it appears along the x-axis
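
The two points above can be checked without plotting anything. This is a small sketch on a two-row stand-in for `grid`; the variable names `tuple_lookup_failed`, `subset`, and `indexed` are made up for the example.

```python
import pandas as pd

# Two-row stand-in for the question's DataFrame
grid = pd.DataFrame({
    "COLUMN_NM": ["COL_A", "COL_B"],
    "DISTINCT_COUNT": [123, 15678],
})

# A tuple is looked up as ONE column label, which does not exist here
try:
    grid[("COLUMN_NM", "DISTINCT_COUNT")]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# A list selects a subset of columns and returns a DataFrame
subset = grid[["COLUMN_NM", "DISTINCT_COUNT"]]

# With COLUMN_NM as the index, only the numeric column is left to plot,
# and the index values are what a bar plot uses as x-axis labels
indexed = subset.set_index("COLUMN_NM")
```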

Upvotes: 2
