Pieter DS

Reputation: 19

Python pandas iterate over specific columns

I'm trying to make a bar chart of some specific columns of a dataset:

for kolom_naam in attributen_dataset:
    if kolom_naam in categorische_var:
        print(kolom_naam)
        attributen_dataset.kolom_naam.value_counts().plot(kind='bar')

where attributen_dataset is a large dataframe and categorische_var is a list of column names (strings) from attributen_dataset.

I don't know the correct syntax for selecting a column using the kolom_naam loop variable. The rest works: print(kolom_naam) prints the expected column names.

Thanks!!!!!

Upvotes: 0

Views: 206

Answers (2)

MEdwin

Reputation: 2960

I have made another version based on the comment you made on my earlier answer. In this version, the bar charts are created in a loop over the columns in categorische_var, which is a list of selected column names from the original dataframe attributen_dataset. So you get a separate bar chart for each column you want.

Let me know if it works.

See the mockup below:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

y = np.random.rand(10,4)
y[:,0]= np.arange(10)

attributen_dataset = pd.DataFrame(y, columns=["X", "A", "B", "C"])

categorische_var=['A', 'C']
# one subplot per selected column
fig, axes = plt.subplots(1,len(categorische_var), figsize=(12,3))
for i, kolom_naam in enumerate(categorische_var):
    # bracket indexing selects the column named by the loop variable
    attributen_dataset[kolom_naam].plot(ax=axes[i], kind='bar')
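
One caveat with the subplot loop above: if categorische_var holds only one name, plt.subplots returns a single Axes object rather than an array, so axes[i] fails. A minimal sketch of one way around that, assuming the same random mockup data, is to pass squeeze=False so that axes is always a 2D array. The Agg backend line is only there so the sketch runs without a display; it is not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same kind of random mockup data as above
attributen_dataset = pd.DataFrame(np.random.rand(10, 4), columns=["X", "A", "B", "C"])

# a single column: the case where indexing a squeezed axes object would fail
categorische_var = ["A"]

# squeeze=False makes axes a 2D array of shape (1, number of columns)
fig, axes = plt.subplots(1, len(categorische_var), figsize=(12, 3), squeeze=False)
for i, kolom_naam in enumerate(categorische_var):
    attributen_dataset[kolom_naam].plot(ax=axes[0, i], kind="bar", title=kolom_naam)
```

The same loop works unchanged for two or more columns, since axes[0, i] is valid for any list length.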

Upvotes: 0

MEdwin

Reputation: 2960

I have tried to do a full mockup using random values. Here I have used categorische_var to select columns A and C.

Let me know if it works for you:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

y = np.random.rand(10,4)
y[:,0]= np.arange(10)



attributen_dataset = pd.DataFrame(y, columns=["X", "A", "B", "C"])

categorische_var=['A', 'C']


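# print the names of the columns that also appear in categorische_var
for kolom_naam in attributen_dataset:
    if kolom_naam in categorische_var:
        print(kolom_naam)
        #attributen_dataset.kolom_naam.value_counts().plot(kind='bar')

# select all listed columns at once and plot them together as grouped bars
df_new = attributen_dataset[categorische_var]
df_new.plot(kind="bar")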
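
On the syntax asked about in the question: attributen_dataset.kolom_naam looks for a column literally called "kolom_naam", whereas attributen_dataset[kolom_naam] uses the string stored in the variable. A minimal sketch of the original value_counts idea with bracket indexing follows. The column names and values are made up for illustration, and the Agg backend line is only an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical data: two categorical columns and one numeric column
attributen_dataset = pd.DataFrame({
    "kleur": ["rood", "blauw", "rood", "groen", "rood"],
    "maat": ["S", "M", "M", "L", "M"],
    "prijs": [10.0, 12.5, 9.0, 20.0, 11.0],
})
categorische_var = ["kleur", "maat"]

counts = {}
for kolom_naam in categorische_var:
    # bracket indexing: the value of kolom_naam is used as the column label
    counts[kolom_naam] = attributen_dataset[kolom_naam].value_counts()

    # one figure per column, with one bar per distinct category
    fig, ax = plt.subplots()
    counts[kolom_naam].plot(kind="bar", ax=ax, title=kolom_naam)
```

Looping over categorische_var directly avoids the extra membership test, as long as every name in the list is actually a column of the dataframe.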

Upvotes: 2
