Reputation: 37
Maybe it's a trivial question, but I can't find an answer to this problem. I have a dataframe with these columns:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df.columns
Index(['label', 'num.feature 1', 'num.feature 2', 'num.feature 3',
'num.feature 4', 'num.feature 5',...,'num.feature 30'],
dtype='object')
I would like to find a way to access the columns by putting an index variable i in the column name:
for i in range(30):
    df['num.feature **i**'].hist(bins=90,range=(0,0.4))
For example, to plot a histogram for each column. Are there better ways to do it? Thank you in advance.
Upvotes: 1
Views: 1320
Reputation: 294218
pandas.DataFrame.filter
df.filter(regex=r'^num\.feature \d+$').hist()
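To see what the regex selects before plotting, here is a minimal sketch on a made-up frame. The random data and the 100-row size are assumptions for illustration only; the column names follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: a label plus 30 numeric features.
rng = np.random.default_rng(0)
names = [f"num.feature {i}" for i in range(1, 31)]
df = pd.DataFrame(rng.uniform(0, 0.4, size=(100, 30)), columns=names)
df.insert(0, "label", rng.integers(0, 2, size=100))

# The raw string keeps \d intact, and \. matches a literal dot.
selected = df.filter(regex=r"^num\.feature \d+$")
print(selected.shape)  # (100, 30): the 'label' column is excluded
```

Calling .hist() on the selected frame then draws one subplot per column.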
Upvotes: 1
Reputation: 51335
Here are two ways:
Method 1: Extract your features of interest beforehand, and iterate through those. IMO, this is cleaner, because it still works if a feature name is missing (for example, if you don't have num.feature 6).
features = [i for i in df.columns if i.startswith('num.feature')]
for feature in features:
    plt.hist(df[feature], bins=90, range=(0,0.4))
Method 2: Create the relevant feature names on the fly (you'll run into trouble if a feature name is missing).
for i in range(1,31):
    plt.hist(df['num.feature '+str(i)])
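Note that repeated plt.hist calls in one loop draw onto the same axes, so the histograms overlap. One possible way around that is a grid of subplots, one per feature. This is only a sketch: the six-column random frame, the 2x3 grid, and the Agg backend line are assumptions made so the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical small frame with six features instead of thirty.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {f"num.feature {i}": rng.uniform(0, 0.4, 200) for i in range(1, 7)}
)

features = [c for c in df.columns if c.startswith("num.feature")]

# One axes object per feature, so the histograms do not overlap.
fig, axes = plt.subplots(2, 3, figsize=(9, 5))
for ax, feature in zip(axes.ravel(), features):
    ax.hist(df[feature], bins=90, range=(0, 0.4))
    ax.set_title(feature)
fig.tight_layout()
```

For all 30 features, a 5x6 grid with a larger figsize would follow the same pattern.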
Upvotes: 1
Reputation: 1016
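You are pretty much there; you just need to add format. Note that the columns are numbered 1 to 30, so the loop should use range(1, 31) rather than range(30).
for i in range(1, 31):
    df['num.feature {}'.format(i)].hist(bins=90,range=(0,0.4))
That should work now.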
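If some numbered columns might be absent, building the names first and checking them against df.columns avoids a KeyError in the loop. A minimal sketch, using an f-string in place of format and a made-up frame that deliberately lacks num.feature 6:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: features 1..30 with number 6 deliberately left out.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {f"num.feature {i}": rng.uniform(0, 0.4, 50) for i in range(1, 31) if i != 6}
)

# Build the expected names, then split them into present and missing.
wanted = [f"num.feature {i}" for i in range(1, 31)]
present = [name for name in wanted if name in df.columns]
missing = [name for name in wanted if name not in df.columns]

print(len(present), missing)
```

Looping over present instead of range(1, 31) then only touches columns that exist.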
Upvotes: 1