Feary

Reputation: 37

How to use an index to recall a label of a dataframe pandas

Maybe it's a trivial question, but I can't find an answer to this problem. I have a dataframe with these columns:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df.columns
Index(['label', 'num.feature 1', 'num.feature 2', 'num.feature 3',
   'num.feature 4', 'num.feature 5',...,'num.feature 30'],
  dtype='object')

I would like to access the columns by putting an index variable i into the column name:

for i in range(30):
    df['num.feature **i**'].hist(bins=90,range=(0,0.4))

For example, to plot a histogram for each column. Is there a better way to do it? Thank you in advance.

Upvotes: 1

Views: 1320

Answers (3)

piRSquared

Reputation: 294218

pandas.DataFrame.filter

df.filter(regex=r'^num\.feature \d+$').hist()
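
To see which columns such a pattern picks up before plotting, here is a minimal sketch. The small frame below is a hypothetical stand-in for the asker's data, not the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: a label column plus three features.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "label": [0, 1, 0, 1],
    "num.feature 1": rng.uniform(0, 0.4, 4),
    "num.feature 2": rng.uniform(0, 0.4, 4),
    "num.feature 10": rng.uniform(0, 0.4, 4),
})

# The raw string keeps \d intact, and \. matches a literal dot only.
features = df.filter(regex=r"^num\.feature \d+$")
selected = list(features.columns)
print(selected)
```

The label column is left out because it does not match the pattern, so calling .hist() on the filtered frame only plots the numeric features.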

Upvotes: 1

sacuL

Reputation: 51335

Here are two ways:

Method 1: Extract your features of interest beforehand and iterate through those. IMO this is cleaner, because it still works if a feature name is missing (e.g. if you don't have num.feature 6).

features = [i for i in df.columns if i.startswith('num.feature')]

for feature in features:
    plt.hist(df[feature], bins=90, range=(0,0.4))

Method 2: Create the relevant feature names on the fly (you'll run into trouble if a feature name is missing).

for i in range(1,31):
    plt.hist(df['num.feature '+str(i)])
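
One thing to keep in mind with both loops: repeated plt.hist calls draw onto the same current axes, so the histograms end up overlaid. If you want one plot per feature, a rough sketch along these lines may help. The random data and the four-feature frame are assumptions for illustration only, and the Agg backend is set just so the example runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: a label column and four features with values in [0, 0.4).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {f"num.feature {i}": rng.uniform(0, 0.4, 100) for i in range(1, 5)}
)
df.insert(0, "label", rng.integers(0, 2, 100))

features = [c for c in df.columns if c.startswith("num.feature")]

# One row of axes per feature; squeeze=False keeps a 2-D array even for one feature.
fig, axes = plt.subplots(
    nrows=len(features), ncols=1, squeeze=False,
    figsize=(6, 2 * len(features)),
)
for ax, feature in zip(axes.ravel(), features):
    ax.hist(df[feature], bins=90, range=(0, 0.4))
    ax.set_title(feature)
fig.tight_layout()
```

Calling fig.savefig(...) or plt.show() afterwards displays the stacked histograms.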

Upvotes: 1

ak_slick

Reputation: 1016

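You are pretty much there. You just need to add format, and start the range at 1, since the columns are numbered 1 to 30 (range(30) would look for 'num.feature 0' and raise a KeyError).

for i in range(1, 31):
    df['num.feature {}'.format(i)].hist(bins=90,range=(0,0.4))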

Should be good now.
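
On Python 3.6+ an f-string does the same job as format. The sketch below also skips names that are not in the frame, which avoids a KeyError when a feature is missing. The tiny frame with a gap at feature 2 is made up for illustration:

```python
import pandas as pd

# Hypothetical frame with a gap: there is no 'num.feature 2' column.
df = pd.DataFrame({
    "label": [0, 1],
    "num.feature 1": [0.1, 0.2],
    "num.feature 3": [0.3, 0.05],
})

found = []
for i in range(1, 4):
    name = f"num.feature {i}"  # same result as 'num.feature {}'.format(i)
    if name not in df.columns:
        continue  # skip missing features instead of raising KeyError
    found.append(name)
    # df[name].hist(bins=90, range=(0, 0.4)) would go here

print(found)
```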

Upvotes: 1
