Stacey

Reputation: 5107

Selecting columns from a DataFrame based on the contents of a list

I have a dataframe df that looks like:

      record_id  month    day   year   plot species    sex    wgt
0         False  False  False  False  False    True  False   True
1         False  False  False  False  False    True  False   True
2         False  False  False  False  False   False  False   True
3         False  False  False  False  False   False  False   True
4         False  False  False  False  False   False  False   True
5         False  False  False  False  False   False  False   True
6         False  False  False  False  False   False  False   True
7         False  False  False  False  False   False  False   True
8         False  False  False  False  False   False  False   True
9         False  False  False  False  False   False  False   True
10        False  False  False  False  False   False  False   True
11        False  False  False  False  False   False  False   True

I have a list called list, which contains a subset of the column headers in df: ['month', 'plot', 'sex']

Is there a way to apply the list to the dataframe so that only the columns named in list are returned as a new dataframe? The new dataframe would look like:

          month   plot    sex
0         False  False  False
1         False  False  False
2         False  False  False
3         False  False  False
4         False  False  False
5         False  False  False
6         False  False  False
7         False  False  False
8         False  False  False
9         False  False  False
10        False  False  False
11        False  False  False

I have tried df1 = df[list] without success.

Upvotes: 2

Views: 118

Answers (3)

Bharath M Shetty

Reputation: 30605

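list is a Python builtin. If you index the dataframe with the builtin itself, as in df[list], pandas calls the callable key on the dataframe, and list(df) yields every column name, so the entire dataframe is returned. Assigning a value to a builtin name is not recommended. As the other answers suggest, store the column names under a different variable name and index with that.

For example, with list still referring to the builtin: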

df[list].head(4)
   record_id  month    day   year   plot  species    sex   wgt
0      False  False  False  False  False     True  False  True
1      False  False  False  False  False     True  False  True
2      False  False  False  False  False    False  False  True
3      False  False  False  False  False    False  False  True

If k = ['record_id','month'], then df[k] will return:

   record_id  month
0      False  False
1      False  False
2      False  False
3      False  False
...
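
A minimal sketch of the behaviour described above, assuming a recent pandas version in which a callable key passed to df[...] is called on the dataframe. The toy frame and the name cols are illustrative and not taken from the question.

```python
import pandas as pd

# Toy frame with the same headers as in the question (values are placeholders).
headers = ["record_id", "month", "day", "year", "plot", "species", "sex", "wgt"]
df = pd.DataFrame(False, index=range(4), columns=headers)

# `list` is still the builtin type here. pandas calls a callable key on the
# frame, and list(df) is the list of column names, so every column comes back.
everything = df[list]
print(everything.shape)

# Storing the names under a non-builtin name gives the intended subset.
cols = ["month", "plot", "sex"]  # hypothetical variable name
subset = df[cols]
print(subset.columns.tolist())
```

If list had been rebound to an actual list of column names, df[list] would select those columns, but the builtin would then be shadowed for the rest of the session, which is why a different name is the safer choice.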

Upvotes: 0

BENY

Reputation: 323386

Use isin on the column index to build a boolean mask:

df.loc[:,df.columns.isin(['month','plot','sex'])]
Out[165]: 
    month   plot    sex
0   False  False  False
1   False  False  False
2   False  False  False
3   False  False  False
4   False  False  False
5   False  False  False
6   False  False  False
7   False  False  False
8   False  False  False
9   False  False  False
10  False  False  False
11  False  False  False
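
One practical difference from plain list indexing, sketched here under the assumption of a recent pandas version: the isin mask keeps the dataframe's own column order and silently skips names that are not present, whereas df[names] raises KeyError for a missing name. The name not_a_column is made up for illustration.

```python
import pandas as pd

headers = ["record_id", "month", "day", "year", "plot", "species", "sex", "wgt"]
df = pd.DataFrame(False, index=range(3), columns=headers)

# Names in a different order than in df, plus one that does not exist.
wanted = ["sex", "month", "plot", "not_a_column"]

# Boolean mask over df.columns: True where the column name is in `wanted`.
mask = df.columns.isin(wanted)
subset = df.loc[:, mask]
print(subset.columns.tolist())

# Plain list indexing is stricter and fails on the unknown name.
try:
    df[wanted]
    strict_failed = False
except KeyError:
    strict_failed = True
print(strict_failed)
```

Which behaviour is preferable depends on whether a missing column should be treated as an error or ignored.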

Upvotes: 3

Scott Boston

Reputation: 153550

If I understand correctly:

l = ['month','plot','sex']

df[l]

Output:

    month   plot    sex
0   False  False  False
1   False  False  False
2   False  False  False
3   False  False  False
4   False  False  False
5   False  False  False
6   False  False  False
7   False  False  False
8   False  False  False
9   False  False  False
10  False  False  False
11  False  False  False
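
For completeness, a small self-contained version of the same idea, with an explicit .copy() so that later edits to the result cannot touch df. The toy data and the name new_df are assumptions, and df.loc[:, l] is shown only as an equivalent spelling.

```python
import pandas as pd

# Rebuild a frame shaped like the one in the question (placeholder values).
headers = ["record_id", "month", "day", "year", "plot", "species", "sex", "wgt"]
df = pd.DataFrame(False, index=range(12), columns=headers)
df["wgt"] = True

l = ["month", "plot", "sex"]

# Select the listed columns; .copy() makes the independence explicit.
new_df = df[l].copy()

# Label-based spelling that selects the same columns.
same = df.loc[:, l]

# Changing the new frame leaves the original untouched.
new_df["month"] = True
print(new_df.shape)
```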

Upvotes: 1
