How do I get which columns of a row are within some values in Pandas?

Question

I know this has to be out there in the ether, but I cannot find it. I am fluent in R, trying to figure out Pandas, and it is making me want to throw this PC out of a window. It has been a long day.

I want to be able to extract the column names of a dataframe, based on the values in the columns of some row:

foo = pd.DataFrame(
[[-1,-5,3,0,-5,8,1,2]],
columns = ('a','b','c','d','e','f','g','h')
)

foo
Out[25]: 
   a  b  c  d  e  f  g  h
0 -1 -5  3  0 -5  8  1  2

I would like to get a vector I can subset some other dataframe by:

foo >= 0

Gives me another dataframe, which I cannot use to subset a vector (series? whatever you people refer to it as??)

I want to do something like this:

otherDF[ foo >= 0 ]

Thoughts???

dataflow · Accepted Answer

You just need to use loc with the column labels you want to keep (e.g. df.loc[:, columns]):

import pandas as pd
import numpy as np

cols = ('a','b','c','d','e','f','g','h')
foo = pd.DataFrame(
[[-1,-5,3,0,-5,8,1,2]],
columns = cols)

bar = pd.DataFrame(np.random.randint(0, 10, (3, len(cols))), columns=cols)

print(foo)

   a  b  c  d  e  f  g  h
0 -1 -5  3  0 -5  8  1  2

print(bar)

   a  b  c  d  e  f  g  h
0  7  9  2  9  5  3  2  9
1  5  7  4  1  5  1  4  0
2  4  9  1  3  3  7  0  2


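# Boolean Series indexed by column name: True where the first row of foo is >= 0
columns_boolean = foo.iloc[0] >= 0
# Column labels for which the condition holds
columns_to_keep = foo.columns[columns_boolean]

print(bar.loc[:, columns_to_keep])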


   c  d  f  g  h
0  2  9  3  2  9
1  4  1  1  4  0
2  1  3  7  0  2
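
Since the title asks about values within some range, the same pattern can be extended with a two-sided condition. The following is only a sketch under assumptions: the bounds 0 and 5 are arbitrary illustrative values, the names in_range and columns_in_range are made up for this example, and bar is hard-coded with the values printed above so the result is reproducible.

```python
import pandas as pd

cols = list("abcdefgh")
foo = pd.DataFrame([[-1, -5, 3, 0, -5, 8, 1, 2]], columns=cols)

# Values copied from the printed example of bar above (instead of random ones).
bar = pd.DataFrame(
    [[7, 9, 2, 9, 5, 3, 2, 9],
     [5, 7, 4, 1, 5, 1, 4, 0],
     [4, 9, 1, 3, 3, 7, 0, 2]],
    columns=cols,
)

row = foo.iloc[0]              # first row as a Series indexed by column name
in_range = row.between(0, 5)   # True where 0 <= value <= 5 (both bounds included)

# Illustrative bounds only; (row >= 0) & (row <= 5) builds the same mask.
columns_in_range = foo.columns[in_range.values]

subset = bar.loc[:, columns_in_range]
print(list(columns_in_range))
print(subset)
```

With these bounds, column f (value 8) drops out, so only c, d, g and h are kept.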

Alternatively, if your other dataframe doesn't have the same column names but has the same number of columns, you can still use loc: pass in the boolean array of which columns to keep, so that the selection is made by position rather than by label:

bar.loc[:, columns_boolean.values]



   c  d  f  g  h
0  2  9  3  2  9
1  4  1  1  4  0
2  1  3  7  0  2
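
To make the positional variant concrete, here is a small sketch with a frame whose column names really do differ. The frame other, its column names s to z, and its values are hypothetical and chosen only for illustration. The length check is an extra safeguard that was not part of the approach above.

```python
import pandas as pd

foo = pd.DataFrame([[-1, -5, 3, 0, -5, 8, 1, 2]], columns=list("abcdefgh"))

# Hypothetical frame: same number of columns as foo, unrelated column names.
other = pd.DataFrame(
    [[7, 9, 2, 9, 5, 3, 2, 9],
     [5, 7, 4, 1, 5, 1, 4, 0]],
    columns=["s", "t", "u", "v", "w", "x", "y", "z"],
)

# .values strips the column labels, leaving a plain NumPy boolean array,
# so the mask is applied by position instead of being matched by label.
mask = (foo.iloc[0] >= 0).values

# Guard against a mask that does not line up with the target frame.
if len(mask) != other.shape[1]:
    raise ValueError("mask length must match the number of columns")

picked = other.loc[:, mask]   # keeps the columns at positions 2, 3, 5, 6 and 7
print(picked)
```

Passing the boolean Series itself (without .values) would make pandas try to match its labels a to h against the columns of other, which is why the bare array is used when the names differ.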
