beginner_
beginner_

Reputation: 7622

pandas Series: find column by value

I have a pandas series and a value. I want to find which column contains my value and get the columns name. The value is guaranteed to appear exactly once.

Column A| Column B | Column C
   rtz       xyz          abc

And my value is abc I want Column C as return value.

How can I do that?

EDIT:

I have a series so anything using df.columns etc. will not work as series has no columns attribute.

Upvotes: 1

Views: 1031

Answers (3)

beginner_
beginner_

Reputation: 7622

So while waiting for answers I came up with my own seemingly naive answer:

for col_name, value in s.items():
    if value == 'abc':
        //do stuff
        break

Simply iterating the series in a for loop. Yeah I'm a bit old-school not yet much accustomed to functional programming paradigms. therefore I also tend to ask for potentially better solutions.

I then had a feeling after applying jezraels solution that my script got slower. (the specific code is called a lot) I expected a speed up over simple looping but turns out I was very wrong:

s = pd.Series(['rtz','abc','xyz'], index=list('ABC'))  

%%timeit
next(iter(s[s == 'abc'].index), 'no match')
224 µs ± 3.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
for col_name, value in s.items():
    if value == 'abc':
        col_name
        break
15.9 µs ± 159 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Simple for loop is over 10x times faster for this specific use-case (small series, eg around 20 items in my use-case). That is pretty puzzling actually. So if performance matters and you have a tiny series I guess this is the way to go.

Upvotes: 0

jezrael
jezrael

Reputation: 862911

Use numpy.where for positions of matched values, indexing and then use next with iter trick for return first matched value, it working also if no match:

i, c = np.where(df == 'abc')
print (c)
[2]

print (next(iter(df.columns[c]), 'no match'))
Column C

i, c = np.where(df == 'abc1')
print (c)
[]

print (next(iter(df.columns[c]), 'no match'))
no match

EDIT: Solution with Series:

s = pd.Series(['rtz','xyz','abc'], index=list('ABC'))
    
print (next(iter(s[s == 'abc'].index), 'no match'))
C

print (next(iter(s[s == 'abc1'].index), 'no match'))
no match

Alternative:

print (next(iter(s.index[s == 'abc']), 'no match'))
C

print (next(iter(s.index[s == 'abc1']), 'no match'))
no match

Upvotes: 4

wwnde
wwnde

Reputation: 26676

Lets try select string value. Using the .iloc accessor extract the column array/list and call the element in list using [index]

(df.columns[(df=='abc').iloc[0]])[0]

Upvotes: 1

Related Questions