irahorecka

Reputation: 1807

(Pandas) correct lambda expression to sort column by value @ index position 1

I am trying to sort the SrcWell column by the character at index position 1 of each value (the well number). I understand that sort_values accepts a key keyword argument that behaves similarly to key in sorted, but I get a ValueError when I use it. Here is an example CSV file, which I load as a pandas DataFrame:

SrcPlate            SrcWell
PS000000123456      A4
PS000000123456      B7
PS000000123456      A7
PS000000123456      H6
PS000000123456      G6
PS000000123456      F6

And a small script to sort SrcWell by its numerical values:

import pandas as pd

worklist = pd.read_csv('worklist.csv')
print(worklist.sort_values(by="SrcWell", key=lambda x: int(x[1])))

>>> [...] ValueError: invalid literal for int() with base 10: 'B7'

Upvotes: 1

Views: 837

Answers (1)

Scott Boston

Reputation: 153500

Try using the .str accessor and slicing, which takes the character at position 1 of every string in the column:

df.sort_values(by="SrcWell", key=lambda x: x.str[1])

Output:

         SrcPlate SrcWell
0  PS000000123456      A4
3  PS000000123456      H6
4  PS000000123456      G6
5  PS000000123456      F6
1  PS000000123456      B7
2  PS000000123456      A7
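To see why the lambda in the question raises, it may help to compare the two lookups directly. The sketch below is illustrative only: it rebuilds the SrcWell column by hand (the variable names are assumptions, not from the original script) and shows that plain indexing on a Series with the default integer index is a row-label lookup, while .str[1] slices each string.

```python
import pandas as pd

# The SrcWell column from the question, rebuilt by hand for illustration.
wells = pd.Series(["A4", "B7", "A7", "H6", "G6", "F6"], name="SrcWell")

# Plain indexing on the Series looks up the row labelled 1,
# which is one whole string; int() then fails on it.
second_row = wells[1]

# The .str accessor slices every string, returning a Series of characters.
second_chars = wells.str[1]

print(second_row)
print(second_chars.tolist())
```

Because key is called once with the entire column, the first form is what int() received in the question.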

As @Ben.T points out, the documentation for the key parameter states:

key : callable, optional
Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently.
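Note that .str[1] compares single characters as strings, which is enough for the single-digit wells shown here. If the wells could have multi-digit numbers (for example A10), a vectorized key that converts to integers may be safer. The following is a minimal sketch under some assumptions: the data is inlined as comma-separated text (the real file's delimiter is not known), well_number is a hypothetical helper name, and kind="mergesort" is chosen only to keep ties in their original order.

```python
import io
import pandas as pd

# Sample rows from the question, inlined so the sketch is self-contained.
# Assumption: comma-separated; adjust sep= for the real worklist.csv.
csv_text = """SrcPlate,SrcWell
PS000000123456,A4
PS000000123456,B7
PS000000123456,A7
PS000000123456,H6
PS000000123456,G6
PS000000123456,F6
"""
worklist = pd.read_csv(io.StringIO(csv_text))


def well_number(wells: pd.Series) -> pd.Series:
    # Hypothetical helper: receives the whole column as a Series and returns
    # a Series of the same shape, as the key parameter requires.
    # Strip stray whitespace, drop the row letter, convert the rest to int.
    return wells.str.strip().str[1:].astype(int)


# mergesort is stable, so wells with equal numbers keep their input order.
result = worklist.sort_values(by="SrcWell", key=well_number, kind="mergesort")
print(result)
```

The key argument requires pandas 1.1.0 or later.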

Upvotes: 2
