Manolo Dominguez Becerra

Reputation: 1363

Sort list of string numbers that are in a column of a data frame

I have a data frame in which one column contains lists of numeric strings:

                 Col1
                ['1']
                ['1']
        ['1','3','4']
['2','3','1','4','5']

How can I sort these numbers? I have tried to adapt the answer given here.

I would like to have a sorted list of integers instead of strings.

Upvotes: 0

Views: 392

Answers (3)

ThomasIsCoding

Reputation: 102241

You can try the following, which sorts each list by integer value (the elements themselves remain strings):

df["Col1_sorted"] = list(map(lambda x: sorted(x, key=int), df["Col1"]))

which gives

              Col1      Col1_sorted
0              [1]              [1]
1              [1]              [1]
2        [1, 3, 4]        [1, 3, 4]
3  [2, 3, 1, 4, 5]  [1, 2, 3, 4, 5]
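
Since the question asks for integers rather than strings, here is a minimal sketch contrasting the two behaviours. It assumes the same sample data as in the question; the names as_strings and as_ints are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"Col1": [["1"], ["1"], ["1", "3", "4"], ["2", "3", "1", "4", "5"]]})

# key=int orders the elements by numeric value, but they remain strings
as_strings = [sorted(row, key=int) for row in df["Col1"]]

# Converting first yields sorted lists of integers
as_ints = [sorted(map(int, row)) for row in df["Col1"]]

print(as_strings[3])
print(as_ints[3])
```

The DataFrame display does not show quotes around list elements, so the two results look identical when printed as a column; checking type(as_ints[3][0]) makes the difference visible.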

Upvotes: 0

cottontail

Reputation: 23261

There are a few ways to do this:

1. Apply a function that does the transformation you want to every row in the column

This is the simplest and probably the most efficient approach: loop over the rows, then convert and sort each list. Note that @Mayank Porwal's answer uses the same idea via apply.

import pandas as pd

df = pd.DataFrame({'Col1':[['1'],['1'],['1','3','4'],['2','3','1','4','5']]})
df['sorted'] = [sorted(map(int, row)) for row in df['Col1'].tolist()]

2. Use pandas-native data manipulation

Another solution is to explode the values out of the lists, sort them and aggregate them back into lists.

df['sorted'] = (
    df['Col1'].explode()
    .astype(int)
    .sort_values()
    .groupby(level=0).agg(list)
)

Given the input dataframe in the OP, both solutions give the following output:

              Col1           sorted
0              [1]              [1]
1              [1]              [1]
2        [1, 3, 4]        [1, 3, 4]
3  [2, 3, 1, 4, 5]  [1, 2, 3, 4, 5]

3. There's also a way using numpy.sort, but it is considerably less readable, so it is omitted here.
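
As a quick sanity check on the first two approaches (the list comprehension and the explode pipeline), the sketch below runs both on the sample data and compares the results. It assumes every list is non-empty; an empty list would explode to a missing value, which the integer cast is not expected to handle. The variable names are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Col1": [["1"], ["1"], ["1", "3", "4"], ["2", "3", "1", "4", "5"]]})

# Approach 1: convert and sort each list in plain Python
by_comprehension = [sorted(map(int, row)) for row in df["Col1"]]

# Approach 2: explode, cast, sort, then regroup by the original row index
by_explode = (
    df["Col1"].explode()
    .astype(int)
    .sort_values()
    .groupby(level=0).agg(list)
)

# Compare the two results row by row
both_agree = by_comprehension == by_explode.tolist()
print(both_agree)
```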

Upvotes: 2

Mayank Porwal

Reputation: 34086

Use:

In [599]: df['Col1'] = df.Col1.apply(lambda x: sorted(map(int, x)))
In [600]: df
Out[600]: 
              Col1
0              [1]
1              [1]
2        [1, 3, 4]
3  [1, 2, 3, 4, 5]

Upvotes: 3
