JuanMacD

Reputation: 181

How to separate a list in a column by pairs to generate a list of lists

I have a column that looks like this:

             ID    len                                        range_cover
0    A0A075B734  347.0                                [36, 134, 136, 283]
1    A0A087X1C5  515.0                                [22, 328, 347, 514]
2    A0A1B0GTQ1  446.0                                [22, 116, 168, 496]
3    A0A1W2PN81  502.0    [22, 46, 48, 117, 119, 149, 152, 160, 162, 230]
4        Q494W8  412.0  [22, 36, 80, 84, 88, 91, 96, 128, 131, 139, 14...
..          ...    ...                                                ...
165      Q9UQ90  795.0                                         [303, 564]
166      Q9Y210  931.0                                           [0, 930]


I want to split the lists in range_cover into pairs of numbers, but I don't know how to do it. Every list has an even number of elements, so this is possible for all of them.
Here's the expected output:

                                                 range_cover
                                     [[36, 134], [136, 283]]
                                     [[22, 328], [347, 514]]
                                     [[22, 116], [168, 496]]
   [[22, 46], [48, 117], [119, 149], [152, 160], [162, 230]]
[[22, 36], [80, 84], [88, 91], [96, 128], [131, 139], [14...
...
                                                   [303, 564]
                                                     [0, 930]

I thought about using zip, something like:

df2['tup'] = df2.apply(lambda x: list(zip(x.range_cover)), axis=1)

But I don't know how to tell the function to 'zip' the first number with the second one, the third with the fourth, and so on. I also thought about using .replace, but I would need it to replace a character after every two numbers.

Any help or advice is welcome! cheers

Upvotes: 0

Views: 94

Answers (2)

sitting_duck

Reputation: 3720

Via transform() and list comprehension:

df['range_cover'].transform(lambda x: [x[i:i+2] for i in range(0,len(x),2)])

0                              [[36, 134], [136, 283]]
1                              [[22, 328], [347, 514]]
2                              [[22, 116], [168, 496]]
3    [[22, 46], [48, 117], [119, 149], [152, 160], ...
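
For what it's worth, the zip idea from the question also works once the list is sliced into its even and odd positions. Below is a minimal sketch on plain Python lists; the helper names pair_up and pair_up_zip are made up for illustration, and both assume every list has an even length, as stated in the question.

```python
def pair_up(values):
    """Split a flat, even-length list into consecutive two-item lists."""
    # Slicing approach: step through the list two items at a time.
    return [values[i:i + 2] for i in range(0, len(values), 2)]


def pair_up_zip(values):
    """Same result via zip: even-indexed items paired with odd-indexed items."""
    return [list(pair) for pair in zip(values[::2], values[1::2])]


flat = [22, 46, 48, 117, 119, 149]
print(pair_up(flat))      # [[22, 46], [48, 117], [119, 149]]
print(pair_up_zip(flat))  # [[22, 46], [48, 117], [119, 149]]
```

Either helper can be passed to the column directly, e.g. df['range_cover'].apply(pair_up). Note that with an odd-length list the slicing version keeps a trailing one-item list, while the zip version silently drops the last element.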

Upvotes: 1

Rob Raymond

Reputation: 31146

numpy reshape() is a simple solution for this

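import io
import json

import numpy as np
import pandas as pd

# Rebuild the sample frame from the question; columns are separated by 2+ spaces
df = pd.read_csv(io.StringIO("""             ID    len                                        range_cover
0    A0A075B734  347.0                                [36, 134, 136, 283]
1    A0A087X1C5  515.0                                [22, 328, 347, 514]
2    A0A1B0GTQ1  446.0                                [22, 116, 168, 496]
3    A0A1W2PN81  502.0    [22, 46, 48, 117, 119, 149, 152, 160, 162, 230]"""), sep=r"\s\s+", engine="python")

# The lists are read as strings, so parse them into real Python lists first
df["range_cover"] = df["range_cover"].apply(json.loads)
# Reshape each flat list into rows of two; each result is a 2-D NumPy array
df["range_cover"].apply(lambda l: np.array(l).reshape(len(l)//2, 2))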
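
One thing to be aware of: reshape() returns a NumPy array for each row, not a nested Python list. If plain lists of lists are wanted, as in the expected output, a sketch along these lines should do it (to_pairs is a hypothetical helper name, not part of NumPy or pandas):

```python
import numpy as np


def to_pairs(flat):
    # reshape(-1, 2) lets NumPy infer the number of rows; it raises
    # ValueError if the list length is odd.
    # tolist() converts the 2-D array back into nested Python lists.
    return np.array(flat).reshape(-1, 2).tolist()


print(to_pairs([22, 328, 347, 514]))  # [[22, 328], [347, 514]]
```

It can be applied the same way as the lambda above, e.g. df["range_cover"].apply(to_pairs).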

Upvotes: 3
