Applying function with regex to pandas dataframe

Question

I have the following excerpt of a dataframe df[2]:

    CAR DRIVER START   PIT 1    PIT 2     PIT 3     PIT 4     PIT 5
0    31    OCO    In  In (2)  C3n (3)  C2n (37)         0         0
1     5    VET    In  In (2)  C3n (3)  C2n (36)         0         0
2    44    HAM    In  In (2)  C3u (4)  C2n (19)  C3n (47)         0
3    55    SAI    In  In (2)  C3n (3)  C2n (32)         0         0
4    14    ALO    In  In (2)  C3n (3)  C2n (39)         0         0

I am trying to change the "PIT.." columns (The 3rd column onwards, by inspection) values, for e.g. from "C3n (3)" to "C3n-3". I have written the following code to achieve this:

df[2].iloc[:, 3:] = df[2].iloc[:, 3:].apply(lambda x: re.compile(r'\w+').findall(str(x))
              [0] + "-" + re.compile(r'\w+').findall(str(x))[1] if len(str(x)) > 1 else 0)

The output from the following code is:

    CAR DRIVER START PIT 1  PIT 2  PIT 3 PIT 4 PIT 5
0    31    OCO    In  0-In  0-C3n  0-C2n   0-0   0-0
1     5    VET    In  0-In  0-C3n  0-C2n   0-0   0-0
2    44    HAM    In  0-In  0-C3n  0-C2n   0-0   0-0
3    55    SAI    In  0-In  0-C3n  0-C2n   0-0   0-0
4    14    ALO    In  0-In  0-C3n  0-C2n   0-0   0-0

,which is incorrect, as all column values are the same. The regex I have implemented in the lambda is correct when tested as an independent function but not when I apply it to the dataframe.

Is this a problem with my dataframe slicing, as that is the only problem I am unsure of? Any help would be appreciated.

Corralien · Accepted Answer

You can use filter to select PIT.. columns and replace like this:

>>> df.filter(like='PIT').replace(r' $(\d+)$', r'-\1', regex=True)

  PIT 1  PIT 2   PIT 3   PIT 4  PIT 5
0  In-2  C3n-3  C2n-37       0      0
1  In-2  C3n-3  C2n-36       0      0
2  In-2  C3u-4  C2n-19  C3n-47      0
3  In-2  C3n-3  C2n-32       0      0
4  In-2  C3n-3  C2n-39       0      0

For the regex, refer to the documentation for the (\d+) and \1

Applying function with regex to pandas dataframe

Answers (1)

Related Questions