Natalia

Reputation: 71

How to use a function from a Python class in Pandas?

I have a Python class with several methods, one of which is public.

How can I apply this method to filter rows in Pandas?

I mean something like this:

class Math:
    @staticmethod
    def isTrue(value):
        return True

Pandas rule:

df[df["name"].apply(Math.isTrue)]

Rows for which the method returns True for the value in the name column should appear in the resulting DataFrame.

Data is:

Number  Name    Country
1       Vasila  US
1212    oLGA    AU
6       Fors    RE

I need to keep only the rows where Number consists of a repeated pair of digits, like 1212, using a custom method from my class that relies on a regex.

Result should be:

Number  Name    Country
1212    oLGA    AU

Upvotes: 1

Views: 62

Answers (1)

g_dzt

Reputation: 1478

import re

class Matcher:
    # Raw string so that \d reaches the regex engine unchanged
    pattern = re.compile(r'^(?P<num_pair>\d\d)(?(num_pair)(?P=num_pair))$')

    @classmethod
    def has_num_pair(cls, n: int) -> bool:
        # True when the decimal form of n is a pair of digits repeated once
        return cls.pattern.match(str(n)) is not None

df[df['Number'].apply(Matcher.has_num_pair)]
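For reference, here is a self-contained sketch that runs the filter end to end. The DataFrame is rebuilt by hand from the sample data in the question, and the `mask` and `result` names are only illustrative:

```python
import re

import pandas as pd


class Matcher:
    # Same pattern as in the answer, written as a raw string
    pattern = re.compile(r'^(?P<num_pair>\d\d)(?(num_pair)(?P=num_pair))$')

    @classmethod
    def has_num_pair(cls, n) -> bool:
        # str() also works for the NumPy integers that pandas passes in
        return cls.pattern.match(str(n)) is not None


# Sample data copied from the question
df = pd.DataFrame({
    'Number': [1, 1212, 6],
    'Name': ['Vasila', 'oLGA', 'Fors'],
    'Country': ['US', 'AU', 'RE'],
})

# apply() calls the method once per value and returns a boolean Series
mask = df['Number'].apply(Matcher.has_num_pair)
result = df[mask]
print(result)
```

Because `has_num_pair` is a classmethod, `Matcher.has_num_pair` is already bound to the class, so it can be handed to `apply` like any other one-argument function.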

Regex explanation:

pattern = re.compile(
    r'^'                    # at the beginning of the string
    r'(?P<num_pair>'        # create a capturing group named "num_pair" ...
        r'\d\d)'            # ... that captures two digits
    r'(?(num_pair)'         # if the group "num_pair" captured something ...
        r'(?P=num_pair))'   # ... try to match the captured content again
    r'$'                    # the string must end after that
)

This pattern will match numbers that are made of a repeated pair of digits, like 1212, 9898 or 3535, but it will not match numbers that include such a pair along with other digits, like 14343 for example. If you want to match those too, change your regex as such:

pattern = re.compile(r'.*(?P<num_pair>\d\d)(?(num_pair)(?P=num_pair)).*')

This variant will also match 14343, 767689 and so on.
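To see the difference between the two behaviours side by side, here is a small sketch using only the `re` module. It is a simplification rather than the exact patterns above: since the named group always captures when it matches, the conditional is replaced by a plain backreference, and the loose version uses `search` instead of wrapping the pattern in `.*`. The names `strict`, `loose` and `samples` are made up for this example:

```python
import re

# Whole number must be one repeated pair of digits
strict = re.compile(r'^(?P<num_pair>\d\d)(?P=num_pair)$')
# A repeated pair may appear anywhere in the number
loose = re.compile(r'(?P<num_pair>\d\d)(?P=num_pair)')

samples = [1212, 9898, 14343, 767689, 1, 6]

strict_hits = [n for n in samples if strict.match(str(n))]
loose_hits = [n for n in samples if loose.search(str(n))]

print(strict_hits)  # [1212, 9898]
print(loose_hits)   # [1212, 9898, 14343, 767689]
```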

Upvotes: 1
