khouzam

Reputation: 273

Filter by the number of digits pandas

I have a DataFrame with a single column of numbers ranging from 1 to 10000000000.

    df1 =
    165437890
    2321434256
    324334567
    4326457
    243567869
    234567843
    ......
    7654356785432
    7654324567543

I want a resulting DataFrame that contains only the 9-digit numbers whose digits are all different from each other. Is this possible? I don't have a clue how to start.

Note: I need to filter out any number that has repeated digits.

For example, 122234543 would be dropped from my DataFrame, since the digit 2 appears 3 times and the digits 3 and 4 appear 2 times each.

Upvotes: 0

Views: 1411

Answers (2)

Marek

Reputation: 708

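# Keep only values with exactly 9 digits, then drop duplicate values
flt = (df.Numbers >= 100000000) & (df.Numbers < 1000000000)
# Pass columns= so the result keeps the 'Numbers' column name
df = pd.DataFrame(df[flt]['Numbers'].unique(), columns=['Numbers'])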

Here Numbers is the name of the column that holds your numbers.
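
As a rough illustration, here is the range check run on a few of the sample values from the question. The column name Numbers is an assumption, since the question does not name the column.

```python
import pandas as pd

# Sample values copied from the question; the column name "Numbers" is assumed.
df = pd.DataFrame({"Numbers": [165437890, 2321434256, 324334567, 4326457,
                               243567869, 234567843, 7654356785432]})

# A positive integer has exactly 9 digits when it lies in [100000000, 999999999].
flt = (df.Numbers >= 100_000_000) & (df.Numbers < 1_000_000_000)
nine_digit = df[flt]

print(nine_digit.Numbers.tolist())
# [165437890, 324334567, 243567869, 234567843]
```

Note that this step only enforces the length. A value such as 324334567 still passes even though it repeats digits, so a separate digit check is needed.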


To keep only the numbers whose own digits are all different from each other:

df.Numbers = df.Numbers.astype('str')
df = df[df.Numbers.str.match(r'^(?!.*(.).*\1)[0-9]{9}$')]
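
To see what the pattern does: the negative lookahead `(?!.*(.).*\1)` rejects any string in which some character occurs twice, and `[0-9]{9}$` then requires exactly nine digits. A small sketch with the standard `re` module (the helper name `all_distinct_9` is made up for illustration):

```python
import re

# Same pattern as above: no character may repeat, and the string is exactly 9 digits.
PATTERN = re.compile(r'^(?!.*(.).*\1)[0-9]{9}$')

def all_distinct_9(s):
    """Return True if s is exactly nine digits with no digit repeated."""
    return PATTERN.match(s) is not None

print(all_distinct_9("165437890"))  # True: nine digits, all different
print(all_distinct_9("122234543"))  # False: 2, 3 and 4 repeat
print(all_distinct_9("1234567"))    # False: only seven digits
```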

Or another solution, based on Igor's answer:

def has_unique_9digits(n):
    s = str(n)
    return len(s) == len(set(s)) == 9
df = df[df.Numbers.apply(has_unique_9digits)]
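
A quick end-to-end check of this version, as a sketch on made-up sample data (again assuming the column is called Numbers):

```python
import pandas as pd

def has_unique_9digits(n):
    s = str(n)
    # Nine characters in total, and nine distinct characters among them.
    return len(s) == len(set(s)) == 9

# Hypothetical sample data; "Numbers" is an assumed column name.
df = pd.DataFrame({"Numbers": [165437890, 122234543, 4326457,
                               123456789, 7654356785432]})

result = df[df.Numbers.apply(has_unique_9digits)]
print(result.Numbers.tolist())
# [165437890, 123456789]
```

Because the check works on the string form, it assumes non-negative integers; a minus sign or decimal point would count as a character.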

Upvotes: 2

Igor Rivin

Reputation: 4864

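def is_good(num):
    # True only for 9-digit numbers whose digits are all distinct
    numstr = str(num)
    return len(numstr) == 9 and len(set(numstr)) == 9

# Apply to the column itself (here the first and only column), not to the whole DataFrame
df1 = df1[df1.iloc[:, 0].apply(is_good)]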

Upvotes: 2
