hello

Reputation: 67

Check if the length of values in a specific column exceeds 11

I'm trying to write a script (see the code below) that checks whether any value in the 'Mobile Phone Number' column is longer than 11 digits. If one is, the script should print the index of that value and delete the entire record at that index from the data frame. However, this line does not behave as I expect: if len(data['Mobile Phone Number']) > 11: is never true, even though the condition should be met. There are two phone numbers longer than 11 digits that I need to delete.

import pandas as pd

data = {
    'Name': [
        'Tom',
        'Joseph',
        'Krish',
        'John'
    ],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491
    ]
}

df = pd.DataFrame(data)

print(df)

for i in range(len(data)):
    if len(data['Mobile Phone Number']) > 11:
        print('Number at index ', i, 'is incorrect')
        data = data.drop(['Mobile Phone Number'][i], axis=1)
    else:
        print('\nNo length of > 11 found in Mobile Phone Numbers')

And this is the output of the above code:

     Name  Mobile Phone Number
0     Tom          13805647925
1  Joseph      145792860326480
2   Krish      184629730518469
3    John          18218706491

No length of > 11 found in Mobile Phone Numbers

No length of > 11 found in Mobile Phone Numbers

Upvotes: 3

Views: 2817

Answers (4)

I'mahdi

Reputation: 24049

For the following DataFrame as input:

df = pd.DataFrame({
    'Name': [
        'Tom',
        'Joseph',
        'Krish',
        'John'
    ],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491
    ]
})

#      Name  Mobile Phone Number
# 0     Tom          13805647925
# 1  Joseph      145792860326480
# 2   Krish      184629730518469
# 3    John          18218706491

You can try this:

df = df[df['Mobile Phone Number'].apply(lambda x: len(str(x)) <= 11)]
df

This gives the following output:

    Name    Mobile Phone Number
0   Tom     13805647925
3   John    18218706491

Edit: if you want to show an error when any number is longer than 11 digits, you can try this:

if any(df['Mobile Phone Number'].apply(lambda x: len(str(x)) > 11)):
    print("Error! you have number > 11")

Second edit: if you want to show an error message and then remove the numbers longer than 11 digits, use the code below:

df = pd.DataFrame({
    'Name': [
        'Tom',
        'Joseph',
        'Krish',
        'John'
    ],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491
    ]
})

print(df)

if any(df['Mobile Phone Number'].apply(lambda x: len(str(x)) > 11)):
    print("\n Error! you have number > 11 \n")
    df = df[df['Mobile Phone Number'].apply(lambda x: len(str(x)) <= 11)]

print(df)

Output:

     Name  Mobile Phone Number
0     Tom          13805647925
1  Joseph      145792860326480
2   Krish      184629730518469
3    John          18218706491


 Error! you have number > 11 


   Name  Mobile Phone Number
0   Tom          13805647925
3  John          18218706491
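As background on why the check in the question never fires, here is a minimal sketch using only the data from the question (the names n_keys, n_rows and digit_counts are just illustrative). len() applied to the dict or to the whole column counts keys or rows, not digits, so the digit count has to be taken per value after converting it to str:

```python
data = {
    'Name': ['Tom', 'Joseph', 'Krish', 'John'],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491,
    ],
}

# len() of the dict counts its keys, so range(len(data)) only yields 0 and 1.
n_keys = len(data)

# len() of the column counts its rows; 4 is never greater than 11.
n_rows = len(data['Mobile Phone Number'])

# The number of digits must be measured per value, after converting to str.
digit_counts = [len(str(n)) for n in data['Mobile Phone Number']]

print(n_keys, n_rows, digit_counts)  # 2 4 [11, 15, 15, 11]
```

This is also why the "No length of > 11 found" message is printed exactly twice in the question's output.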

Upvotes: 2

accdias

Reputation: 5372

This is a combination of previous answers to give the results expected by OP. Credit goes to the other authors.

import pandas as pd

df = pd.DataFrame({
    'Name': [
        'Tom',
        'Joseph',
        'Krish',
        'John'
    ],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491
    ]
})

invalid_phones = df['Mobile Phone Number'].astype(str).apply(len).gt(11)

if invalid_phones.any():
    for _ in df[invalid_phones].index:
        print(f'Number at index {_} is incorrect')
else:
    print('No length of > 11 found in Mobile Phone Numbers')

The code above will result in the following output:

Number at index 1 is incorrect
Number at index 2 is incorrect

To remove the invalid phones from df you can use:

df = df.loc[df.index.difference(df[invalid_phones].index)]

or:

df = df.drop(df[invalid_phones].index)  

or, modifying df in place:

df.drop(df[invalid_phones].index, inplace=True)  

That will result in the following:

print(df)
   Name  Mobile Phone Number
0   Tom          13805647925
3  John          18218706491
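If this check is needed in more than one place, the reporting and dropping steps can be wrapped together. The following is only a sketch: drop_long_numbers and its max_len parameter are hypothetical names, not part of pandas, and it assumes the column holds non-negative integers.

```python
import pandas as pd


def drop_long_numbers(df, column, max_len=11):
    """Hypothetical helper: report and drop rows with more than max_len digits."""
    # Boolean mask: True where the string form of the value is too long.
    too_long = df[column].astype(str).str.len().gt(max_len)
    bad_index = df[too_long].index.tolist()
    if bad_index:
        for i in bad_index:
            print(f'Number at index {i} is incorrect')
    else:
        print(f'No length of > {max_len} found in {column}')
    # Return a filtered copy plus the offending index labels.
    return df[~too_long], bad_index


df = pd.DataFrame({
    'Name': ['Tom', 'Joseph', 'Krish', 'John'],
    'Mobile Phone Number': [
        13805647925,
        145792860326480,
        184629730518469,
        18218706491,
    ],
})

cleaned, bad_index = drop_long_numbers(df, 'Mobile Phone Number')
print(cleaned)
```

Returning a new frame instead of using inplace=True leaves the original df untouched, which makes it easier to inspect the dropped rows afterwards.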

Upvotes: 1

ashkangh

Reputation: 1624

You can try this:

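# Index labels of the rows whose number has more than 11 digits
mobile_longer_than_11 = df[
    df["Mobile Phone Number"].astype(str).apply(len).gt(11)
].index

# Keep only the labels that are not in that set (.loc needs an Index or list, not a set)
print(df.loc[df.index.difference(mobile_longer_than_11)])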

Output:

    Name    Mobile Phone Number
0   Tom     13805647925
3   John    18218706491

Upvotes: 1

Alexander Volkovsky

Reputation: 2918

I believe in your case you can just compare the numbers.

mask = df['Mobile Phone Number'] >= 1e11

if mask.any():
    for i in df[mask].index:
        print('Number at index ', i, 'is incorrect')
else:
    print('\nNo length of > 11 found in Mobile Phone Numbers')
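To see why the numeric comparison matches the digit count, here is a small sketch in plain Python, assuming the values are non-negative integers (it would not apply to numbers stored as strings with leading zeros or a '+' prefix). The smallest 12-digit integer is 10 ** 11, so "more than 11 digits" and ">= 10 ** 11" select the same values; the name limit is just illustrative.

```python
numbers = [13805647925, 145792860326480, 184629730518469, 18218706491]

limit = 10 ** 11  # smallest integer with 12 digits

# Flag by value, as in the mask above (using an int instead of the float 1e11).
by_value = [n >= limit for n in numbers]

# Flag by digit count, as in the string-based answers.
by_digits = [len(str(n)) > 11 for n in numbers]

print(by_value)               # [False, True, True, False]
print(by_value == by_digits)  # True

# Boundary check: the largest 11-digit number passes, the next integer does not.
print(99_999_999_999 >= limit, 100_000_000_000 >= limit)  # False True
```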

Upvotes: 0
