Reputation: 163
I have a dataset with a dedicated column for phone numbers. My task is to validate them, since there are invalid entries such as "9999999999", "0123456789", and many others of a similar nature.
I thought of tackling this by looking up the carrier name for each number, so that entries like the ones above can be discarded because they will not have a carrier name.
I came across a package called phonenumbers and used the code below:
import phonenumbers
from phonenumbers import carrier
ro_number = phonenumbers.parse("+91xxxxxxxxxx") # number is redacted purposely
carrier.name_for_number(ro_number, "en")
This gave the output 'BSNL MOBILE'.
I want to run this on the entire column of the dataframe, creating a new column that records the carrier name for each number.
I tried to use a for loop:
for i in df['phone_number']:
    ro_number = phonenumbers.parse(i)
    carrier.name_for_number(ro_number, "en")
But I got the error below:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-80-af01b9d8c9ef> in <module>
1 for i in merged_Data['SELLER_NUMBER']:
----> 2 ro_number = phonenumbers.parse(i)
3 carrier.name_for_number(ro_number, "en")
~\anaconda3\lib\site-packages\phonenumbers\phonenumberutil.py in parse(number, region, keep_raw_input, numobj, _check_region)
2834 raise NumberParseException(NumberParseException.NOT_A_NUMBER,
2835 "The phone number supplied was None.")
-> 2836 elif len(number) > _MAX_INPUT_STRING_LENGTH:
2837 raise NumberParseException(NumberParseException.TOO_LONG,
2838 "The string supplied was too long to parse.")
TypeError: object of type 'int' has no len()
I am not sure whether this is the right way to iterate over an entire column. Any help would be much appreciated.
Upvotes: 1
Views: 1244
Reputation: 17166
I made two modifications to the code.
Code
import phonenumbers
from phonenumbers import carrier
def valid_number(number, region="US"):
    ''' check validity of phone numbers (default to US region)
    Used default region as US since some numbers did not work using None
    '''
    # Parsing String to Phone number
    phone_number = phonenumbers.parse(number, region)
    # Validating a phone number (i.e. it's in an assigned exchange)
    return phonenumbers.is_valid_number(phone_number)
Test With List
data = ["+442083661177", "+123456789", "18004444444"]
for i in data:
    print(i, valid_number(i))
# Output
+442083661177 True
+123456789 False
18004444444 True # note: this number doesn't work with default region = None
Test With DataFrame
import pandas as pd
df = pd.DataFrame({"phone_number": data})
df['valid'] = df['phone_number'].apply(valid_number)
# Resulting df
    phone_number  valid
0  +442083661177   True
1     +123456789  False
2    18004444444   True
Upvotes: 1
Reputation:
TypeError: object of type 'int' has no len()
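That error means len() was called on an int: as the traceback shows, phonenumbers.parse calls len(number), and your column holds integers rather than strings. Convert each value to a string before parsing, so that a call like this succeeds: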
len(str(x))
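As a rough sketch of that conversion applied to a whole column, the helper below turns a raw cell value into a string that is safe to hand to phonenumbers.parse. The name to_phone_string is hypothetical, and the float handling assumes the column may have been read as floats (pandas stores an integer column with missing values as float64 by default).

```python
import math


def to_phone_string(value):
    """Convert a raw cell value to a string suitable for parsing, or None.

    `to_phone_string` is an illustrative helper, not a library function.
    """
    if value is None:
        return None
    if isinstance(value, float):
        if math.isnan(value):
            # Missing value in a numeric column
            return None
        if value.is_integer():
            # 9999999999.0 -> 9999999999, so no trailing ".0" in the string
            value = int(value)
    text = str(value).strip()
    # Treat empty or whitespace-only cells as missing
    return text or None


raw = [9999999999, 9999999999.0, " +442083661177 ", float("nan"), ""]
cleaned = [to_phone_string(v) for v in raw]
```

Each non-None result can then be passed to phonenumbers.parse without triggering the TypeError. Keep in mind that a number stored as an int has already lost any leading zero, so "0123456789" read as an integer comes back as "123456789".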
Upvotes: 1