Reputation: 37
I'm trying to define a function that creates a column with the phone numbers cleaned down to just their ten digits (area code plus number). The DataFrame:
PNum1
0 18888888888
1 1999999999
2 +++(112)31243134
I have all the individual steps working and have even stored the results in a dictionary and a DataFrame:
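import pandas as pd

def GetGoodNumbers(col):
    column = col.copy()
    # Strip every non-digit character
    Cleaned = column.replace(r'\D+', '', regex=True)
    # Number of digits left, and the leading digit
    NumberCount = Cleaned.astype(str).str.len()
    FirstNumber = Cleaned.astype(str).str[0]
    SummaryNum = {'Number': Cleaned, 'First': FirstNumber, 'Count': NumberCount}
    df = pd.DataFrame(data=SummaryNum)
    DecentNumbers = []  # placeholder for the final ten-digit numbers (not filled in yet)
    return df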
which returns:
   Count First       Number
0     11     1  18888888888
1     10     1   1999999999
2     11     1  11231243134
How can I loop through the DataFrame column and return a new column that will:
- remove all non-digits
- get the length (which will usually be 10 or 11)
- if the length is 11, return the rightmost 10 digits
The desired output:
number
1231243134
1999999999
8888888888
Upvotes: 1
Views: 1107
Reputation: 402413
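You can remove every non-digit and then slice the last 10 digits. Pass regex=True explicitly, since str.replace treats the pattern as a literal string by default in pandas 2.0 and later.
df.PNum1.str.replace(r'\D+', '', regex=True).str[-10:]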
0 8888888888
1 1999999999
2 1231243134
Name: PNum1, dtype: object
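If you want this wrapped in a function that also follows the length rule from the question, here is a minimal sketch. The function name clean_phone_numbers and the output column name number are just placeholders I picked, and treating anything that is not 10 or 11 digits as missing is my own assumption, not something the question specifies.

```python
import pandas as pd

def clean_phone_numbers(col: pd.Series) -> pd.Series:
    """Return the last ten digits of each entry (hypothetical helper)."""
    # Remove every non-digit character
    digits = col.astype(str).str.replace(r"\D+", "", regex=True)
    lengths = digits.str.len()
    # Rightmost ten digits: unchanged for 10-digit numbers,
    # drops the leading digit for 11-digit numbers
    last_ten = digits.str[-10:]
    # Assumption: any other digit count is invalid and becomes NaN
    return last_ten.where(lengths.isin([10, 11]))

df = pd.DataFrame({"PNum1": ["18888888888", "1999999999", "+++(112)31243134"]})
df["number"] = clean_phone_numbers(df["PNum1"])
print(df)
```

No explicit loop is needed because the .str accessor applies each step to the whole column at once. If you would rather keep short or long numbers as they are, drop the final where call.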
Upvotes: 1