Reputation: 461
I would like to split a column into strings and numbers.
import pandas as pd
data = {"name&numb": ["cat 123", "34 dog", "bird 93", "dolphin dof 8 ", "lion cat 76", "tiger 22 animal "]}
df = pd.DataFrame.from_dict(data)
I did this to extract the numbers:
df["number"] = df["name&numb"].str.extract(r'(\d+)')
Now I would like to add one more column that contains only the string part. I do not know whether it matters, but the original data is not in English.
something like:
df["strings"]=df["name&numb"].str.extract('str')
Upvotes: 1
Views: 641
Reputation: 862501
I believe you need Series.str.extract with \D to match non-digit characters, followed by Series.str.strip to remove leading and trailing whitespace:
df["number"] = df["name&numb"].str.extract(r'(\d+)')
df["strings"] = df["name&numb"].str.extract(r'(\D+)', expand=False).str.strip()
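As a quick check of how this pattern behaves, here is a minimal sketch using made-up sample values (they are not from the question's data). It assumes nothing beyond pandas itself: \D matches any character that is not a digit, so non-English letters are captured too, but extract returns only the first run of non-digit characters, so any text after the number is dropped.

```python
import pandas as pd

# Hypothetical sample values for illustration (not from the question's data).
s = pd.Series(["кот 123", "34 собака", "tiger 22 animal "])

# (\D+) captures only the first run of non-digit characters in each value;
# strip() then removes the surrounding whitespace.
first_text = s.str.extract(r"(\D+)", expand=False).str.strip()
print(first_text.tolist())  # ['кот', 'собака', 'tiger']
```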
If you need all of the string parts, including text that comes after the number, one idea is to join the tokens that are not digits:
f = lambda x: ' '.join(y for y in x.split() if not y.isdigit())
df["strings1"] = df["name&numb"].apply(f)
print(df)
          name&numb  number      strings      strings1
0           cat 123     123          cat           cat
1            34 dog      34          dog           dog
2           bird 93      93         bird          bird
3    dolphin dof 8        8  dolphin dof   dolphin dof
4       lion cat 76      76     lion cat      lion cat
5  tiger 22 animal       22        tiger  tiger animal
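A possible vectorized alternative to the apply approach, offered as a sketch rather than as part of the original answer: remove every run of digits with Series.str.replace, collapse the leftover whitespace, and strip. The column name strings2 is a hypothetical name chosen for this example.

```python
import pandas as pd

data = {"name&numb": ["cat 123", "34 dog", "bird 93", "dolphin dof 8 ",
                      "lion cat 76", "tiger 22 animal "]}
df = pd.DataFrame(data)

# "strings2" is a hypothetical column name used only for this sketch.
df["strings2"] = (
    df["name&numb"]
    .str.replace(r"\d+", "", regex=True)   # drop every run of digits
    .str.replace(r"\s+", " ", regex=True)  # collapse repeated whitespace
    .str.strip()                           # trim leading/trailing spaces
)
print(df["strings2"].tolist())
```

Note that this variant also removes digits embedded inside a word (for example "abc123" becomes "abc"), whereas the token-based lambda keeps such a token unchanged because "abc123".isdigit() is False.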
Upvotes: 1