Reputation: 1293
I've got a list of addresses in a single column, address. How would I go about parsing the phone number and restaurant category into new columns? My dataframe looks like this:
address
0 Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 310-246-1501 Steakhouses
1 Art's Deli 12224 Ventura Blvd. Studio City 818-762-1221 Delis
2 Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 310-472-1211 French Bistro
where I want to get
address | phone_number | category
0 Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles | 310-246-1501 | Steakhouses
1 Art's Deli 12224 Ventura Blvd. Studio City | 818-762-1221 | Delis
2 Bel-Air Hotel 701 Stone Canyon Rd. Bel Air | 310-472-1211 | French Bistro
Does anybody have any suggestions?
Upvotes: 0
Views: 556
Reputation: 42946
You can use str.extract and str.split. For phone_number, extract the pattern: numbers dash numbers dash numbers. For category, split on 3 numbers followed by a space and grab the part after it. We use a positive lookbehind for this, which is (?<=...) in regex.
df['phone_number'] = df['address'].str.extract(r'(\d+-\d+-\d+)')
df['category'] = df['address'].str.split(r'(?<=\d{3})\s').str[-1]
Output
address | phone_number | category
0 Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 310-246-1501 Steakhouses | 310-246-1501 | Steakhouses
1 Art's Deli 12224 Ventura Blvd. Studio City 818-762-1221 Delis | 818-762-1221 | Delis
2 Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 310-472-1211 French Bistro | 310-472-1211 | French Bistro
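To see what the two patterns do on a single string, here is a minimal sketch using Python's standard re module instead of pandas (the variable names row, pieces, phone_number and category are only for illustration). Note that the lookbehind split also fires after the street number, which is why the last piece is the one to keep.

```python
import re

row = "Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 310-472-1211 French Bistro"

# Phone number: digits, dash, digits, dash, digits.
phone_number = re.search(r"(\d+-\d+-\d+)", row).group(1)

# Split on any whitespace character that is preceded by three digits.
# The lookbehind (?<=\d{3}) checks the digits without consuming them,
# so they stay attached to the piece on the left.
pieces = re.split(r"(?<=\d{3})\s", row)

# The split happens after "701" as well as after "1211",
# so the category is the last piece, not the second one.
category = pieces[-1]

print(pieces)
print(phone_number, "|", category)
```

This relies on the category itself never containing three digits followed by a space; if it did, only the part after that point would be kept.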
Upvotes: 1
Reputation: 82795
Try using a regex with str.extract.
Ex:
import pandas as pd

df = pd.DataFrame({'address':["Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 310-246-1501 Steakhouses",
"Art's Deli 12224 Ventura Blvd. Studio City 818-762-1221 Delis",
"Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 310-472-1211 French Bistro"]})
# The three named groups become the column names: the text before the phone number,
# the phone number itself (3-3-4 digits), and the text after it.
df[["address", "phone_number", "category"]] = df["address"].str.extract(r"(?P<address>.*?)(?P<phone_number>\b\d{3}\-\d{3}\-\d{4}\b)(?P<category>.*$)")
print(df)
Output:
address phone_number \
0 Arnie Morton's of Chicago 435 S. La Cienega Bl... 310-246-1501
1 Art's Deli 12224 Ventura Blvd. Studio City 818-762-1221
2 Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 310-472-1211
category
0 Steakhouses
1 Delis
2 French Bistro
Note: This assumes the content of address is always in the order address, phone_number, category.
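One detail worth checking: as written, the pattern keeps the space before the phone number at the end of address and the space after it at the start of category. A possible variant, sketched here with the standard re module so it can be tried on single strings, adds \s* around the phone group; the \s* additions and the helper name split_record are assumptions for illustration, not part of the answer above. The same pattern string should be usable with str.extract.

```python
import re

# Same three named groups, with \s* added around the phone number so the
# surrounding spaces are not captured (an assumption-based tweak).
PATTERN = re.compile(
    r"(?P<address>.*?)\s*"
    r"(?P<phone_number>\b\d{3}-\d{3}-\d{4}\b)\s*"
    r"(?P<category>.*)$"
)

def split_record(text):
    """Return a dict with the three fields, or None if no phone number is found."""
    match = PATTERN.match(text)
    return match.groupdict() if match else None

record = split_record("Art's Deli 12224 Ventura Blvd. Studio City 818-762-1221 Delis")
missing = split_record("Art's Deli 12224 Ventura Blvd. Studio City")

print(record)
print(missing)
```

Rows that break the address, phone_number, category assumption return None here; with str.extract the corresponding cells come back as NaN, so it may be worth checking for missing values after the assignment.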
Upvotes: 3