How to parse a string into a pandas DataFrame

Question

I'm trying to build a self-contained Jupyter notebook that parses a long address string into a pandas dataframe for demonstration purposes. Currently I'm having to highlight the entire string and use pd.read_clipboard:

data = pd.read_clipboard(f,
                  comment='#', 
                  header=None, 
                  names=['address']).values.reshape(-1, 2)

matched_address = pd.DataFrame(data, columns=['addr_zagat', 'addr_fodor'])

I'm wondering if there is an easier way to read the string in directly instead of relying on having something copied to the clipboard. Here are the first few lines of the string for reference:

f = """###################################################################################################
# 
#   There are 112 matches between the tuples.  The Zagat tuple is listed first, 
#   and then its Fodors pair.
#
###################################################################################################

Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 90048 310-246-1501 Steakhouses

Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 90048 310/246-1501 American
########################

Art's Deli 12224 Ventura Blvd. Studio City 91604 818-762-1221 Delis

Art's Delicatessen 12224 Ventura Blvd. Studio City 91604 818/762-1221 American
########################

Bel-Air Hotel 701 Stone Canyon Rd. Bel Air 90077 310-472-1211 Californian

Hotel Bel-Air 701 Stone Canyon Rd. Bel Air 90077 310/472-1211 Californian
########################

Cafe Bizou 14016 Ventura Blvd. Sherman Oaks 91423 818-788-3536 French Bistro

Cafe Bizou 14016 Ventura Blvd. Sherman Oaks 91423 818/788-3536 French
########################"""

Does anybody have any tips as to how to parse this string directly into a pandas dataframe?

I realise there is a related question (Create Pandas DataFrame from a string), but the string there is delimited by semicolons and its format is completely different from the one in my example.
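
One possible approach, sketched here rather than taken from the original thread: wrap the string in io.StringIO so that pd.read_csv can treat it like a file, then reuse the same comment, header and names arguments as the read_clipboard call. The tab separator is an assumption: it relies on the addresses containing no tab characters, so each line lands in a single column. The f below is a shortened copy of the sample string so that the snippet runs on its own. Note that comment='#' would also cut off any address that itself contains a '#'.

```python
import io

import pandas as pd

# Shortened copy of the sample string from the question (two matched pairs).
f = """########################
#
#   The Zagat tuple is listed first,
#   and then its Fodors pair.
#
########################

Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 90048 310-246-1501 Steakhouses

Arnie Morton's of Chicago 435 S. La Cienega Blvd. Los Angeles 90048 310/246-1501 American
########################

Art's Deli 12224 Ventura Blvd. Studio City 91604 818-762-1221 Delis

Art's Delicatessen 12224 Ventura Blvd. Studio City 91604 818/762-1221 American
########################"""

# Read the string as if it were a file. Blank lines and lines starting
# with '#' are skipped; sep="\t" is an assumption that keeps each
# remaining line in one column.
addresses = pd.read_csv(
    io.StringIO(f),
    sep="\t",
    comment="#",
    header=None,
    names=["address"],
)

# Consecutive rows form (Zagat, Fodors) pairs, so reshape to two columns.
data = addresses["address"].to_numpy().reshape(-1, 2)
matched_address = pd.DataFrame(data, columns=["addr_zagat", "addr_fodor"])
```

If the full data might contain tabs or '#' characters inside an address, filtering f.splitlines() by hand and pairing up the remaining lines is probably the safer route.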
