Cyberjoe

Reputation: 3

How to convert values in list of strings into Pandas DataFrame

I would like to convert this list of strings into a Pandas DataFrame with the columns ‘Open’, ‘High’, ‘Low’, ‘Close’, ‘PeriodVolume’ and ‘OpenInterest’, and with ‘Datetime’ as the index. How can I extract the values and create the DataFrame? Thanks for your help!

['RequestId: , Datetime: 5/28/2020 12:00:00 AM, High: 323.44, Low: 315.63, Open: 316.77, Close: 318.25, PeriodVolume: 33449103, OpenInterest: 0',
 'RequestId: , Datetime: 5/27/2020 12:00:00 AM, High: 318.71, Low: 313.09, Open: 316.14, Close: 318.11, PeriodVolume: 28236274, OpenInterest: 0',
 'RequestId: , Datetime: 5/26/2020 12:00:00 AM, High: 324.24, Low: 316.5, Open: 323.5, Close: 316.73, PeriodVolume: 31380454, OpenInterest: 0',
 'RequestId: , Datetime: 5/22/2020 12:00:00 AM, High: 319.23, Low: 315.35, Open: 315.77, Close: 318.89, PeriodVolume: 20450754, OpenInterest: 0']

Upvotes: 0

Views: 130

Answers (2)

Yuni Naveen

Reputation: 1

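import pandas as pd

# One row of example values, in the same order as the column names below.
ls = ['10', '75', '25', '100', '50', '5.5', '5/22/2020 12:00:00 AM']
columns = ['Open', 'High', 'Low', 'Close', 'PeriodVolume', 'OpenInterest', 'Datetime']
# Wrap ls in a list so it becomes a single row rather than seven rows.
dataframe = pd.DataFrame([ls], columns=columns)
print(dataframe)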

Upvotes: 0

Rhys Flook

Reputation: 195

You can use split() and a couple of for loops to put your data into a dictionary, and then pass the dictionary to a DataFrame.

import pandas as pd

# First create the list containing your entries.
entries = [
    'RequestId: , Datetime: 5/28/2020 12:00:00 AM, High: 323.44, Low: 315.63,' \
    ' Open: 316.77, Close: 318.25, PeriodVolume: 33449103, OpenInterest: 0',
    'RequestId: , Datetime: 5/27/2020 12:00:00 AM, High: 318.71, Low: 313.09,' \
    ' Open: 316.14, Close: 318.11, PeriodVolume: 28236274, OpenInterest: 0',
    'RequestId: , Datetime: 5/26/2020 12:00:00 AM, High: 324.24, Low: 316.5,' \
    ' Open: 323.5, Close: 316.73, PeriodVolume: 31380454, OpenInterest: 0',
    'RequestId: , Datetime: 5/22/2020 12:00:00 AM, High: 319.23, Low: 315.35,' \
    ' Open: 315.77, Close: 318.89, PeriodVolume: 20450754, OpenInterest: 0'
]

# Next create a dictionary in which we will store the data after processing.
data = {
    'Datetime': [], 'Open': [], 'High': [], 'Low': [],
    'Close': [], 'PeriodVolume': [], 'OpenInterest': []
}
# Now split your entries by ','
split_entries = [entry.split(',') for entry in entries]

# Loop over the list
for entry in split_entries:
    # and loop over each of the inner lists
    for ent in entry:
        # Split by ': ' to get the 'key'.
        # strip() removes the space left in front of each column
        # name by the earlier split on ','.
        key = ent.split(': ')[0].strip()

        # Now we check if the key is in the keys of the dictionary
        # we created earlier and append the value to the list
        # associated with that key if so.
        if key in data.keys():
            data[key].append(ent.split(': ')[1])

# Now we can pass the data into pandas' DataFrame class
dataframe = pd.DataFrame(data)

# Then call one more method to set the index
dataframe = dataframe.set_index('Datetime')
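
Note that the approach above leaves every value as a string. If you want a real datetime index and numeric columns, something along these lines might work. This is only a sketch: parse_entry is a helper name I made up, and the datetime format string is an assumption based on the four sample rows in the question.

```python
import pandas as pd

entries = [
    'RequestId: , Datetime: 5/28/2020 12:00:00 AM, High: 323.44, Low: 315.63,'
    ' Open: 316.77, Close: 318.25, PeriodVolume: 33449103, OpenInterest: 0',
    'RequestId: , Datetime: 5/27/2020 12:00:00 AM, High: 318.71, Low: 313.09,'
    ' Open: 316.14, Close: 318.11, PeriodVolume: 28236274, OpenInterest: 0',
    'RequestId: , Datetime: 5/26/2020 12:00:00 AM, High: 324.24, Low: 316.5,'
    ' Open: 323.5, Close: 316.73, PeriodVolume: 31380454, OpenInterest: 0',
    'RequestId: , Datetime: 5/22/2020 12:00:00 AM, High: 319.23, Low: 315.35,'
    ' Open: 315.77, Close: 318.89, PeriodVolume: 20450754, OpenInterest: 0',
]


def parse_entry(entry):
    """Hypothetical helper: 'Key: value, Key: value' -> dict of strings."""
    row = {}
    for field in entry.split(','):
        # partition() splits on the first ':' only, so the colons
        # inside the time part of Datetime stay in the value.
        key, _, value = field.partition(':')
        row[key.strip()] = value.strip()
    return row


columns = ['Open', 'High', 'Low', 'Close', 'PeriodVolume', 'OpenInterest']

df = pd.DataFrame([parse_entry(entry) for entry in entries])

# Assumed format: month/day/year with a 12-hour clock, as in the sample data.
df['Datetime'] = pd.to_datetime(df['Datetime'], format='%m/%d/%Y %I:%M:%S %p')

# Use Datetime as the index, keep the requested columns in the requested
# order (this also drops RequestId), and convert the strings to numbers.
df = df.set_index('Datetime')[columns].apply(pd.to_numeric)
```

After this, df.dtypes should show floats for the price columns and integers for PeriodVolume and OpenInterest, so things like df['Close'].mean() or df.sort_index() behave as expected.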

Upvotes: 1
