Reputation: 49
I am reading data from stdin line by line:
import sys
for line in sys.stdin:
    print(line)
Each input line looks like this:
New York 100
Orlando 200
LA 300
D.C. 400
The output I want is a DataFrame:
       city  value
0  New York    100
1   Orlando    200
2        LA    300
3      D.C.    400
My current approach is to read each line, split it, and collect the results in a list of lists, one inner list per line:
list_of_lists = []
for line in sys.stdin:
    new_list = line.split()
    list_of_lists.append(new_list)
I then convert list_of_lists to a DataFrame.
This feels clumsy, so I am wondering whether there is a better way. Thanks.
Upvotes: 1
Views: 12670
Reputation: 294576
Use str.rsplit to split from the right side, and only once:
list_of_lists = []
for line in sys.stdin:
    # maxsplit=1 splits on the last run of whitespace only
    new_list = line.rsplit(maxsplit=1)
    list_of_lists.append(new_list)
Or put the lines into a pandas Series first:
import sys
import pandas as pd
data = sys.stdin.read().splitlines()
df = pd.Series(data, name='A').str.rsplit(n=1, expand=True)
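For completeness, here is a minimal sketch of how the Series approach might be carried through to the DataFrame asked for in the question. The `lines_to_frame` helper, the column names, and the integer conversion are assumptions added for illustration, and `io.StringIO` stands in for `sys.stdin` so the snippet runs on its own.

```python
import io

import pandas as pd


def lines_to_frame(stream):
    """Build a city/value DataFrame from a text stream (hypothetical helper)."""
    # Skip blank lines so they do not turn into empty rows.
    lines = [line for line in stream.read().splitlines() if line.strip()]
    # Split each line once, from the right, on whitespace.
    df = pd.Series(lines).str.rsplit(n=1, expand=True)
    df.columns = ["city", "value"]
    # rsplit yields strings, so convert the numeric column explicitly.
    df["value"] = df["value"].astype(int)
    return df


# io.StringIO stands in for sys.stdin; pass sys.stdin in a real script.
sample = io.StringIO("New York 100\nOrlando 200\nLA 300\nD.C. 400\n")
df = lines_to_frame(sample)
print(df)
```

Splitting only once from the right is what keeps multi-word city names such as "New York" in a single column.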
Upvotes: 0
Reputation: 92904
import sys, re, pandas as pd
data = sys.stdin.read().splitlines() # obtaining the list of lines from stdin
data = [re.split(r'\s+(?=\d+$)', l) for l in data] # split each line into 2 items: `city` and `value`
df = pd.DataFrame(data, columns=['city','value']) # constructing dataframe
print(df)
The output:
       city  value
0  New York    100
1   Orlando    200
2        LA    300
3      D.C.    400
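To see why the regular expression splits each line into exactly two parts, it may help to try it on its own with only the standard library. In this sketch the `split_city_value` helper and the extra sample string are hypothetical; the pattern is the one used above, and the inputs are assumed to have no trailing newline (as is the case after `splitlines()`).

```python
import re

# Whitespace followed by a run of digits that reaches the end of the string.
# The lookahead (?=...) checks for the digits without consuming them,
# so the digits stay in the second piece.
PATTERN = re.compile(r"\s+(?=\d+$)")


def split_city_value(line):
    """Split 'New York 100' into ['New York', '100'] (hypothetical helper)."""
    return PATTERN.split(line)


rows = [split_city_value(s) for s in ["New York 100", "Orlando 200", "LA 300", "D.C. 400"]]
print(rows)

# Digits in the middle of the name are not followed by end-of-string,
# so only the last whitespace run is a split point.
print(split_city_value("Route 66 Diner 5"))
```

Because the split point is tied to the trailing number, the space inside "New York" is left alone.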
Upvotes: 2