Reputation: 2005
I have a text file whose contents I have read into a list like the one below: the first element is a string containing the column names separated by ';', and the remaining elements are the rows of values:
['Timestamp;T;Pressure [bar];Input line pressure [bar];Speed [rpm];Angular Position [degree];Wheel speed [rpm];Wheel angular position [degree];',
';1;5,281;5,303;219,727;10,283;216,363;45;',
';1;5,273;5,277;219,727;11,602;216,363;45;',
';1;5,288;5,293;205,078;12,832;216,363;45;',
';1;5,316;5,297;219,727;14,15;216,363;45;',
';1;5,314;5,307;219,727;15,469;216,363;45;',
';1;5,288;5,3;219,727;16,787;216,363;45;',
';1;5,318000000000001;5,31;219,727;18,105;216,363;45;',
';1;5,304;5,3;219,727;19,424;216,388;56,25;',
';1;5,291;5,29;219,947;20,742;216,388;56,25;',
';1;5,316;5,297;219,507;22,061;216,388;56,25;']
How can I convert this list of text into a pandas dataframe?
Upvotes: 4
Views: 31684
Reputation: 1
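First, read the text file with pandas.read_csv(), using ';' as the separator, and store the result in a variable read_file. That result is already a DataFrame. If you also want a regular comma-separated copy on disk, write it out with read_file.to_csv() and load it again with pd.read_csv(). Note that to_csv() returns None when it is given a path, so its result should not be assigned to df.
import pandas as pd
read_file = pd.read_csv('variable.txt', sep=';')  # already a DataFrame
read_file.to_csv('variable.csv', index=False)  # optional: save a comma-separated copy
df = pd.read_csv('variable.csv')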
I believe answers to the same or similar problems can be found here: Load data from txt with pandas
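As a rough sketch of the direct route (no intermediate CSV), the snippet below writes a shortened, made-up stand-in for 'variable.txt' to a temporary directory and reads it straight into a DataFrame. The sample content and the choice to drop the "Unnamed" column are my assumptions, not part of the answer above. The extra column appears because every line ends with ';'.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for 'variable.txt': two shortened rows from the question.
sample = (
    "Timestamp;T;Pressure [bar];\n"
    ";1;5,281;\n"
    ";1;5,273;\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "variable.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(sample)

    # read_csv already returns a DataFrame; no round trip through a second file.
    df = pd.read_csv(path, sep=";")

# The trailing ';' on each line produces an extra column that pandas labels
# "Unnamed: <n>"; filter it out by name.
df = df.loc[:, ~df.columns.str.startswith("Unnamed")]
print(df)
```

The pressure values are still text such as '5,281' at this point, because the decimal comma is not interpreted by default.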
Upvotes: 0
Reputation: 1
If the output of your model is just comma-separated values, you can convert it into a pandas DataFrame like this (content is the output in your Streamlit app):
out = [line.split(",") for line in content.strip().split("\n")]
df1 = pd.DataFrame(out)
df1.columns = df1.iloc[0]
df1 = df1.reindex(df1.index.drop(0))
st.write(df1)
Upvotes: 0
Reputation: 171
A shorter version based on @Nihal's solution (raw_data_text is the list of strings from the question):
df = [n.split(';') for n in raw_data_text]
df = pd.DataFrame(df[1:], columns=df[0])
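Because every line in the question ends with ';', a plain split(';') leaves an empty last column with an empty name. A minimal sketch of one way around that is below. split_row is a hypothetical helper I made up for illustration, and the three-line raw_data_text is a shortened sample, not the full data.

```python
import pandas as pd

# Shortened sample in the same shape as the question's list.
raw_data_text = [
    "Timestamp;T;Pressure [bar];",
    ";1;5,281;",
    ";1;5,273;",
]


def split_row(line, sep=";"):
    """Split a line on sep, ignoring exactly one trailing separator."""
    if line.endswith(sep):
        line = line[: -len(sep)]
    return line.split(sep)


rows = [split_row(n) for n in raw_data_text]
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```

Only a single trailing separator is removed, so a row whose last real field is empty still lines up with the header.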
Upvotes: 0
Reputation: 29337
You could use the function from_records(), splitting each string item in the input list on ';' and taking care of the fact that the first line of your data contains the column labels:
>>> import pandas as pd
>>> data = ['Timestamp;T;Pressure [bar];Input line pressure [bar];Speed \
[rpm];Angular Position [degree];Wheel speed [rpm];Wheel angular position [degree];', \
';1;5,281;5,303;219,727;10,283;216,363;45;', \
';1;5,273;5,277;219,727;11,602;216,363;45;', \
';1;5,288;5,293;205,078;12,832;216,363;45;', \
';1;5,316;5,297;219,727;14,15;216,363;45;', \
';1;5,314;5,307;219,727;15,469;216,363;45;', \
';1;5,288;5,3;219,727;16,787;216,363;45;', \
';1;5,318000000000001;5,31;219,727;18,105;216,363;45;', \
';1;5,304;5,3;219,727;19,424;216,388;56,25;', \
';1;5,291;5,29;219,947;20,742;216,388;56,25;', \
';1;5,316;5,297;219,507;22,061;216,388;56,25;']
>>> df = pd.DataFrame.from_records([r.split(';') for r in data[1:]], columns=data[0].split(';'))
>>> df
Timestamp T Pressure [bar] Input line pressure [bar] Speed [rpm] \
0 1 5,281 5,303 219,727
1 1 5,273 5,277 219,727
2 1 5,288 5,293 205,078
3 1 5,316 5,297 219,727
4 1 5,314 5,307 219,727
5 1 5,288 5,3 219,727
6 1 5,318000000000001 5,31 219,727
7 1 5,304 5,3 219,727
8 1 5,291 5,29 219,947
9 1 5,316 5,297 219,507
...
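Note that after splitting, every value is still a string with a decimal comma. If you need numbers, something along these lines might work. It is only a sketch: the shortened data list and the hand-picked numeric_cols are assumptions for illustration.

```python
import pandas as pd

# Shortened sample in the same shape as the question's list.
data = [
    "Timestamp;T;Pressure [bar];",
    ";1;5,281;",
    ";1;5,273;",
]
df = pd.DataFrame.from_records(
    [r.split(";") for r in data[1:]], columns=data[0].split(";")
)

# Columns to convert, chosen by hand here (assumption for this sample).
numeric_cols = ["T", "Pressure [bar]"]
for col in numeric_cols:
    # Swap the decimal comma for a dot, then parse the strings as numbers.
    df[col] = pd.to_numeric(df[col].str.replace(",", ".", regex=False))

print(df.dtypes)
```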
Upvotes: 0
Reputation: 3713
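Use pd.read_csv, which reads a DataFrame from a text stream, together with io.StringIO, which turns a string into such a stream (older pandas versions also exposed it as pd.compat.StringIO, but that alias has since been removed):
import io
pd.read_csv(io.StringIO("\n".join(lines)), sep=";")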
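Since the values in the question use a decimal comma, it may also be worth passing decimal="," so the columns come out as floats instead of strings. A small self-contained sketch using io.StringIO from the standard library is below. The three-line lines list is a shortened sample I made up from the question's data.

```python
import io

import pandas as pd

# Shortened sample: header plus two rows taken from the question.
lines = [
    "Timestamp;T;Pressure [bar];",
    ";1;5,281;",
    ";1;5,318000000000001;",
]

# Join the list into one text blob, wrap it in a file-like object,
# and let read_csv handle both the ';' separator and the ',' decimal mark.
df = pd.read_csv(io.StringIO("\n".join(lines)), sep=";", decimal=",")
print(df.dtypes)
```

The trailing ';' still yields an extra "Unnamed" column, which can be dropped afterwards if it gets in the way.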
Upvotes: 8
Reputation: 5334
code:
import pandas as pd
df = [
'Timestamp;T;Pressure [bar];Input line pressure [bar];Speed [rpm];Angular Position [degree];Wheel speed [rpm];Wheel angular position [degree];',
';1;5,281;5,303;219,727;10,283;216,363;45;',
';1;5,273;5,277;219,727;11,602;216,363;45;',
';1;5,288;5,293;205,078;12,832;216,363;45;',
';1;5,316;5,297;219,727;14,15;216,363;45;',
';1;5,314;5,307;219,727;15,469;216,363;45;',
';1;5,288;5,3;219,727;16,787;216,363;45;',
';1;5,318000000000001;5,31;219,727;18,105;216,363;45;',
';1;5,304;5,3;219,727;19,424;216,388;56,25;',
';1;5,291;5,29;219,947;20,742;216,388;56,25;',
';1;5,316;5,297;219,507;22,061;216,388;56,25;']
mat = [n.split(';') for n in df]
print(mat)
newdf1 = pd.DataFrame(mat)
newdf1.columns = newdf1.iloc[0]
newdf1 = newdf1.reindex(newdf1.index.drop(0))
# newdf2 = pd.DataFrame.from_dict(df)
print(newdf1)
output:
0 Timestamp T Pressure [bar] Input line pressure [bar] Speed [rpm] \
1 1 5,281 5,303 219,727
2 1 5,273 5,277 219,727
3 1 5,288 5,293 205,078
4 1 5,316 5,297 219,727
5 1 5,314 5,307 219,727
6 1 5,288 5,3 219,727
7 1 5,318000000000001 5,31 219,727
8 1 5,304 5,3 219,727
9 1 5,291 5,29 219,947
10 1 5,316 5,297 219,507
0 Angular Position [degree] Wheel speed [rpm] \
1 10,283 216,363
2 11,602 216,363
3 12,832 216,363
4 14,15 216,363
5 15,469 216,363
6 16,787 216,363
7 18,105 216,363
8 19,424 216,388
9 20,742 216,388
10 22,061 216,388
0 Wheel angular position [degree]
1 45
2 45
3 45
4 45
5 45
6 45
7 45
8 56,25
9 56,25
10 56,25
Upvotes: 6