Reputation: 161
I am trying to import a .dat file produced by my experiments. It starts with metadata in the header lines, followed by a line of dashes and then the experimental data itself. My idea was to split it into a list of strings holding the metadata and a data frame holding the results (the part below the dashes). My problem is that the metadata at the top is read as a list of strings, so the whole file ends up in that format. Is there a way to get the data part as a data frame rather than a list of strings?
Learned-Helplesness-Experiment (TriplePlatform) from 05.04.2017 13:41:24
software version: DoublePlatform_1.3 04-Jun-2014
Setup of Experiment:
Platform 1:
ExpType: M M M M M M M M M M
heated side: right right right right right right right right right right
PIs: n. def. 0 0 0 0 0 0 0 0 0
Platform 2:
ExpType: Te Te Te Y Te Y Y Y Y Y
heated side: right right right ->M right ->M ->M ->M ->M ->M
PIs: n. def. 0 0 0 0 0 0 0 0 0
Platform 3:
ExpType: Y Y Y Y M_S Y Y Y Y Y
heated side: ->M ->M ->M ->M right ->M ->M ->M ->M ->M
PIs: n. def. 0 0 0 0 0 0 0 0 0
------------------------------------ ------------------------------------
0 0 0 0 0
1 47 -0.3759766 0.1123047 0.3710938
2 97 0.01953125 -0.1318359 0.1123047
3 157 -0.4150391 0.2246094 0.3369141
4 207 -0.01953125 -0.2539063 0.1318359
5 257 -0.3515625 0.3027344 0.3222656
Upvotes: 0
Views: 334
Reputation: 1721
I guess you are using pandas? I think there is no "general" way of doing this.
You could open and parse the file manually up to the dashed line and keep that part as a list of strings. Then tell pandas to import the rest, starting at line number x (where you found the dashes). The option is called skiprows.
Edit1 (in response to the comment):
That depends on whether your header has a constant number of rows. If not, you might want to read through the file line by line, looking for the dashes:
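header_lines = []  # metadata lines above the dashes
line_no = 0        # lines read so far, including the dashed line
with open('filename', 'r') as file:
    # iterate over the file object itself; file.read() would yield single characters
    for line in file:
        line_no += 1
        # a run of dashes marks the separator; adjust the length to your file
        if line.startswith('-' * 10):
            # separator found: the data starts on the next line
            break
        else:
            # still in the header: keep the line as metadata
            header_lines.append(line.rstrip('\n'))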
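The same loop can be wrapped in a small function, which makes it easier to try out on a few lines before pointing it at a real file. This is only a sketch: the name `read_header` and the 10-dash threshold are assumptions made for illustration, and the sample text is shortened from the file in the question.

```python
import io

def read_header(lines, min_dashes=10):
    """Hypothetical helper: return (header_lines, line_no).

    line_no counts every line up to and including the dashed separator,
    so it can be passed on as the number of rows to skip.
    """
    header = []
    for line_no, line in enumerate(lines, start=1):
        if line.startswith('-' * min_dashes):
            return header, line_no
        header.append(line.rstrip('\n'))
    raise ValueError('no dashed separator line found')

# In-memory stand-in for the .dat file (shortened sample).
sample = io.StringIO(
    "software version: DoublePlatform_1.3 04-Jun-2014\n"
    "Setup of Experiment:\n"
    "------------------------------------\n"
    "0 0 0 0 0\n"
    "1 47 -0.3759766 0.1123047 0.3710938\n"
)
header, line_no = read_header(sample)
print(line_no)
print(header)
```

Because the function accepts any iterable of lines, an open file object can be passed in the same way as the `StringIO` sample.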
Edit2
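To import the data part, you could use
pandas.read_csv(..., sep='\t', skiprows=line_no, header=None)
if tab is the field delimiter, or
pandas.read_csv(..., sep=r'\s+', skiprows=line_no, header=None)
if the fields are delimited by one (or more) blanks. header=None is needed because the data block has no row of column names; without it the first data row would be used as the header. (delim_whitespace=True is equivalent to sep=r'\s+', but it is deprecated as of pandas 2.2.)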
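As a rough check of the whitespace-delimited variant, here is a sketch that reads a shortened in-memory copy of the file. The value of `line_no` is hard-coded for this sample (two header lines plus the dashed line); in practice it would come from scanning the file for the dashes.

```python
import io
import pandas as pd

# Shortened stand-in for the .dat file from the question.
raw = (
    "Setup of Experiment:\n"
    "Platform 1:\n"
    "------------------------------------\n"
    "0 0 0 0 0\n"
    "1 47 -0.3759766 0.1123047 0.3710938\n"
    "2 97 0.01953125 -0.1318359 0.1123047\n"
)
line_no = 3  # header lines plus the dashed separator (known for this sample)

# skiprows drops the metadata; header=None keeps the first data row as data.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", skiprows=line_no, header=None)
print(df)
```

The columns are numbered 0 to 4 here because the file carries no column names; `names=[...]` can be passed to `read_csv` if you know what each column means.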
Upvotes: 1