Reputation: 95
I would like to write clean code that reads and combines multiple files, with lower maintenance and better readability, but I am missing something here.
Namely, after updating the file names:
#update the names of the infiles
infile1 = 'file1.txt'
infile2 = 'file2.txt'
...
infile4 = 'file4.txt'
I would like to turn this working step:
# read fixed width file
df1 = pd.read_fwf(infile1,
                  header=None,
                  widths=[sample widths],
                  names=[sample names here]
                  )
...
...
df4 = pd.read_fwf(infile4,
                  header=None,
                  widths=[sample widths],
                  names=[sample names here]
                  )
df=pd.concat([df1,df2,df3,df4])
where [sample widths] and [sample names here] are specific to my file and quite lengthy, into something easier to read and maintain:
# DESIRED FORM
for i in [1,2,3,4]:
df\i = pd.read_fwf(f'infile{i}',
header=None,
widths=[sample widths],
names=[sample names here]
)
df=pd.concat([df1,df2,df3,df4])
I feel I'm close, but I'm missing something simple in how I write the loop. I get this error when I run it:
df\i = pd.read_fwf('infile'f'{i}',
^
SyntaxError: unexpected character after line continuation character
Thank you.
Upvotes: 0
Views: 350
Reputation: 482
Hi & welcome to Stack Overflow!
First, you could put the filenames (or full paths, if you need them) in a list. Then create an initial dataframe from the first file and add the rest of the files to that dataframe:
infiles = ['file_1.txt', ..., 'file_n.txt']
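df = pd.read_fwf(infiles[0], header=None, widths=[sample widths],
                 names=[sample names here])
for i in range(1, len(infiles)):
    temp_df = pd.read_fwf(infiles[i], header=None, widths=[sample widths],
                          names=[sample names here])
    # DataFrame.append returned a new frame rather than modifying df,
    # and it was removed in pandas 2.0, so reassign the result of pd.concat
    df = pd.concat([df, temp_df])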
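As for the error itself: a backslash outside a string is Python's line-continuation character, so `df\i` cannot build a variable name, and `f'infile{i}'` only produces the string `'infile1'`, not the value stored in the variable `infile1`. A common alternative is to keep the filenames in a list and collect the frames in a list too. Below is a minimal sketch of that idea. The `WIDTHS` and `NAMES` values, the `read_all` helper, and the throwaway demo files are assumptions made up for illustration, so substitute your real widths, names, and paths.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical layout for illustration; substitute the real widths and names.
WIDTHS = [3, 5]
NAMES = ["id", "value"]


def read_all(paths, widths, names):
    """Read each fixed-width file and stack the results into one DataFrame."""
    frames = [
        pd.read_fwf(path, header=None, widths=widths, names=names)
        for path in paths
    ]
    # A single concat at the end avoids growing the frame inside the loop.
    return pd.concat(frames, ignore_index=True)


# Small demo with throwaway files so the sketch runs on its own.
tmp_dir = Path(tempfile.mkdtemp())
infiles = []
for i in [1, 2]:
    path = tmp_dir / f"file{i}.txt"
    # Two rows per file: a 3-character id column and a 5-character value column.
    path.write_text(f"{i:<3}{i * 10:<5}\n{i + 2:<3}{(i + 2) * 10:<5}\n")
    infiles.append(path)

df = read_all(infiles, WIDTHS, NAMES)
print(df)
```

With this shape, the widths and names are written once, and adding a file only means adding an entry to `infiles`. If you still need the individual frames afterwards, you could keep the `frames` list (or a dict keyed by filename) instead of separate `df1`, `df2`, ... variables.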
Upvotes: 1