Reputation: 117
I have been tasked with using a SAP custom reports that are given to me in a text file - sample below (removed names and made all values 0.00 for this sample). The task is to use this data and create xlsx files from this data. I have 100s of these txt files in a specific folder. How would I load this into python and create an xlsx file that saves the header data in same location/format above the table for each file?
I have the code below that creates the xlsx for each file but the format is all messed up. I need it in better format for each cell/column/row.
Any help is greatly appreciated!
Current Code:
import glob
import os
from xlsxwriter import Workbook
filepath = 'mypath'
txtfiles = glob.glob(os.path.join(filepath, '*z*.txt*'))
def is_number(s):
try:
float(s)
return True
except ValueError:
return False
for filename in txtfiles:
readfiles = open(filename, 'r')
row_list = []
for row in readfiles:
row_list.append(row.split('\n'))
column_list = zip(*row_list)
workbook = Workbook(filename.replace('.txt', '.xlsx'))
worksheet = workbook.add_worksheet('Sheet1')
i = 0
for column in column_list:
for item in range(len(column)):
value = column[item].strip()
if is_number(value):
worksheet.write(item, i, float(value))
else:
worksheet.write(item, i, value)
i += 1
workbook.close()
Sample report below:
SAMPLE REPORT TEMPLATE
Page Number: 1 of 1 Time of Output:06:37:00
Author of Report:ME Date of Output:09/27/2018
Ledger:SAMPLE Version: 1
Currency: USD Fiscal Year:2018
Report Group:RANDOM Period: 0 to 10
|. | Outside MONEY | Outside MONEY2 | Outside MONEY3 | Subtotal MONIES |
|------------------------------------------------------------|---------------------|---------------------|---------------------|---------------------|
| INCOME MONIES BEFORE CERTAIN CALCULATIONS SAMPLE | 0.00 | 0.00 | 0.00 | 0.00 |
|------------------------------------------------------------|---------------------|---------------------|---------------------|---------------------|
| 1 - Line Data 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 1 - Line Data 2 | 0.00 | 0.00 | 0.00 | 0.00 |
| 1 - Line Data 3 | 0.00 | 0.00 | 0.00 | 0.00 |
| 1 - Line Data 4 | 0.00 | 0.00 | 0.00 | 0.00 |
| 1 - Line Data 5 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2 - Line Data 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2 - Line Data 2 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2 - Line Data 3 | 0.00 | 0.00 | 0.00 | 0.00 |
|* Sample Random Line W/ Star | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 2 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 3 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 4 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 5 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 6 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 7 | 0.00 | 0.00 | 0.00 | 0.00 |
| 3 - Line Data 8 | 0.00 | 0.00 | 0.00 | 0.00 |
Upvotes: 1
Views: 218
Reputation: 141
Since your text report fields are of a fixed length, you could extract the header with one function and the data with another one. So, in your sample each row of data is 151 characters. Since the header is much shorter, you can easily know if the row in the current loop is from the header or it is data.
I would suggest to split the row by ':' if it is header and by '|' if it is data. By the end of the loop, you will have each row with the fields properly separated.
If you have further trouble with managing the data, or you need more detail, I suggest regex module.
Upvotes: 1
Reputation: 3465
Reading from a text file and add each line as an item to a list can be done as follows:
lines = []
with open('input_file.txt', 'r') as textinputfile:
for readline in textinputfile:
lines.append(readline)
Upvotes: 1