FabledFalcons
FabledFalcons

Reputation: 11

Creating table from txt with delimiter

At the moment I'm trying to automate the process of creating excel files from SAP tables. We always import a total of 50 different tables from the SAP system in the form of a txt file with the delimiter '|'. Unfortunately in some tables this delimiters appears also in the column itself and thus when using pandas I get an error message. Additionally I tried to use the pandas read_fwf, but this was not working either. Any idea on how to solve this? An example of such table is given below:

Number Date Time Request Clt Owner User Project Short Text RC
1 06.01.21 12:01:56 OD1K984128 300 FX5314 F36164 8000013814 : P|303 CAS change interface PR SAP
2 06.01.21 12:01:58 OD1K984193 300 FX5314 F36164 8000013814 : P|303 CAS change interface PR SAP (2)
3 07.01.21 12:07:49 OD1K984028 300 FX5310 F36164 8000013709, INC0242318 change pop-up label printing
4 07.01.21 12:07:50 OD1K984179 300 FX5370 F36164 8000013834 - MRP areas in the US - Adapt ZBW_KNMT
5 07.01.21 12:07:52 OD1K984202 300 F38091 F36164 8000013839, INC0208178 Adapt spare part availability
6 07.01.21 12:07:52 OD1K984288 300 F36131 F36164 8000013862 : CCK Z004 for 7502
7 07.01.21 12:07:55 OD1K984036 300 F36131 F36164 8000013797: storloc CN70
8 07.01.21 12:08:17 OD1K984345 300 FX5248 F36164 8000013880, Johannfreight auth - INC0261847
9 07.01.21 12:08:18 OD1K984347 300 VASA10053690 F36164 8000013883 2109-US6A
10 07.01.21 12:08:18 OD1K984005 300 VASA10053690 F36164 8000013884 Production Sched Profile 7502/75A2
11 07.01.21 12:08:22 OD1K984360 300 FX5248 F36164 8000013886, Auth C088 - 3101 - F|10010 - INC0262223
12 07.01.21 12:08:49 OD1K984374 300 FX5248 F36164 8000013889, CK13N - F39115 - INC0262507
13 07.01.21 12:08:50 OD1K984351 300 VASA10053690 F36164 8000013884 Production Sched |Profile|7502/75A2
14 07.01.21 12:08:51 OD1K984356 300 FX5314 F36164 8000013885 : P00303 Customs Declaration C4T (4)
15 07.01.21 12:08:52 OD1K984358 300 FX5295 F36164 8000013885, P00303 Customs Declaration C4T (4)
16 07.01.21 12:08:52 OD1K984381 300 VASA10053690 F36164 8000013893 ZAP

I'm trying to convert the data to a dataframe is not working properly if there is a delimiter ('|') within the column itself I get an error that it expected x columns but saw y columns. I know the reason of the error (since there is a delimiter inside the column) but I don't know how to tackle it.

def exporting_xlsx(file_name, xlsx_path):
#creating the dataframe
df = pd.read_csv(file_name,
        delimiter = '|',
        index_col= False,
        header=0,
        squeeze = True,
        skipinitialspace = True,
        low_memory = False,
        skip_blank_lines = True,
        on_bad_lines='warn',
        na_filter = True,
        dtype=str)

#dropping empty columns
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

#exporting the file to xlsx
df.to_excel(xlsx_path, index = None, header=True)

Executing this code on the table, results in the following error message and as a result thus not exporting each line. Additionally some part of column 1 and 2 are put in the column RC due the '|' in the column.

b'Skipping line 14: expected 13 fields, saw 14\n'

Does somebody has any idea how to tackle this issue? Thank you in advance for all the help!

Upvotes: 1

Views: 663

Answers (1)

lemuria
lemuria

Reputation: 79

There is a Python library called tabulate. Try it out.

https://pypi.org/project/tabulate/

You can convert the table to a Pandas Dataframe and then you can feed the dataframe into tabulate.tabulate().

Upvotes: 1

Related Questions