Reputation: 131
I am using FPDF to convert text to PDF. When I write the text into the PDF, the headers end up badly misaligned compared with the original text. I came up with a solution that goes line by line and positions each header individually. The column headers start at "Account#" and end at the "-------" line. How can I reposition all of the headers while keeping the data under them the same?
Original Text: https://flic.kr/p/2hw2Zft
PDF : https://flic.kr/p/2hw43hQ
from fpdf import FPDF

pdf = FPDF("L", "mm", "A4")
pdf.add_page()
pdf.set_font('arial', style='', size=10.0)
# Read the whole statement into a list of lines
with open('C:\\Users\\bxt058y\\PycharmProjects\\MSIT501\\SUMB_Statement_29396-76397.txt', 'r') as file:
    lines = file.readlines()
# Write every line as-is; this is the version where the headers look off
for line in lines:
    pdf.multi_cell(h=5.0, align='L', w=0, txt=line, border=0)
pdf.output('drafttest.pdf', 'F')
header1 = lines[0]
header2 = lines[1]
header3 = lines[2]
header4 = lines[3]
header5_1 = " ".join(lines[4].split()[:4])
print(header5_1)
header5_2 = " ".join(lines[4].split()[4:])
print(header5_2)
header6 = lines[5]
header7 = lines[6]
print(header7)
header8 = lines[7]
header8_1 = " ".join(lines[8].split()[:4])
header8_2 = " ".join(lines[8].split()[4:])
print(header8_2)
header9_1 = " ".join(lines[9].split()[:5])
header9_2 = " ".join(lines[9].split()[5:])
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header1.strip(), border=0)
pdf.set_x(124)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header2.strip(), border=0)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header3.strip(), border=0)
pdf.set_x(65)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header4.strip(), border=0)
pdf.set_x(45)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt=header5_1, border=0)
pdf.set_x(129)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header5_2, border=0)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header6.strip(), border=0)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header7.strip(), border=0)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt=header8_1, border=0)
pdf.set_x(125)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header8_2, border=0)
pdf.cell(ln=0, h=5.0, align='L', w=0, txt=header9_1, border=0)
pdf.set_x(125)
pdf.cell(ln=1, h=5.0, align='L', w=0, txt=header9_2, border=0)
Upvotes: 1
Views: 98
Reputation: 43169
Look into regular expressions (and mind the different modifiers, namely singleline, multiline and verbose):
^
Account\#
.+?
(?=^---)
The expression must be applied to the whole string / file content at once, not line by line. See a demo on regex101.com.
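In Python, the three modifiers correspond to re.DOTALL (singleline), re.MULTILINE and re.VERBOSE. Here is a minimal sketch of how the expression could be applied. The sample text is made up for illustration, since the real statement layout is only visible in the linked images, and the names sample, header_pattern and header_block are hypothetical:

```python
import re

# Hypothetical sample text standing in for the whole file content
# (with a real file, something like open(path).read() would supply it).
sample = (
    "Statement title\n"
    "Account#   Name      Balance\n"
    "           (first)   (USD)\n"
    "-------    ------    -------\n"
    "12345      Smith     10.00\n"
)

header_pattern = re.compile(
    r"""
    ^            # start of a line (needs MULTILINE)
    Account\#    # literal "Account#"; '#' is escaped because of VERBOSE
    .+?          # as little as possible, including newlines (needs DOTALL)
    (?=^---)     # stop right before the line that starts with dashes
    """,
    re.MULTILINE | re.DOTALL | re.VERBOSE,
)

match = header_pattern.search(sample)
header_block = match.group(0) if match else ""
print(header_block)
```

match.start() and match.end() give the position of the header block inside the full text, so the data before and after it can be written to the PDF unchanged while only the header block is repositioned.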
Upvotes: 1
Reputation: 1242
Maybe:
import pandas as pd
data = pd.read_csv('text.txt', header = None)
header = ['Account#', '-----']
header_only = data[data.iloc[:,0].isin(header)]
where header contains the first elements of the header rows you are looking for.
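Note that isin only matches cells that are exactly equal to one of the listed values. If the rows are whitespace-aligned rather than comma-separated, a prefix check on the first token may be easier. This is a plain-Python sketch of that variation, not the pandas code above; the sample lines and the names header_prefixes, header_rows and data_rows are assumptions for illustration:

```python
# Hypothetical sample lines; the real file is not reproduced here.
lines = [
    "Statement title",
    "Account#   Name      Balance",
    "-------    ------    -------",
    "12345      Smith     10.00",
]

# Prefixes that mark a header row (assumed from the question's description).
header_prefixes = ("Account#", "---")

# Keep the lines whose first whitespace-separated token starts with a header prefix.
header_rows = [
    line for line in lines
    if line.split() and line.split()[0].startswith(header_prefixes)
]

# Everything else is treated as data and left untouched.
data_rows = [line for line in lines if line not in header_rows]

print(header_rows)
print(data_rows)
```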
Upvotes: 1