PDF-Plumber Extracting title if metadata is not present

Question

I have used pdfplumber to extract the text from PDF files, following the GitHub page (https://github.com/jsvine/pdfplumber). I went through all of its properties. I need to extract the title of the PDF when the metadata is not present.

Is there any other way to achieve this using Python?

import pdfplumber
pdf = pdfplumber.open(r'1.pdf')
page = pdf.pages[0]
text = page.extract_text()  # all text on the first page
print(page.chars[0])        # first character object, including its font size

Shuail_CR007 · Accepted Answer

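I found the approach below. It keeps only the objects on the first page whose font size is greater than 20 and extracts their text, on the assumption that the title is set in the largest type.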

import pdfplumber
pdf = pdfplumber.open(r'1.pdf')
page = pdf.pages[0]

# Keep only objects with a font size above 20 (objects without a size count as 0)
filtered = page.filter(lambda x: x.get("size", 0) > 20)
print(filtered.extract_text())
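
The threshold of 20 is specific to one document. A possible variation, offered only as a sketch, is to read the metadata title first and fall back to whatever text on the first page uses the largest font size. The names largest_font_size, guess_title and tolerance are hypothetical and not part of pdfplumber; the sketch also assumes the title is the largest text on page one, which will not hold for every PDF.

```python
def largest_font_size(chars):
    """Return the largest "size" among non-blank char dicts, or None if there are none.

    `chars` is a list of dicts shaped like pdfplumber's page.chars.
    """
    sizes = [c["size"] for c in chars
             if c.get("size") and c.get("text", "").strip()]
    return max(sizes) if sizes else None


def guess_title(path, tolerance=0.5):
    """Hypothetical helper: metadata title if set, else the largest text on page 1."""
    import pdfplumber  # imported here so largest_font_size works without it

    with pdfplumber.open(path) as pdf:
        # Prefer the document metadata when it carries a non-empty title.
        meta_title = (pdf.metadata or {}).get("Title")
        if meta_title and str(meta_title).strip():
            return str(meta_title).strip()

        page = pdf.pages[0]
        largest = largest_font_size(page.chars)
        if largest is None:
            return None  # no text layer, e.g. a scanned page

        # Assumption: the title is the text within `tolerance` of the largest size.
        big = page.filter(lambda obj: obj.get("size", 0) >= largest - tolerance)
        text = big.extract_text()
        return " ".join(text.split()) if text else None
```

Calling guess_title(r'1.pdf') would then return either the metadata title or the heuristic guess. If a logo or cover-page banner is larger than the title, the guess will be wrong, so the result is worth checking by hand.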
