Reputation: 59
When I'm trying to run my (after deploying with pyinstaller) program for reading and converting a PDF file and entering it into a google sheet. I get the error shown in the image below. However I can not seem to figure out what the problem is:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py", line 82, in run
pipe = subprocess.Popen(
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 854, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 1307, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\tkinter\__init__.py", line 1883, in __call__
return self.func(*args)
File "EinkaufRGWindows.py", line 40, in InkoopRekeningen
text = textract.process(str(importfolder) + str(i))
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\__init__.py", line 77, in process
return parser.process(filename, encoding, **kwargs)
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py", line 46, in process
byte_string = self.extract(filename, **kwargs)
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\pdf_parser.py", line 28, in extract
raise ex
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\pdf_parser.py", line 20, in extract
return self.extract_pdftotext(filename, **kwargs)
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\pdf_parser.py", line 43, in extract_pdftotext
stdout, _ = self.run(args)
File "C:\Users\trpfinance\AppData\Local\Programs\Python\Python38-32\lib\site-packages\textract\parsers\utils.py", line 90, in run
raise exceptions.ShellError(
textract.exceptions.ShellError: The command `pdftotext //Mac/Home/Desktop/Wickey Einkauf Test/Rekeningen/Lekkerkerker_ - 20803471.pdf -` failed with exit code 127
------------- stdout -------------
------------- stderr -------------
Upvotes: 3
Views: 10537
Reputation: 783
I had the same issue. It seems to be an OS issue. For me, switching to GIT bash worked. https://github.com/deanmalmgren/textract/issues/229
If you are using Pycharm, change default terminal to bash.
Upvotes: 2
Reputation: 9
You need to install pdftotext using pip. To install it you need to have Microsoft Visual C++ 14 or greater.
Upvotes: 0
Reputation: 2086
You're getting a FileNotFoundError
it seems. If you look at the error, the command being run is:
pdftotext //Mac/Home/Desktop/Wickey Einkauf Test/Rekeningen/Lekkerkerker_ -
0803471.pdf -
There are a couple of things here I would look at. Firstly, there is an extra slash at the start of your file path, which seems wrong. Secondly, you have spaces in the file path, but there are no quotations enclosing the path. This second part means pdftotext
will read this as a few separate command arguments, rather than one. You can fix this by formatting you subprocess call to have the file wrapped in quotation marks, like so:
pdftotext "example file path.pdf" -
Upvotes: 0