Fahim

Reputation: 348

Encoding error when printing data from clipboard, but works when the data is hardcoded

I'm trying to copy all the text from an Amazon search results page (say the search term is "laptop") by sending Ctrl+A and Ctrl+C through PyAutoGUI, then read it back with either pyperclip.paste() or pd.read_clipboard() and print it. Here's the code:

import pyautogui
import time
import pyperclip
import pandas as pd

keyword = 'laptop'

time.sleep(3)
pyautogui.click(x=750, y=135)
time.sleep(1)
pyautogui.write(keyword)
time.sleep(1)
pyautogui.press('enter')
time.sleep(5)
pyautogui.hotkey('ctrl', 'a')
pyautogui.hotkey('ctrl', 'c')
time.sleep(0.1)

#raw = pyperclip.paste()
raw = pd.read_clipboard()

print(raw)

Using Pandas gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 32, in <module>
    raw = pd.read_clipboard()
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\clipboards.py", line 88, in read_clipboard
    return read_csv(StringIO(text), sep=sep, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 611, in _read
    return parser.read(nrows)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 282, in read
    alldata = self._rows_to_cols(content)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 1045, in _rows_to_cols
    self._alert_malformed(msg, row_num + 1)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\python_parser.py", line 765, in _alert_malformed
    raise ParserError(msg)
pandas.errors.ParserError: Expected 4 fields in line 726, saw 7. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

And using Pyperclip gives this error:

Traceback (most recent call last):
  File "c:\Users\smfah\OneDrive\Desktop\tmp\regex.py", line 45, in <module>
    print(raw)
  File "C:\Users\smfah\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200c' in position 60: character maps to <undefined>

However, if I hardcode the text in the code editor (VS Code on Windows 11) and don't print it, I can work with the hardcoded data (e.g. apply regex to it).

text = '''long block of text'''

But I want to work on the text copied into the clipboard. I tried applying various solutions, but none worked for me.

Note: This issue does not occur on Ubuntu 22.04, so it looks like a Windows-specific issue.

Any help will be greatly appreciated! Thanks!

Upvotes: 1

Views: 109

Answers (1)

Abrar Nazib

Reputation: 86

On Windows, the clipboard can be accessed with win32clipboard, which is part of the pywin32 package. To get the current text from the clipboard:

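import win32clipboard

# get clipboard data as Unicode text (CF_UNICODETEXT returns a str)
win32clipboard.OpenClipboard()
try:
    data = win32clipboard.GetClipboardData(win32clipboard.CF_UNICODETEXT)
finally:
    # always release the clipboard, even if the read fails
    win32clipboard.CloseClipboard()
print(data)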
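
Note that the UnicodeEncodeError in the question is raised by print, not by the clipboard read: the Windows console there uses cp1252, which cannot represent U+200C (zero width non-joiner). Printing clipboard text can therefore fail in the same way however the text was read. A minimal sketch of one way around it, assuming it is acceptable to drop invisible format characters and replace anything else the console cannot show. The helper names strip_format_chars and console_safe are hypothetical, and sample stands in for the clipboard text:

```python
import unicodedata


def strip_format_chars(text: str) -> str:
    """Drop invisible Unicode format characters (category Cf), e.g. U+200C."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def console_safe(text: str, encoding: str = "cp1252") -> str:
    """Replace characters the target encoding cannot represent with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


# Stand-in for the text returned by pyperclip.paste() or win32clipboard.
sample = "Lenovo\u200c IdeaPad"

cleaned = strip_format_chars(sample)
print(console_safe(cleaned))
```

If the characters should be preserved instead, calling sys.stdout.reconfigure(encoding="utf-8") (Python 3.7+) before printing, or setting the PYTHONIOENCODING=utf-8 environment variable, may be enough, provided the terminal itself can display UTF-8.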

Note that win32clipboard is not included with the default Python installation. It is provided by the pywin32 package, which can be installed with pip install pywin32.
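
The pandas ParserError in the question is a separate problem. As the traceback shows, pd.read_clipboard() hands the clipboard text to read_csv, and by default it splits on whitespace, so it expects every line to have the same number of fields. Free-form text copied from a web page is not tabular, which is why line 726 has 7 fields where 4 were expected. Since the goal is to apply regex, it is probably simpler to keep the text as a plain string. A rough sketch, where the sample text, the helper names, and the price pattern are all made up for illustration:

```python
import re


def non_empty_lines(text: str) -> list[str]:
    """Split copied page text into stripped, non-blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def find_prices(text: str) -> list[str]:
    """Find dollar amounts such as $399.99 or $1,249.00 (illustrative pattern)."""
    return re.findall(r"\$\d[\d,]*(?:\.\d{2})?", text)


# Stand-in for the string returned by pyperclip.paste().
sample = "Results\n\nAcer Aspire 5 Laptop\n$399.99\n\nHP 15 Laptop\n$1,249.00\n"

lines = non_empty_lines(sample)
prices = find_prices(sample)
print(lines)
print(prices)
```

If a DataFrame is still wanted, the list of lines can be wrapped as a single column with pd.DataFrame({"line": lines}), which avoids the CSV parser entirely.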

Upvotes: 1
