Reputation: 25
I'm a beginner when it comes to using Pandas, but I want to take the table of G-Sync gaming monitors on Nvidia's website here: https://www.nvidia.com/en-us/geforce/products/g-sync-monitors/specs/ and convert it to a data frame in Pandas for Python.
The first thing I tried to do was
import pandas as pd
df = pd.read_html('https://www.nvidia.com/en-us/geforce/products/g-sync-monitors/specs/')
but that didn't seem to work. I got a ValueError: No tables found.
Then I tried to do
import requests
import lxml.html as lh
page = requests.get('https://www.nvidia.com/en-us/geforce/products/g-sync-monitors/specs/')
but somehow I got ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check')).
If someone could explain why the first two ways didn't work and how to actually get the table into a data frame, that would be very helpful. Thank you!
Upvotes: 2
Views: 348
Reputation: 195418
The data is loaded dynamically via a JSON request, so the HTML that the server returns contains no table for read_html to parse.
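To see why `pd.read_html` can report `No tables found` on a page like this, it helps to check whether the HTML sent by the server contains any `<table>` element at all. Here is a minimal sketch using only the standard library. The two HTML snippets and the `count_tables` helper are made up for illustration and are not Nvidia's actual markup:

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts <table> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == 'table':
            self.tables += 1


def count_tables(html):
    # Hypothetical helper: number of <table> elements found in the markup
    parser = TableCounter()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up example of what a server might send for a JavaScript-driven page
static_html = ("<html><body><div id='specs'></div>"
               "<script>loadSpecs('/hypothetical/data.json')</script></body></html>")
# Made-up example of the same page after a browser has run the script
rendered_html = ("<html><body><div id='specs'>"
                 "<table><tr><td>Acer</td></tr></table></div></body></html>")

print(count_tables(static_html))    # 0
print(count_tables(rendered_html))  # 1
```

If the count is zero for the raw response, the table is most likely built in the browser, and the data has to be fetched from wherever the page's script loads it.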
This script loads the JSON data into a DataFrame and prints it:
import re
import json

import requests
import pandas as pd

url = 'https://www.nvidia.com/en-us/geforce/products/g-sync-monitors/specs/'
# send a browser-like User-Agent header with both requests
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'}

html_txt = requests.get(url, headers=headers).text
# the page source contains the relative URL of the JSON file that fills the table
json_url = 'https://www.nvidia.com' + re.search(r"'url': '(.*?)'", html_txt).group(1)
data = requests.get(json_url, headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

def fn(x):
    # x is one column; cells that are dicts are replaced by their 'en' entry
    out = []
    for v in x:
        if isinstance(v, dict):
            out.append(v['en'])
        else:
            out.append(v)
    return out

# max_level=0 keeps nested dicts as cell values so fn can unpack them
df = pd.json_normalize(data['data'], max_level=0).apply(fn)
print(df)
Prints:
                  type  manufacturer      model  hdr    size  lcd type        resolution  variable refresh rate range  variable overdrive  variable refresh input    driver needed
  0    G-SYNC ULTIMATE          Acer    CP7271K  Yes      27       IPS    3840x2160 (4K)                      1-144Hz                 Yes            Display Port              N/A
  1    G-SYNC ULTIMATE          Acer        X27  Yes      27       IPS    3840x2160 (4K)                      1-144Hz                 Yes            Display Port              N/A
  2    G-SYNC ULTIMATE          Acer        X32  Yes      32       IPS    3840x2160 (4K)                      1-144Hz                 Yes            Display Port              N/A
  3    G-SYNC ULTIMATE          Acer        X35  Yes      35        VA  3440x1440 (WQHD)                      1-200Hz                 Yes            Display Port              N/A
  4    G-SYNC ULTIMATE          Asus       PG65  Yes      65        VA    3840x2160 (4K)                      1-144Hz                 Yes            Display Port              N/A
 ..                ...           ...        ...  ...     ...       ...               ...                          ...                 ...                     ...              ...
159  G-SYNC Compatible            LG    2020 ZX  Yes  77, 88      OLED    7680x4320 (8K)                     40-120Hz                  No                    HDMI  445.51 or newer
160  G-SYNC Compatible           MSI   MAG251RX  Yes    24.5       IPS   1920x1080 (FHD)                     48-240Hz                  No            Display Port  441.66 or newer
161  G-SYNC Compatible         Razer  Raptor 27  Yes      27       IPS   2560x1440 (QHD)                     48-144Hz                  No            Display Port  431.60 or newer
162  G-SYNC Compatible       Samsung       CRG5   No      27        VA   1920x1080 (FHD)                     48-240Hz                  No            Display Port  430.86 or newer
163  G-SYNC Compatible     ViewSonic      XG270   No      27       IPS   1920x1080 (FHD)                     48-240Hz                  No            Display Port  441.41 or newer

[164 rows x 11 columns]
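The two key steps of the script (finding the JSON URL with a regex, then flattening dict cells) can be tried offline. The following is only a sketch on made-up inputs: `sample_html`, the `/hypothetical/monitors.json` path, and `sample_records` are assumptions for illustration, not the real page source or payload. `pick_en` is a per-cell variant of `fn` applied with `Series.map`, swapped in here so that each column comes back as a Series:

```python
import re

import pandas as pd

# Hypothetical stand-in for the page source; the real markup may differ
sample_html = ("<script>$('#t').DataTable("
               "{'ajax': {'url': '/hypothetical/monitors.json'}});</script>")

# Same pattern as the script: capture the text between 'url': ' and the next quote
json_path = re.search(r"'url': '(.*?)'", sample_html).group(1)
json_url = 'https://www.nvidia.com' + json_path

# Hypothetical stand-in for data['data']: some values are dicts with an 'en' key
sample_records = [
    {'type': 'G-SYNC ULTIMATE', 'manufacturer': 'Acer',
     'hdr': {'en': 'Yes'}, 'size': '27'},
    {'type': 'G-SYNC Compatible', 'manufacturer': 'Samsung',
     'hdr': {'en': 'No'}, 'size': '27'},
]


def pick_en(value):
    """Return the 'en' entry of a dict cell, or the cell unchanged."""
    return value['en'] if isinstance(value, dict) else value


# max_level=0 keeps nested dicts as cell values instead of expanding them
raw = pd.json_normalize(sample_records, max_level=0)
# Map every cell of every column through pick_en
df = raw.apply(lambda column: column.map(pick_en))

print(json_url)
print(df)
```

If the real payload uses a different key than `'en'` or nests values more deeply, `pick_en` would need to be adjusted accordingly.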
Upvotes: 3