Reputation: 11
I am new here. I am trying to parse a website to get the values from a table, but it is not working and I still get an error. I really need your help.
Code:
from imp import source_from_cache
from urllib import response
from bs4 import BeautifulSoup as bs
import requests
import re
import pandas
from urllib.request import urlopen
import urllib3
from selenium import webdriver
url1 = "https://www.nordpoolgroup.com/en/Market-data1/Dayahead/Area-Prices/LT/Hourly/?view=table"
r = requests.get(url1)
print(r)
soup = bs(r.text, "html.parser")
print(soup.title.string)
print("--------------------------------------------------------------")
a = soup.find('table', {"id" : "datatable"} )
rows = a.find_all('th')
for row in rows:
    print(row.get_text())
Upvotes: 0
Views: 54
Reputation: 195408
The data you see is loaded with JavaScript from a different URL, so BeautifulSoup doesn't see it in the HTML
that requests downloads (you can find the URL in the Network tab of the Firefox/Chrome developer tools).
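One way to confirm this without a browser is to look for the table in the HTML the server actually returns. Here is a minimal sketch using only the standard library. The two HTML strings are made-up stand-ins (not the real page), and `TableIdCollector` and `static_table_ids` are names I invented for illustration; in practice you would pass in `r.text` from the question's code.

```python
from html.parser import HTMLParser


class TableIdCollector(HTMLParser):
    """Collect the id attribute of every <table> start tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.table_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # attrs is a list of (name, value) pairs; id is None when absent
            self.table_ids.append(dict(attrs).get("id"))


def static_table_ids(html):
    """Return the ids of all tables present in the given HTML text."""
    parser = TableIdCollector()
    parser.feed(html)
    return parser.table_ids


# Hypothetical stand-in for what the server sends before any script runs
static_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
# Hypothetical stand-in for the page after JavaScript has filled in the table
rendered_html = '<html><body><table id="datatable"><tr><th>09-08-2022</th></tr></table></body></html>'

print(static_table_ids(static_html))    # []
print(static_table_ids(rendered_html))  # ['datatable']
```

When the table is missing like this, soup.find('table', {"id": "datatable"}) returns None, so the following a.find_all('th') raises an AttributeError, which is probably the error in the question. Requesting the data URL directly avoids the problem: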
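import requests
import pandas as pd

# JSON endpoint the page calls in the background (visible in the browser's Network tab)
api_url = "https://www.nordpoolgroup.com/api/marketdata/page/53"
params = {"currency": ",EUR,EUR,EUR"}
data = requests.get(api_url, params=params).json()

vals = []
index = []
for r in data["data"]["Rows"]:
    # Row names appear to contain non-breaking spaces (as "&nbsp;" or U+00A0);
    # normalize both forms to plain spaces
    index.append(r["Name"].replace("&nbsp;", " ").replace("\xa0", " "))
    vals.append([d["Value"] for d in r["Columns"]])

# Column names (the dates) are taken from the first row
columns = [c["Name"] for c in data["data"]["Rows"][0]["Columns"]]
df = pd.DataFrame(vals, index=index, columns=columns)
print(df.to_markdown())  # to_markdown() needs the "tabulate" package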
Prints:
| | 09-08-2022 | 08-08-2022 | 07-08-2022 | 06-08-2022 | 05-08-2022 | 04-08-2022 | 03-08-2022 | 02-08-2022 |
---|---|---|---|---|---|---|---|---|
00 - 01 | 330,76 | 310,08 | 277,89 | 361,76 | 348,80 | 450,30 | 382,25 | 440,98 |
01 - 02 | 298,92 | 270,38 | 232,85 | 315,05 | 358,18 | 414,26 | 369,62 | 390,10 |
02 - 03 | 298,95 | 280,93 | 218,28 | 308,59 | 189,96 | 375,07 | 342,09 | 382,10 |
03 - 04 | 289,20 | 157,26 | 206,40 | 276,45 | 156,48 | 348,49 | 326,30 | 359,89 |
04 - 05 | 295,75 | 273,79 | 200,47 | 267,76 | 156,45 | 345,30 | 329,96 | 352,69 |
05 - 06 | 320,01 | 310,38 | 205,23 | 254,08 | 390,05 | 375,07 | 369,25 | 393,03 |
06 - 07 | 382,92 | 462,89 | 205,00 | 290,83 | 449,91 | 447,98 | 441,82 | 457,04 |
07 - 08 | 414,98 | 798,32 | 206,29 | 315,03 | 468,00 | 480,10 | 455,34 | 479,90 |
08 - 09 | 426,22 | 861,14 | 162,27 | 309,56 | 483,91 | 500,04 | 450,58 | 478,90 |
09 - 10 | 383,99 | 574,09 | 142,70 | 233,32 | 448,10 | 502,79 | 342,81 | 418,47 |
10 - 11 | 329,93 | 443,96 | 100,00 | 402,27 | 428,79 | 502,76 | 317,01 | 380,78 |
11 - 12 | 327,52 | 750,03 | 87,69 | 347,03 | 406,69 | 502,72 | 548,38 | 373,51 |
12 - 13 | 317,76 | 792,19 | 409,31 | 157,86 | 383,10 | 510,01 | 461,72 | 333,18 |
13 - 14 | 300,00 | 797,98 | 78,65 | 98,74 | 344,57 | 502,71 | 456,51 | 347,48 |
14 - 15 | 325,30 | 447,93 | 79,80 | 157,81 | 300,07 | 510,00 | 594,30 | 336,60 |
15 - 16 | 294,46 | 478,96 | 87,11 | 119,40 | 182,36 | 484,14 | 519,35 | 295,17 |
16 - 17 | 272,94 | 466,93 | 103,50 | 123,00 | 233,05 | 438,90 | 374,10 | 387,10 |
17 - 18 | 348,20 | 479,64 | 199,42 | 292,79 | 573,03 | 362,09 | 446,14 | 340,74 |
18 - 19 | 425,05 | 861,11 | 301,67 | 223,44 | 520,00 | 495,62 | 506,04 | 445,97 |
19 - 20 | 469,80 | 797,99 | 393,30 | 302,61 | 455,00 | 514,54 | 541,54 | 511,75 |
20 - 21 | 469,72 | 495,22 | 421,99 | 355,98 | 323,95 | 504,46 | 547,47 | 521,84 |
21 - 22 | 428,67 | 455,00 | 419,92 | 334,48 | 430,34 | 495,58 | 500,08 | 521,10 |
22 - 23 | 406,45 | 429,15 | 401,86 | 369,08 | 429,91 | 454,52 | 501,77 | 489,70 |
23 - 00 | 343,31 | 391,20 | 359,42 | 324,98 | 386,55 | 397,02 | 453,80 | 443,88 |
Min | 272,94 | 157,26 | 78,65 | 98,74 | 156,45 | 345,30 | 317,01 | 295,17 |
Max | 469,80 | 861,14 | 421,99 | 402,27 | 573,03 | 514,54 | 594,30 | 521,84 |
Avg | 354,20 | 516,11 | 229,21 | 272,58 | 368,64 | 454,77 | 440,76 | 411,75 |
Peak | 351,76 | 646,00 | 178,79 | 230,65 | 396,56 | 485,53 | 463,21 | 387,47 |
Off-peak 1 | 328,94 | 358,00 | 219,05 | 298,69 | 314,73 | 404,57 | 377,08 | 406,97 |
Off-peak 2 | 412,04 | 442,64 | 400,80 | 346,13 | 392,69 | 462,90 | 500,78 | 494,13 |
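Note that the values in the table are strings with a decimal comma ("330,76"), so they need converting before you can do any arithmetic on them. A minimal sketch of one way to do that follows. `parse_decimal_comma` is a name I made up, and the handling of spaces as thousands separators and of non-numeric cells is an assumption on my part, since neither shows up in the output above.

```python
def parse_decimal_comma(text):
    """Convert a string such as "330,76" to a float, or return None if it is not a number."""
    # Assumption: spaces (ordinary or non-breaking) may be thousands separators.
    # Remove them, then switch the decimal comma to a dot.
    cleaned = text.replace("\xa0", "").replace(" ", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        # Assumption: anything that is not numeric is treated as missing
        return None


# Values copied from the first three rows of the 09-08-2022 column above
column = ["330,76", "298,92", "298,95"]
prices = [parse_decimal_comma(v) for v in column]
print(prices)       # [330.76, 298.92, 298.95]
print(max(prices))  # 330.76
```

On the DataFrame itself, df.apply(lambda col: col.map(parse_decimal_comma)) should apply the same conversion to every cell.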
Upvotes: 1