LFlowers203

Reputation: 23

Using Python to extract data from a chart

I am trying to extract only the rented prices (green dots) from this site, but I can't find where the data is coming from. I want to use BeautifulSoup or Scrapy to do the web scraping. Is the data being loaded as JSON, or how is it appearing on the website? Apologies for such a broad question; I am relatively new to the Python programming language. There are URLs in the page source that may lead to the data, but I can't figure out which. Any push in the right direction would be much appreciated.

Here is the website: https://www.redweek.com/whats-my-timeshare-worth/P5035-wyndham-bonnet-creek-resort/rental-historical

Upvotes: 2

Views: 467

Answers (2)

chitown88

Reputation: 28565

Hamza said where to find it. But to reiterate: when you are on the site, right-click and select "Inspect" (or press Ctrl+Shift+I). In the right-hand panel you'll find it under Network > XHR > Headers (you may need to reload the page once you have the panel open).

Here's the code to turn that JSON into a table:

import pandas as pd
import requests

url = 'https://www.redweek.com/whats-my-timeshare-worth/xhr?resort_id=5035&type=rental&active=0'
headers = {'User-Agent': 'Mozilla/5.0'}
jsonData = requests.get(url, headers=headers).json()

# Map each column position to its label; column 0 is renamed to 'Week'
cols = {idx: each['label'] for idx, each in enumerate(jsonData['cols'])}
cols[0] = 'Week'

# Build one dict per data row, keyed by column label
rows = []
for row in jsonData['rows']:
    temp_row = {}
    for idx, each in enumerate(row['c']):
        temp_row[cols[idx]] = each['v']
    rows.append(temp_row)

df = pd.DataFrame(rows)

# Take the price from 'Rented', falling back to 'Unknown', then drop the per-status columns
df['Price'] = df['Rented'].fillna(df['Unknown'])
df = df.drop(['Not Rented', 'Active posting', 'Rented', 'Unknown'], axis=1)

Output:

print(df)
      Bedrooms   Status  Week  Price
0            1  Unknown    52  120.0
1            1  Unknown    52  130.0
2            1  Unknown    53  120.0
3            1  Unknown     1   60.0
4            1  Unknown     3  140.0
5            1  Unknown     5  100.0
6            1  Unknown    11  170.0
7            1  Unknown    11   90.0
8            1  Unknown    20   90.0
9            1  Unknown    22  130.0
10           1  Unknown    23  100.0
11           1  Unknown    24  100.0
12           1  Unknown    24  180.0
13           1  Unknown    25  100.0
14           1  Unknown    27   90.0
15           1  Unknown    28   90.0
16           1  Unknown    29   90.0
17           1  Unknown    30   90.0
18           1  Unknown    47  100.0
19           1  Unknown    52  100.0
20           1  Unknown     1  140.0
21           1  Unknown    10  140.0
22           1  Unknown    12  130.0
23           1  Unknown    14  100.0
24           1  Unknown    14  160.0
25           1  Unknown    26  110.0
26           1  Unknown    34   90.0
27           1  Unknown    39  140.0
28           1  Unknown    43  160.0
29           1  Unknown    51  100.0
       ...      ...   ...    ...
4035         3   Rented    12  250.0
4036         3   Rented    13  270.0
4037         3   Rented    14  230.0
4038         3   Rented    18  280.0
4039         3   Rented    27  180.0
4040         3   Rented    35   90.0
4041         4   Rented    53  330.0
4042         4   Rented    15  170.0
4043         4   Rented    14  310.0
4044         4   Rented    18  250.0
4045         4   Rented    19  250.0
4046         4   Rented    46  300.0
4047         4   Rented    18  250.0
4048         4   Rented    19  200.0
4049         4   Rented     8  190.0
4050         4   Rented    12  240.0
4051         4   Rented    27  200.0
4052         4   Rented     7  240.0
4053         4   Rented    18  200.0
4054         4   Rented    47  310.0
4055         4   Rented    45  210.0
4056         4   Rented     7  320.0
4057         4   Rented    51  320.0
4058         4   Rented     9  300.0
4059         4   Rented    15  220.0
4060         4   Rented    39  210.0
4061         4   Rented    41  280.0
4062         4   Rented     4  200.0
4063         4   Rented     5  260.0
4064         4   Rented    35  130.0

[4065 rows x 4 columns]
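
Since the question asks for only the rented prices, one more filtering step on the Status column should do it. Here is a minimal sketch of that step. It assumes the table looks like the output above; the `sample` frame is just four rows copied from that output so the snippet runs without hitting the site, and `only_rented` is a hypothetical helper name, not something from pandas.

```python
import pandas as pd

# Four rows copied from the output above, so this runs offline.
sample = pd.DataFrame({
    "Bedrooms": [1, 1, 3, 4],
    "Status": ["Unknown", "Unknown", "Rented", "Rented"],
    "Week": [52, 1, 12, 53],
    "Price": [120.0, 60.0, 250.0, 330.0],
})


def only_rented(df):
    """Return the rows whose Status is 'Rented', with a fresh 0-based index."""
    return df[df["Status"] == "Rented"].reset_index(drop=True)


rented = only_rented(sample)
print(rented)
```

With the real data, you would call `only_rented(df)` on the DataFrame built above instead of on `sample`.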

Upvotes: 1

Hamza Rana

Reputation: 137

I am only going to help you find where the data is coming from. Parsing the JSON is up to you. If you open the Network tab in Chrome Developer Tools, you can see this request:

xhr?resort_id=5035&type=rental&active=0

Now, when you click on that, you will see the Request URL on the right-hand side. This is where the data is coming from:

https://redweek.com/whats-my-timeshare-worth/xhr?resort_id=5035&type=rental&active=0
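
If you want to request that URL from Python, something like the rough sketch below might be a starting point. It uses only the standard library. The function names `build_xhr_url` and `fetch_chart_json` are made up for illustration, and I am assuming that `resort_id` corresponds to the number in the page URL (P5035) and that the other two query parameters can stay as they appear in the request above; I have not verified either. The browser-like User-Agent header is also an assumption, in case the site rejects the default one.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint as shown in the Network tab.
BASE_URL = "https://redweek.com/whats-my-timeshare-worth/xhr"


def build_xhr_url(resort_id, listing_type="rental", active=0):
    """Compose the request URL from its query parameters (hypothetical helper)."""
    query = urlencode({"resort_id": resort_id, "type": listing_type, "active": active})
    return f"{BASE_URL}?{query}"


def fetch_chart_json(resort_id):
    """Download and decode the JSON behind the chart (performs a network request)."""
    # Assumption: a browser-like User-Agent avoids the request being rejected.
    request = Request(build_xhr_url(resort_id), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


url = build_xhr_url(5035)
print(url)
```

Calling `fetch_chart_json(5035)` should then give you the decoded JSON to explore, for example by printing its top-level keys first.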

Upvotes: 3
