Reputation: 3450
https://www.morningstar.com/stocks/xnas/ATVI/price-fair-value
I am trying to extract the total returns data year wise
28.59 13.32 0.36 -12.34 69.68 14.13 93.25 -6.04 76.18 -25.92 3.99
from the above site. I need help regarding how to proceed next.
Option Explicit
Sub genOP()
Dim i As Long, fI As Long
Dim tickeR As String
Dim urlStr As String
Dim ie As New InternetExplorer
With INP
fI = .Range("A" & .Rows.Count).End(xlUp).Row
For i = 2 To fI
tickeR = Trim(.Range("A" & i))
Debug.Print tickeR
'https://www.morningstar.com/stocks/xnas/abmd/price-fair-value'
urlStr = "https://www.morningstar.com/stocks/xnas/" & tickeR & "/price-fair-value"
Debug.Print urlStr
With ie
.Visible = True
.navigate urlStr
While .readyState <> 4
DoEvents
Wend
Stop
.document.getElementsByClassName("annual-data-row ng-scope").innerText
End With
Next i
End With
End Sub
Upvotes: 3
Views: 743
Reputation: 84465
If you observe the web traffic you will see the page does an API xhr request for that chart info which is returned as json.
Examining the request sent we see the following:
The request url itself e.g.
https://api-global.morningstar.com/sal-service/v1/stock/priceFairValue/v2/0P00000053/data?secExchangeList=
The string 0P00000053
is a unique identifier for that ticker; I later refer to this as share_id
(placeholder) and shareId
variable. It is obtained from a request to the actual ticker page.
The query string param has no value and can be ignored. The important headers are highlighted. See key for explanation of importance.
The two red boxed headers uniquely identify a specific ticker uri.
The two blue require access keys provided in a js file.
You can make an initial request to get these access keys; an additional request to get a list of all tickers - I use the nasdaq 100 as source; or provide your own ticker list (examples given of each - comment lines out as required).
Own list use:
tickers = Array("ATVI") ''etc....extend
Nasdaq 100 list use:
tickers = GetNasdaqTickers(xhr, html)
Requests to each ticker page must be made to retrieve the unique identifiers (contentId
for "X-SAL-ContentType" header and shareId
for API url) and then headers updated accordingly:
tickerName = tickers(ticker)
url = Replace$("https://www.morningstar.com/stocks/xnas/{ticker}/price-fair-value", "{ticker}", tickerName)
headersDict("Referer") = url
Set ids = GetContentIdShareId(xhr, url, re) 'Set up correct ids for each indiv request
headersDict("X-SAL-ContentType") = ids("contentId")
The API call is updated during a loop over tickers and the chart info parsed out using a json parser. I would use jsonconverter.bas to parse the json. Install the code from that link in a standard module called JsonConverter
. All required project references are shown at top of code.
For ticker = LBound(tickers) To UBound(tickers)
'other code
nasdaqDict.Add tickerName, GetChartData(xhr, ids("shareId"), headersDict)
Next
The function GetChartData
returns a dictionary which has the chart dates as keys and chart values as values.
Each returned dictionary, for a given ticker, is added to a parent dictionary, nasdaqDict
. nasdaqDict
has the ticker names as keys and associated chart dictionaries as values.
At the end, this parent dictionary is looped and all values written out to sheet by WriteOutDict
.
You can explore the dictionary of dictionaries, nasdaqDict
, here.
VBA Code:
Option Explicit
'VBE > Tools > References:
' Microsoft HTML Object Library
' Microsoft XML ,vn.0 e.g. Microsoft XML ,v6.0
' Microsoft VBScript Regular Expressions n.n e.g. Microsoft VBScript Regular Expressions 5.5
' Microsoft Scripting Runtime
Public Sub GetNasdaq100ChartValues()
Dim re As VBScript_RegExp_55.RegExp, html As mshtml.HTMLDocument, xhr As MSXML2.XMLHTTP60
Dim nasdaqDict As Scripting.Dictionary
Set re = New VBScript_RegExp_55.RegExp
Set html = New mshtml.HTMLDocument
Set xhr = New MSXML2.XMLHTTP60
'##Set-up **************************************************************************************************
Dim headersDict As Scripting.Dictionary
Set headersDict = New Scripting.Dictionary
headersDict.Add "User-Agent", "Mozilla/5.0"
headersDict.Add "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
headersDict.Add "Referer", vbNullString
headersDict.Add "ApiKey", vbNullString
headersDict.Add "X-API-REALTIME-E", vbNullString
headersDict.Add "X-SAL-ContentType", vbNullString
Set headersDict = UpdateHeaders(xhr, re, headersDict)
Set nasdaqDict = New Scripting.Dictionary 'This will be a dictionary of dictionaries with keys as ticker names _
and values as dictionaries containing the associated chart dates as keys and values as values.
Dim ids As Scripting.Dictionary, tickerName As String, tickers(), ticker As Long, url As String
'## This gets all nasdaq tickers (from https://www.cnbc.com/nasdaq-100/) and populates tickers with these.
'## You could instead replace this with a manually supplied list of desired tickers e.g.
tickers = Array("ATVI") ''etc....extend
'tickers = GetNasdaqTickers(xhr, html) ''comment this line out if passing hardcoded ticker values
'##Get info ************************************************************************************************
For ticker = LBound(tickers) To UBound(tickers)
tickerName = tickers(ticker)
url = Replace$("https://www.morningstar.com/stocks/xnas/{ticker}/price-fair-value", "{ticker}", tickerName)
headersDict("Referer") = url
Set ids = GetContentIdShareId(xhr, url, re) 'Set up correct ids for each indiv request
headersDict("X-SAL-ContentType") = ids("contentId")
nasdaqDict.Add tickerName, GetChartData(xhr, ids("shareId"), headersDict) 'make indiv API call for current ticker
Next
WriteOutDict nasdaqDict
End Sub
Public Function UpdateHeaders(ByVal xhr As MSXML2.XMLHTTP60, ByVal re As VBScript_RegExp_55.RegExp, ByVal headersDict As Scripting.Dictionary) As Scripting.Dictionary
Dim s As String, accessKeys As VBScript_RegExp_55.MatchCollection
Dim apiKey As String, apiRealtimeKey As String
With xhr 'Make request to get keys from js file
.Open "GET", "https://www.morningstar.com/assets/quotes/1.3.0/js/sal-components-wrapper.js", False
.send
s = .responseText
End With
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "apigee:""(.*?)""|realtime:""(.*?)""" 'regex pattern to return both api key values
Set accessKeys = .Execute(s)
End With
apiKey = accessKeys.item(0).SubMatches(0)
apiRealtimeKey = accessKeys.item(1).SubMatches(1)
headersDict("ApiKey") = apiKey
headersDict("X-API-REALTIME-E") = apiRealtimeKey
Set UpdateHeaders = headersDict
End Function
Public Function GetNasdaqTickers(ByVal xhr As MSXML2.XMLHTTP60, ByVal html As HTMLDocument) As Variant
Dim tickers As Object, results(), i As Long
With xhr
.Open "GET", "https://www.cnbc.com/nasdaq-100/", False
.send
html.body.innerHTML = .responseText
End With
Set tickers = html.querySelectorAll(".quoteTable a")
ReDim results(0 To tickers.Length - 1)
For i = 0 To tickers.Length - 1
results(i) = tickers.item(i).innerText
Next
GetNasdaqTickers = results
End Function
Public Function GetContentIdShareId(ByVal xhr As MSXML2.XMLHTTP60, ByVal url As String, ByVal re As VBScript_RegExp_55.RegExp) As Scripting.Dictionary
Dim ids As Scripting.Dictionary, s As String
Set ids = New Scripting.Dictionary
With xhr 'Make request to get keys from js file
.Open "GET", url, False
.send
s = .responseText
End With
With re
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "contentType:""(.*?)"",profile" 'regex pattern to get contentId
ids("contentId") = .Execute(s)(0).SubMatches(0)
.Pattern = "byId:{""(.*?)""" 'regex pattern to get shareId
ids("shareId") = .Execute(s)(0).SubMatches(0)
End With
Set GetContentIdShareId = ids
End Function
Public Function GetChartData(ByVal xhr As MSXML2.XMLHTTP60, ByVal shareId As String, ByVal headersDict As Scripting.Dictionary) As Scripting.Dictionary
Dim key As Variant, chartValues As Scripting.Dictionary, i As Long, json As Object
Set chartValues = New Scripting.Dictionary
With xhr
.Open "GET", Replace$("https://api-global.morningstar.com/sal-service/v1/stock/priceFairValue/v2/{share_id}/data", "{share_id}", shareId), False
For Each key In headersDict.keys
.setRequestHeader key, headersDict(key)
Next
.send
Set json = JsonConverter.ParseJson(.responseText)
End With
For i = 1 To json("columnDefs").Count 'collection
chartValues(json("columnDefs")(i)) = json("table")("rows")(2)("datum")(i)
Next
Set GetChartData = chartValues
End Function
Public Sub WriteOutDict(ByVal nasdaqDict As Scripting.Dictionary)
Dim key As Variant, row(), r As Long, headers()
Application.ScreenUpdating = False
headers = nasdaqDict(nasdaqDict.keys(0)).keys 'assumption that charts show for same time period for all tickers
r = 2
With ThisWorkbook.Worksheets("Sheet1")
.Cells(1, 1) = "Ticker"
.Cells(1, 2).Resize(1, UBound(headers) + 1) = headers
For Each key In nasdaqDict.keys
row = nasdaqDict(key).items
.Cells(r, 1) = key
.Cells(r, 2).Resize(1, UBound(row) + 1) = row
r = r + 1
Next
End With
Application.ScreenUpdating = True
End Sub
Sample of results:
Python:
I wrote with python initially in case of interest:
import requests, re
from bs4 import BeautifulSoup as bs
def get_chart_data(share_id):
r = s.get(f'https://api-global.morningstar.com/sal-service/v1/stock/priceFairValue/v2/{share_id}/data', headers=headers).json()
chart_values = dict(zip(r['columnDefs'], r['table']['rows'][1]['datum']))
return chart_values
headers = {
'User-Agent': 'Mozilla/5.0',
'Referer': '',
'ApiKey': '',
'X-API-REALTIME-E': '',
'X-SAL-ContentType': '',
}
p = re.compile(r'apigee:"(.*?)"|realtime:"(.*?)"')
p1 = re.compile(r'contentType:"(.*?)",profile')
p2 = re.compile(r'byId:{"(.*?)"')
with requests.Session() as s:
#set-up
###########################################################################
## This gets all nasdaq tickers and populates tickers with these.
r = s.get('https://www.cnbc.com/nasdaq-100/')
soup = bs(r.content, 'lxml')
tickers = [i.text for i in soup.select('.quoteTable a')]
## you could instead replace the above with a manually supplied list of desired tickers
# tickers = ['tickerA','tickerB']
##########################################################################
r = s.get('https://www.morningstar.com/assets/quotes/1.3.0/js/sal-components-wrapper.js')
access_keys = p.findall(r.text)
api_key = access_keys[0][0]
api_realtime_key = access_keys[1][1]
headers['ApiKey'] = api_key
headers['X-API-REALTIME-E'] = api_realtime_key
results = {}
#specific
for ticker in tickers:
url = f'https://www.morningstar.com/stocks/xnas/{ticker}/price-fair-value'
headers['Referer'] = url
r = s.get(url)
content_id = p1.findall(r.text)[0]
share_id = p2.findall(r.text)[0]
headers['X-SAL-ContentType'] = content_id
results[ticker] = get_chart_data(share_id)
Upvotes: 5
Reputation: 6654
Try:
Sub genOP()
Dim i As Long, fI As Long
Dim tickeR As String
Dim urlStr As String
Dim ie As New InternetExplorer
Dim yr As Object 'Added By Mikku
Dim j As Integer 'Added By Mikku
With INP
fI = .Range("A" & .Rows.Count).End(xlUp).row
For i = 2 To fI
tickeR = Trim(.Range("A" & i))
Debug.Print tickeR
'https://www.morningstar.com/stocks/xnas/abmd/price-fair-value'
urlStr = "https://www.morningstar.com/stocks/xnas/" & tickeR & "/price-fair-value"
Debug.Print urlStr
With ie
.Visible = True
.navigate urlStr
Do While ie.Busy
Application.Wait DateAdd("s", 1, Now)
Loop
Set yr = .document.getElementsByClassName("thead") 'Added By Mikku
Set yr = .document.getElementsByClassName("ng-binding ng-scope") 'Added By Mikku
'Debug.Print yr.Length 'Added By Mikku
For j = 1 To 11 'Added By Mikku
Debug.Print yr(j).innerText & ":" & yr(j + 11).innerText 'Added By Mikku
Next 'Added By Mikku
End With
Next i
End With
End Sub
Demo:
Upvotes: 2