Reputation: 1
Good evening, guys!
I am trying to get the balance sheet, income statement, and cash flow statement for AMZN from Yahoo Finance (https://finance.yahoo.com/quote/AMZN/financials).
This is the code that I already have:
# Load the rvest, stringr, and dplyr libraries
library(rvest)
library(stringr)
library(dplyr)
# Yahoo Finance financials page for AMZN
url <- "https://finance.yahoo.com/quote/AMZN/financials"
# Read the page and keep the text of the 48th <script> element,
# which is where the video found the embedded JSON data
BS <- read_html(url) %>%
  html_nodes("script") %>%
  html_text() %>%
  .[48]
# Trim the script text down to the JSON object that starts just before "context"
start <- gregexpr("context", BS)[[1]][1] - 2
end <- nchar(BS) - 12
BS <- substr(BS, start, end)
# Parse the JSON and inspect the data stores
BS <- jsonlite::fromJSON(BS)
BS$context$dispatcher$stores
I followed a YouTube video, but I saw that the Yahoo website changed after 2023 and this code doesn't work anymore.
In the video, this code retrieved all the balance sheet and cash flow data.
But the output that I receive is this: [image: Code output]
Could someone please help me? I know that there are some similar questions, but the website changed after 2023 and their answers don't work anymore.
Upvotes: 0
Views: 181
Reputation: 2243
I have been able to extract the table with the following code:
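library(RDCOMClient)
url <- "https://finance.yahoo.com/quote/AMZN/financials"
# Drive Internet Explorer through COM (Windows only)
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate(url)
# Give the page time to load and render
Sys.sleep(5)
doc <- IEApp$document()
# Optional: inspect the rendered HTML
doc$body()$innerHTML()
# Grab the financials container and split its text into lines
web_Obj_Table <- doc$getElementByID("mrt-node-Col1-1-Financials")
text_Table <- web_Obj_Table$innerText()
text_Table <- unlist(strsplit(text_Table, "\r\n"))
text_Table <- text_Table[text_Table != ""]
# Drop the first 15 entries, which come before the first table row
text_Table <- text_Table[-c(1 : 15)]
# Reshape into 33 rows of 6 fields: the line item plus five value columns
matrix(text_Table, nrow = 33, ncol = 6, byrow = TRUE)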
[1,] "Total Revenue" "574,785,000" "574,785,000" "513,983,000" "469,822,000" "386,064,000"
[2,] "Cost of Revenue" "480,980,000" "480,980,000" "446,343,000" "403,507,000" "334,564,000"
[3,] "Gross Profit" "93,805,000" "93,805,000" "67,640,000" "66,315,000" "51,500,000"
[4,] "Operating Expense" "56,953,000" "56,953,000" "55,392,000" "41,436,000" "28,601,000"
[5,] "Operating Income" "36,852,000" "36,852,000" "12,248,000" "24,879,000" "22,899,000"
[6,] "Net Non Operating Interest Income Expense" "-233,000" "-233,000" "-1,378,000" "-1,361,000" "-1,092,000"
[7,] "Other Income Expense" "938,000" "938,000" "-16,806,000" "14,633,000" "2,371,000"
[8,] "Pretax Income" "37,557,000" "37,557,000" "-5,936,000" "38,151,000" "24,178,000"
[9,] "Tax Provision" "7,120,000" "7,120,000" "-3,217,000" "4,791,000" "2,863,000"
[10,] "Earnings from Equity Interest Net of Tax" "-12,000" "-12,000" "-3,000" "4,000" "16,000"
[11,] "Net Income Common Stockholders" "30,425,000" "30,425,000" "-2,722,000" "33,364,000" "21,331,000"
[12,] "Diluted NI Available to Com Stockholders" "30,425,000" "30,425,000" "-2,722,000" "33,364,000" "21,331,000"
[13,] "Basic EPS" "1.95" "2.95" "-0.27" "3.30" "2.13"
[14,] "Diluted EPS" "1.91" "2.90" "-0.27" "3.24" "2.09"
[15,] "Basic Average Shares" "10,270,000" "10,304,000" "10,189,000" "10,120,000" "10,000,000"
[16,] "Diluted Average Shares" "10,394,500" "10,492,000" "10,189,000" "10,300,000" "10,200,000"
[17,] "Total Operating Income as Reported" "36,852,000" "36,852,000" "12,248,000" "24,879,000" "22,899,000"
[18,] "Total Expenses" "537,933,000" "537,933,000" "501,735,000" "444,943,000" "363,165,000"
[19,] "Net Income from Continuing & Discontinued Operation" "30,425,000" "30,425,000" "-2,722,000" "33,364,000" "21,331,000"
[20,] "Normalized Income" "29,521,380" "29,521,380" "10,128,140" "20,551,997" "19,189,626"
[21,] "Interest Income" "2,949,000" "2,949,000" "989,000" "448,000" "555,000"
[22,] "Interest Expense" "3,182,000" "3,182,000" "2,367,000" "1,809,000" "1,647,000"
[23,] "Net Interest Income" "-233,000" "-233,000" "-1,378,000" "-1,361,000" "-1,092,000"
[24,] "EBIT" "40,739,000" "40,739,000" "-3,569,000" "39,960,000" "25,825,000"
[25,] "EBITDA" "89,402,000" "89,402,000" "38,352,000" "74,393,000" "51,076,000"
[26,] "Reconciled Cost of Revenue" "480,980,000" "480,980,000" "446,343,000" "403,507,000" "334,564,000"
[27,] "Reconciled Depreciation" "48,663,000" "48,663,000" "41,921,000" "34,433,000" "25,251,000"
[28,] "Net Income from Continuing Operation Net Minority Interest" "30,425,000" "30,425,000" "-2,722,000" "33,364,000" "21,331,000"
[29,] "Total Unusual Items Excluding Goodwill" "1,115,000" "1,115,000" "-16,266,000" "14,652,000" "2,429,000"
[30,] "Total Unusual Items" "1,115,000" "1,115,000" "-16,266,000" "14,652,000" "2,429,000"
[31,] "Normalized EBITDA" "88,287,000" "88,287,000" "54,618,000" "59,741,000" "48,647,000"
[32,] "Tax Rate for Calcs" "0" "0" "0" "0" "0"
[33,] "Tax Effect of Unusual Items" "211,380" "211,380" "-3,415,860" "1,839,997" "287,626"
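The matrix above holds everything as character strings. If you need numbers, something along these lines should work. This is only a minimal sketch on two rows copied from the output above. The column names (item, ttm, fy1 to fy4) are placeholders I made up, since the scraped text carries no headers here, and to_number is a hypothetical helper, not part of any package.

```r
# Two rows copied from the scraped matrix above, used as sample input
m <- matrix(
  c("Total Revenue", "574,785,000", "574,785,000", "513,983,000", "469,822,000", "386,064,000",
    "Pretax Income", "37,557,000", "37,557,000", "-5,936,000", "38,151,000", "24,178,000"),
  ncol = 6, byrow = TRUE
)

# Assumed column names; adjust them to the headers shown on the page
colnames(m) <- c("item", "ttm", "fy1", "fy2", "fy3", "fy4")

# Hypothetical helper: strip thousands separators, then convert to numeric
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Convert every value column, keeping the line item as text
values <- as.data.frame(m[, -1, drop = FALSE], stringsAsFactors = FALSE)
values[] <- lapply(values, to_number)
financials <- data.frame(item = m[, "item"], values, stringsAsFactors = FALSE)

str(financials)
```

With the full result you would pass the 33 x 6 matrix instead of m. Any entry that is not a plain number becomes NA with a warning, so it is worth checking for NAs afterwards.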
Upvotes: 1