Reputation: 71
I am trying to clean a data frame that I have scraped from the web by removing all of the null values. However, all of the "null" values are actually whitespace values like this " ". Here is my code:
library(httr)  # for GET()
library(XML)   # for htmlParse() and readHTMLTable()

url1 <- 'https://www.pro-football-reference.com/draft/2019-combine.htm'
browseURL(url1)  # open the page in a browser to inspect it
get_pfr_HTML_file1 <- GET(url1)
combine.parsed <- htmlParse(get_pfr_HTML_file1)
page.tables1 <- readHTMLTable(combine.parsed, stringsAsFactors = FALSE)
data2019 <- data.frame(page.tables1[1])
Please let me know how I could clean data2019.
Upvotes: 1
Views: 62
Reputation: 887571
With base R, we can use rowSums on a logical matrix to create a logical vector that selects the rows containing no blank ("") values, and use it as a row index:
data2019[!rowSums(data2019 == "") > 0, ]

Step by step:

data2019 == ""                 # logical matrix: TRUE where a cell is blank
rowSums(data2019 == "")        # row-wise count of blank cells
rowSums(data2019 == "") > 0    # TRUE for rows with at least one blank
!rowSums(data2019 == "") > 0   # negated: TRUE when every value in a row
                               # is non-blank (`!` binds lower than `>`,
                               # so this reads as !(rowSums(...) > 0))
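Since the question notes the "null" cells actually hold whitespace (" ") rather than empty strings, a comparison against "" alone may miss them. One variant (a sketch on a toy data frame with made-up values standing in for data2019) trims whitespace per column before comparing:

```r
# Toy data frame; values are hypothetical, not from the scraped table
df <- data.frame(
  Player = c("A", " ", "C"),
  Pos    = c("QB", "RB", " "),
  stringsAsFactors = FALSE
)

# Trim each column, then drop rows with any whitespace-only cell
clean <- df[!rowSums(sapply(df, trimws) == "") > 0, ]
clean
#   Player Pos
# 1      A  QB
```

sapply(df, trimws) is used instead of trimws(df) because trimws expects a character vector, not a whole data frame; sapply applies it column by column and returns a character matrix, on which the rowSums trick works as before.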
Upvotes: 2