njmcd

Reputation: 71

Remove whitespace values from a data frame in R?

I am trying to clean a data frame that I scraped from the web by removing all of the null values. However, the "null" values are actually whitespace strings like " ". Here is my code:

library(httr)  # provides GET()
library(XML)   # provides htmlParse() and readHTMLTable()

url1 <- 'https://www.pro-football-reference.com/draft/2019-combine.htm'

browseURL(url1)  # optional: opens the page in a browser

get_pfr_HTML_file1 <- GET(url1)

combine.parsed <- htmlParse(get_pfr_HTML_file1)

page.tables1 <- readHTMLTable(combine.parsed, stringsAsFactors = FALSE)

data2019 <- data.frame(page.tables1[1])

Please let me know how I could clean data2019.

Upvotes: 1

Views: 62

Answers (1)

akrun

Reputation: 887571

With base R, you can use rowSums on a logical matrix to count the blank ("") cells in each row, then use the resulting logical vector as a row index to keep only the rows with no blanks:

data2019[!rowSums(data2019 == "") > 0,]

data2019 == ""                 # returns a logical matrix
rowSums(data2019 == "")        # row-wise count of blank elements
rowSums(data2019 == "") > 0    # TRUE for rows with at least one blank
!rowSums(data2019 == "") > 0   # negated: TRUE when all values in a row are non-blank
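
Note that the comparison with "" only matches truly empty strings. If the scraped cells contain one or more spaces (as in the " " example from the question) or NA, a trimmed comparison may be safer. Below is a minimal sketch of that variant; the toy data frame df and its values are made up for illustration and only stand in for data2019, and it assumes all columns are character.

```r
# Hypothetical stand-in for data2019 (values invented for illustration)
df <- data.frame(
  Player = c("A", "B", "C", "D"),
  Pos    = c("QB", " ", "WR", "RB"),
  Wt     = c("210", "190", "", NA),
  stringsAsFactors = FALSE
)

# Strip leading/trailing whitespace from every column
trimmed <- as.data.frame(lapply(df, trimws), stringsAsFactors = FALSE)

# A cell is treated as blank if it is NA or empty after trimming
is_blank <- is.na(trimmed) | trimmed == ""

# Keep only the rows that contain no blank cells
clean <- df[rowSums(is_blank) == 0, , drop = FALSE]
```

To apply the same idea to the scraped table, replace df with data2019. Using rowSums(is_blank) == 0 avoids the negation and still selects the rows where every value is non-blank.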

Upvotes: 2
