Strip HTML Formatting from R Strings

Question

I'm trying to scrape information from this URL: http://www.sports-reference.com/cbb/boxscores/index.cgi?month=2&day=3&year=2017 and have gotten far enough that I have a string for each game that looks like this:

str <-"Yale
			87
			
				Final
				
			
		Columbia
			78
			 
			
		"

Ideally I'd like to end up with a vector or data frame that looks something like:

str_vec <- c('Yale',87,'Columbia',78)

I've tried a few things that didn't work, such as:

without_n <- gsub(x = str, pattern = '\n')
without_Final <- gsub(x = without_n, pattern = 'Final')
str_vec <- strslpit(x = without_Final, split = '\t')

Thanks in advance for any helpful tips/answers!

Answers (1)
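
A minimal sketch of one possible approach in base R, not necessarily the method from the original answer. It assumes each game string has the shape shown in the question: fields separated by runs of newlines and tabs, with a "Final" status marker between the two teams. The variable names `pieces` and `games` are made up for illustration. Splitting on newlines and tabs only (rather than on all whitespace) is meant to keep multi-word team names intact.

```r
# Example string from the question, written with explicit escapes.
str <- "Yale\n\t\t\t87\n\t\t\t\n\t\t\t\tFinal\n\t\t\t\t\n\t\t\t\n\t\tColumbia\n\t\t\t78\n\t\t\t \n\t\t\t\n\t\t"

# Split on runs of newlines/tabs; strsplit() returns a list, so take element 1.
pieces <- strsplit(str, "[\n\t]+")[[1]]

# Trim stray spaces, then drop empty pieces and the "Final" marker.
pieces <- trimws(pieces)
pieces <- pieces[nzchar(pieces) & pieces != "Final"]

# A character vector: c() cannot mix strings and numbers, so scores stay text.
str_vec <- pieces
print(str_vec)
# [1] "Yale"     "87"       "Columbia" "78"

# Optional: a data frame with one row per team and a numeric score.
# Odd positions are assumed to be team names, even positions scores.
games <- data.frame(
  team  = pieces[c(TRUE, FALSE)],
  score = as.integer(pieces[c(FALSE, TRUE)]),
  stringsAsFactors = FALSE
)
print(games)
```

Note that the attempt in the question fails for reasons unrelated to the string itself: `gsub()` needs a `replacement` argument, and the splitting function is spelled `strsplit()`. If the strings come from a parsed HTML page, it may be simpler to select the team and score nodes separately before extracting text, but that depends on the page structure, which is not shown here.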