Reputation: 3344
I have two tibbles, each with up to 4 columns. Each column name is either common to both or missing from one of them. I need to combine these into one tibble with two rows, with NA in columns where a value is missing. I need to do this generically, so that it works with more or fewer missing columns. Here's the code that produces the tibbles from two example web pages:
library(tidyverse)
library(htmltab)
read_results <- function(filename) {
  doc <- read_file(filename)
  df <- as_tibble(htmltab(doc=doc, which="//table[@id='results']"))
  colnames(df) <- c("pos", "name", "time", "age_cat", "age_grade", "gender", "gender_pos", "note", "total_runs")
  tib = t(as_tibble(df) %>% group_by(substr(note,1,12)) %>% summarise(number=n()))
  colnames(tib) <- as.character(unlist(tib[1,]))
  tib = tib[-1,]
  r <- t(tib)
  return(r)
}
# saved from http://www.parkrun.org.uk/henleyonthames/results/weeklyresults/?runSeqNumber=2
r2 = read_results("results _ henleyonthames parkrun_2.html")
# saved from http://www.parkrun.org.uk/henleyonthames/results/weeklyresults/?runSeqNumber=4
r4 = read_results("results _ henleyonthames parkrun_4.html")
Now r2 and r4 contain:
> r2
     First Timer! New PB! PB stays at  <NA>
[1,] "58"         "11"    " 3"         " 4"
> r4
     First Timer! New PB! PB stays at
[1,] "62"         "16"    "11"
and I'd like to construct t_all to be:
First Timer! New PB! PB stays at  <NA>
58           11      3            4
62           16      11           0
Upvotes: 0
Views: 3213
Reputation: 5898
Your problem is that one of the columns of r2 has NA as its name, so most functions that combine matrix-like objects by column name will fail. To fix it, add this line to your function, after the header row has been dropped (at that point tib is a named character vector, so names() applies):
names(tib)[is.na(names(tib))] <- "Blank"
library(tidyverse)
library(htmltab)
read_results <- function(filename) {
  doc <- read_file(filename)
  df <- as_tibble(htmltab(doc=doc, which="//table[@id='results']"))
  colnames(df) <- c("pos", "name", "time", "age_cat", "age_grade", "gender", "gender_pos", "note", "total_runs")
  tib = t(as_tibble(df) %>% group_by(substr(note,1,12)) %>% summarise(number=n()))
  colnames(tib) <- as.character(unlist(tib[1,]))
  tib = tib[-1,]
  names(tib)[is.na(names(tib))] <- "Blank" ## New Line
  r <- t(tib)
  return(r)
}
# saved from http://www.parkrun.org.uk/henleyonthames/results/weeklyresults/?runSeqNumber=2
r2 = read_results("results _ henleyonthames parkrun_2.html")
# saved from http://www.parkrun.org.uk/henleyonthames/results/weeklyresults/?runSeqNumber=4
r4 = read_results("results _ henleyonthames parkrun_4.html")
dplyr::bind_rows(as_tibble(r2), as_tibble(r4))
# A tibble: 2 × 4
  `First Timer!` `New PB!` `PB stays at ` Blank
  <chr>          <chr>     <chr>          <chr>
1 58             11        3              4
2 62             16        11             <NA>
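If you don't have the saved pages to hand, the same idea can be tried on toy data. This is only a sketch: the two matrices below are hypothetical stand-ins for r2 and r4, built by hand from the values printed in the question, and fix_na_names is a made-up helper name, not something from dplyr or tibble. The last step assumes dplyr 1.0 or later (for across()) and that you want a missing category counted as 0, as in the desired output in the question, rather than left as NA.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for r2 and r4 (values copied from the question):
# one-row character matrices whose column names are the note categories.
r2 <- matrix(c("58", "11", " 3", " 4"), nrow = 1,
             dimnames = list(NULL, c("First Timer!", "New PB!", "PB stays at ", NA)))
r4 <- matrix(c("62", "16", "11"), nrow = 1,
             dimnames = list(NULL, c("First Timer!", "New PB!", "PB stays at ")))

# Hypothetical helper: give any NA column name a placeholder so that
# columns can be matched by name when the rows are bound together.
fix_na_names <- function(m, placeholder = "Blank") {
  colnames(m)[is.na(colnames(m))] <- placeholder
  m
}

# Bind by column name; a column absent from one input is filled with NA.
t_all <- bind_rows(as_tibble(fix_na_names(r2)), as_tibble(fix_na_names(r4)))

# Optional: convert the counts to integers and treat a missing category as 0.
t_counts <- t_all %>%
  mutate(across(everything(), ~ coalesce(as.integer(.x), 0L)))
```

Because the helper only touches NA names, it can be applied to every result before binding, whether or not that particular page has an unnamed category.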
Upvotes: 3