Reputation: 11
I am trying to scrape data from a table on fbref. However, the tables contain two header rows, and the subheader ends up as the first row of data. Does anyone know how to skip the first line and use the second line as the table header so that the column data types are preserved? Here is my code:
library(rvest)
library(dplyr)
team_link = "https://fbref.com/en/squads/cff3d9bb/Chelsea-Stats-All-Competitions"
team_page = read_html(team_link)
shooting_table = team_page %>% html_nodes("#all_stats_shooting") %>%
  html_table()
shooting_table = shooting_table[[1]]
Upvotes: 1
Views: 339
Reputation: 7385
You can use the janitor package:
library(janitor)
shooting_table %>%
  row_to_names(1)
Which gives us:
# A tibble: 28 × 23
Player Nation Pos Age `90s` Gls Sh SoT `SoT%` `Sh/90` `SoT/90` `G/Sh` `G/SoT` Dist FK PK
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Edouard M… sn SEN GK 29 34.0 0 0 0 "" 0.00 0.00 "" "" "" 0 0
2 Antonio R… de GER DF 28 33.7 3 48 13 "27.1" 1.42 0.39 "0.06" "0.23" "19.… 0 0
3 Thiago Si… br BRA DF 36 29.4 3 18 5 "27.8" 0.61 0.17 "0.17" "0.60" "10.… 0 0
4 Mason Mou… eng E… MF,FW 22 26.3 11 75 27 "36.0" 2.86 1.03 "0.13" "0.37" "17.… 6 1
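Note that every column in this output is still <chr>, because the header row was read in as data before being promoted. To recover the data types, the columns can be re-guessed after fixing the header. Below is a minimal base-R sketch of that idea, not a definitive solution: the small `raw` data frame is a made-up stand-in for the scraped table (its column names X1..X4 and values are assumptions for illustration), and the header promotion is done by hand so the example runs without janitor.

```r
# Toy stand-in for the scraped table (hypothetical data): the real header
# sits in row 1, so every column was read as character.
raw <- data.frame(
  X1 = c("Player", "Edouard Mendy", "Mason Mount"),
  X2 = c("Age", "29", "22"),
  X3 = c("90s", "34.0", "26.3"),
  X4 = c("Gls", "0", "11"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop that row
# (a by-hand version of what row_to_names(1) does).
names(raw) <- unlist(raw[1, ], use.names = FALSE)
cleaned <- raw[-1, , drop = FALSE]
rownames(cleaned) <- NULL

# Re-guess each column's type from its remaining values; as.is = TRUE
# keeps text columns as character instead of turning them into factors.
cleaned <- utils::type.convert(cleaned, as.is = TRUE)

str(cleaned)
```

With the real table, the same conversion could presumably be appended to the pipeline above, for example shooting_table %>% row_to_names(1) %>% type.convert(as.is = TRUE). Columns that contain non-numeric text (such as Player or Nation) stay character, and blank cells in numeric columns become NA.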
Upvotes: 2