Reputation: 150
I scraped the table below, but I can't find a way to clean it up with gsub. Namely, I tried code like:
populous_table$Tax_GDP <- gsub("[:punct:]","",populous_table$Tax_GDP )
but this code doesn't remove the brackets [] in row 7 (Australia).
Can anyone help me?
1 Afghanistan 6.4
2 Albania 22.9
3 Algeria 7.7
4 Angola 5.7
5 Argentina 37.2
6 Armenia 22.0
7 Australia 34.3 [2]
8 Austria 43.4
Upvotes: 1
Views: 50
Reputation: 626689
You may use
populous_table$Tax_GDP <- gsub("\\s*\\[\\d+]","", populous_table$Tax_GDP )
Or, if that [digits] substring is always at the end, add $:
populous_table$Tax_GDP <- gsub("\\s*\\[\\d+]$", "", populous_table$Tax_GDP )
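As a minimal sketch of the anchored variant applied to a whole column: the data frame below is a hypothetical stand-in for populous_table (the Country column name and the three sample rows are assumptions), and the as.numeric() step is an optional extra that the question did not ask for.

```r
# Hypothetical stand-in for the scraped table; only Tax_GDP is named in the question
populous_table <- data.frame(
  Country = c("Armenia", "Australia", "Austria"),
  Tax_GDP = c("22.0", "34.3 [2]", "43.4"),
  stringsAsFactors = FALSE
)

# Drop a trailing footnote marker such as " [2]":
# optional whitespace, "[", one or more digits, "]", end of string
populous_table$Tax_GDP <- gsub("\\s*\\[\\d+]$", "", populous_table$Tax_GDP)

# Optional: convert the cleaned strings to numbers
populous_table$Tax_GDP <- as.numeric(populous_table$Tax_GDP)

print(populous_table$Tax_GDP)
```

Converting after cleaning is a design choice: as.numeric() would return NA (with a warning) for any value that still carries a footnote marker, so it doubles as a quick check that the pattern caught everything.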
The \s*\[\d+] pattern means
\s* - 0+ whitespaces
\[ - a [ char
\d+ - 1+ digits
] - a ] char.
See R demo:
x <- c("1 Afghanistan 6.4", "2 Albania 22.9", "3 Algeria 7.7", "4 Angola 5.7", "5 Argentina 37.2", "Armenia 22.0", "7 Australia 34.3 [2]", "8 Austria 43.4")
gsub("\\s*\\[\\d+]", "", x)
## => [1] "1 Afghanistan 6.4" "2 Albania 22.9" "3 Algeria 7.7"
##    [4] "4 Angola 5.7" "5 Argentina 37.2" "Armenia 22.0"
##    [7] "7 Australia 34.3" "8 Austria 43.4"
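For comparison, here is a short sketch of why a punctuation class is a poor fit here even when written correctly. A POSIX class has to sit inside a bracket expression ([[:punct:]], not [:punct:]), and [[:punct:]] also matches the decimal point. The sample string is the value from row 7 of the question.

```r
x <- "34.3 [2]"

# [[:punct:]] matches ".", "[" and "]", so the decimal point is removed too
too_greedy <- gsub("[[:punct:]]", "", x)

# Targeting only the bracketed footnote leaves the number intact
targeted <- gsub("\\s*\\[\\d+]", "", x)

print(too_greedy)
print(targeted)
```

The targeted pattern is the safer choice for this column, since it removes only the footnote marker and leaves the rest of each value untouched.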
Upvotes: 2