Reputation: 343
After I collapse my rows and separate using a semicolon, I'd like to delete the semicolons at the front and back of my string. Multiple semicolons represent blanks in a cell. For example an observation may look as follows after the collapse:
;TX;PA;CA;;;;;;;
I'd like the cell to look like this:
TX;PA;CA
Here is my collapse code:
new_df <- group_by(old_df, unique_id) %>% summarize_each(funs(paste(., collapse = ';')))
If I try to gsub for semicolon it removes all of them. If if I remove the end character it just removes one of the semicolons. Any ideas on how to remove all at the beginning and end, but leaving the ones in between the observations? Thanks.
Upvotes: 6
Views: 1693
Reputation: 92282
The stringi
package allows you to specify patterns which you wish to preserve and trim everything else. If you only have letters there (though you could specify other pattern too), you could simply do
stringi::stri_trim_both(";TX;PA;CA;;;;;;;", "\\p{L}")
## [1] "TX;PA;CA"
Upvotes: 3
Reputation: 17369
use the regular expression ^;+|;+$
x <- ";TX;PA;CA;;;;;;;"
gsub("^;+|;+$", "", x)
The ^
indicates the start of the string, the +
indicates multiple matches, and $
indicates the end of the string. The |
states "OR". So, combined, it's searching for any number of ;
at the start of a string OR any number of ;
at the end of the string, and replace those with an empty space.
Upvotes: 11