Reputation: 726
I have a data frame, individual_dets, that has a few rows that I want to get rid of:
area         year   Temp
BON-AR-S2    2016   1.853
BON-W-S5     2018   2.2
HFX 102      2018   1.2
NSTR 525     2017   2.0
NSTR 787     2017   2.3
HFX 101      2016   1.9
Boca Raton   2015   20
Shutter      2015   21
Shutter      2017   1.3
Ketch        2017   1.3
Ketch        2018   1.9
I want to keep only the rows whose area starts with NSTR, HFX, or Boca Raton. How do I keep just these, or how do I get rid of the rest? I've tried using multiples of this:
individual_dets$area = filter(individual_dets, area != "BON-AR-S2")
But that outputs a completely different data frame without my original data. I've also tried
individual_dets = filter(individual_dets, area != "BON-AR-S2")
but nothing happens.
Does anybody know how to fix this?
Upvotes: 0
Views: 29
Reputation: 804
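When testing a string column against a set of values, use %in% rather than ==. (== is fine for a single exact value, but %in% also handles several values at once.)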
If you want the data.frame without certain rows, don't assign the result to individual_dets$area = but to a whole object, e.g. df = . The former tries to overwrite a single column of your table with the filtered data.frame; the latter creates a new data.frame.
You can use subset (base R) instead of filter (requires dplyr).
Putting it all together:
df = subset(individual_dets, !area %in% "BON-AR-S2")
Edit: as pointed out by @JBGruber, use subset(individual_dets, !grepl("BON", area)) if you want to be more general in string-finding.
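For anyone who wants to try both variants, here is a minimal base R sketch. The data frame is rebuilt from the table in the question; the names unwanted, kept_exact and no_bon are just illustrative and not from the original post.

```r
# Rebuild the example data from the question
individual_dets <- data.frame(
  area = c("BON-AR-S2", "BON-W-S5", "HFX 102", "NSTR 525", "NSTR 787",
           "HFX 101", "Boca Raton", "Shutter", "Shutter", "Ketch", "Ketch"),
  year = c(2016, 2018, 2018, 2017, 2017, 2016, 2015, 2015, 2017, 2017, 2018),
  Temp = c(1.853, 2.2, 1.2, 2.0, 2.3, 1.9, 20, 21, 1.3, 1.3, 1.9),
  stringsAsFactors = FALSE
)

# Exact matching: drop rows whose area equals any of the listed values
unwanted <- c("BON-AR-S2", "BON-W-S5", "Shutter", "Ketch")
kept_exact <- subset(individual_dets, !area %in% unwanted)

# Partial matching: drop rows whose area contains "BON" anywhere
no_bon <- subset(individual_dets, !grepl("BON", area))

print(kept_exact)
print(no_bon)
```

The %in% version needs every unwanted value spelled out in full, while the grepl version catches any area containing the given text, so which one fits depends on how predictable the labels are.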
Upvotes: 0
Reputation: 12420
!= and == only work on exact matches. If you want to match part of the string you need grepl. You also say the lines should start with NSTR, HFX, or Boca. The start of the string can be expressed with the regex ^. For more than one pattern you can use |, which is the regex for "or":
individual_dets = filter(individual_dets, grepl("^NSTR|^HFX|^Boca", area))
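To see what the ^ anchor and | are doing, it may help to try the pattern on a plain character vector first. This is only a sketch: the vector areas is a shortened copy of the question's values, the value "old HFX 102" is made up to show the effect of the anchor, and the grouped form "^(NSTR|HFX|Boca)" is just an equivalent way of writing the pattern above.

```r
# A few of the area values from the question
areas <- c("BON-AR-S2", "HFX 102", "NSTR 525", "Boca Raton", "Shutter", "Ketch")

# Grouped alternation: same meaning as "^NSTR|^HFX|^Boca"
pattern <- "^(NSTR|HFX|Boca)"

# Logical vector: TRUE where the string starts with one of the prefixes
keep <- grepl(pattern, areas)
print(areas[keep])

# Hypothetical value showing why the anchor matters:
# without ^ the pattern matches anywhere in the string
unanchored <- grepl("HFX", "old HFX 102")
anchored <- grepl("^HFX", "old HFX 102")
print(c(unanchored = unanchored, anchored = anchored))
```

The same logical vector works without dplyr as individual_dets[grepl(pattern, individual_dets$area), ]. Note that grepl is case-sensitive by default, so "^boca" would not match "Boca Raton" unless ignore.case = TRUE is passed.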
Upvotes: 2