seansteele

Reputation: 749

How to create a custom filter function in R

I need to filter coordinate data that is either inside or outside of a predefined area. I was hoping to write a custom function that would speed up that process. Something that could be inserted inside a pipe like this:

df %>% 
  filter(group == "A",
         outside_area(x_coord,y_coord))

I don't know if that's technically legal, but the idea is to be able to call it somewhere in a dplyr pipe.

Here's some context to make things a little clearer.

# packages
library(dplyr)
library(ggplot2)

# data
set.seed(123)
groups <- c("A","B","C")  # renamed from `list` to avoid masking base::list
df <- tibble(group = sample(groups, 500, replace=TRUE),
             x = runif(500,0,105),
             y = runif(500,0,68))

# plot all the data points
df %>% ggplot(aes(x=x,y=y)) +
  geom_point()

# plot outside an area -- works
df %>% 
  filter(group == "A",
         x <= 88.5 | (x >= 88.5 & y >= 43.2) | (x >= 88.5 & y <= 24.8)) %>% 
  ggplot(aes (x=x, y=y)) +
  geom_point() +
  xlim(0,105) +
  ylim(0,69)

So the function would incorporate this condition:

x <= 88.5 | (x >= 88.5 & y >= 43.2) | (x >= 88.5 & y <= 24.8)

Thanks for your help

Upvotes: 3

Views: 1377

Answers (2)

seansteele

Reputation: 749

I did some more research and here is a solution that can be used within dplyr pipes.

outside_area <- function(.data, x_coord, y_coord) {
  .data %>%
    filter({{x_coord}} <= 88.5 |
             ({{x_coord}} >= 88.5 & {{y_coord}} >= 43.2) |
             ({{x_coord}} >= 88.5 & {{y_coord}} <= 24.8))
}

df %>%
  filter(group == "A") %>%
  outside_area(x,y)
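
For comparison, the call shape from the question (a predicate used directly inside filter()) also works if the function takes the coordinate vectors and returns one TRUE/FALSE per row. The following is only a minimal sketch: the name is_outside_area and the small pts data set are made up for illustration, and the thresholds are the ones from the question.

```r
library(dplyr)

# Hypothetical vectorised predicate: takes two coordinate vectors and
# returns a logical vector of the same length (TRUE = outside the area).
is_outside_area <- function(x, y) {
  x <= 88.5 | (x >= 88.5 & y >= 43.2) | (x >= 88.5 & y <= 24.8)
}

# Made-up data, only to show the call inside filter()
pts <- tibble(group = c("A", "A", "B", "A"),
              x = c(50, 95, 95, 95),
              y = c(30, 30, 50, 60))

# filter() evaluates x and y as columns of pts, so no {{ }} is needed here
pts %>%
  filter(group == "A", is_outside_area(x, y))
```

Because the function only sees plain vectors, it can also be used outside dplyr, for example to build a logical index in base R.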

Upvotes: 0

akrun

Reputation: 887213

We could create a function that returns a logical vector and call it inside filter:

outside_area <- function(dat, col1, col2) {
  dat[[col1]] <= 88.5 |
    (dat[[col1]] >= 88.5 & dat[[col2]] >= 43.2) |
    (dat[[col1]] >= 88.5 & dat[[col2]] <= 24.8)
}

df %>% 
    filter(group == "A", outside_area(., 'x', 'y'))

Output:

# A tibble: 164 x 3
#   group      x      y
#   <chr>  <dbl>  <dbl>
# 1 A      74.8  16.4  
# 2 A      98.2  47.0  
# 3 A      18.2  66.1  
# 4 A       9.06 44.1  
# 5 A      29.7  62.3  
# 6 A      44.1  14.7  
# 7 A      61.7  37.3  
# 8 A      77.0   0.169
# 9 A     100.   54.4  
#10 A      17.9  53.6  
# … with 154 more rows
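
Since the question asks for both the inside and the outside case, one possible variation is to write the inside test once and negate it. This is a sketch under assumptions: inside_area and outside_area_vec are hypothetical names, the default bounds are the values from the question, and the equivalence with the original condition holds for non-missing coordinates.

```r
# Hypothetical helper: TRUE for points strictly inside the area.
# Default bounds are taken from the condition in the question.
inside_area <- function(x, y, x_min = 88.5, y_min = 24.8, y_max = 43.2) {
  x > x_min & y > y_min & y < y_max
}

# Outside is the negation of inside, so points on the boundary count as outside,
# matching the <= and >= comparisons in the original condition.
outside_area_vec <- function(x, y, ...) {
  !inside_area(x, y, ...)
}

# Made-up coordinates, including one point on the x boundary
x <- c(50, 95, 95, 88.5)
y <- c(30, 30, 60, 30)

inside_area(x, y)
outside_area_vec(x, y)
```

Both helpers take vectors, so inside a pipe they could be called as filter(group == "A", inside_area(x, y)) or filter(group == "A", outside_area_vec(x, y)).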

Upvotes: 2
