Reputation: 111
I have a data frame named titanic with 2021 rows of passengers on the Titanic and specific characteristics of each passenger:
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No
...
I want to create a function that has multiple arguments that looks something like this:
f1 <- function(sex, age, class, survived){
...
}
where each argument is a criterion for selecting passengers. As an example, I want to be able to call the function such that
f1("Female", "Child","3rd", "Yes")
returns
Class Sex Age Survived
1534 3rd Female Child Yes
1535 3rd Female Child Yes
1536 3rd Female Child Yes
1537 3rd Female Child Yes
1538 3rd Female Child Yes
So far, I have hard-coded it, using an if/else chain to cover all of the possibilities.
function.q6.1 <- function(sex,age,class,survival){
if(sex == "Male" & age == "Child" & class == "3rd" & survival == "No"){
subset(titanic, Sex == "Male" & Age == "Child" & Class == "3rd" & Survived == "No")
}
else if(sex == "Female" & age == "Child" & class == "3rd" & survival == "No"){
subset(titanic, Sex == "Female" & Age == "Child" & Class == "3rd" & Survived == "No")
}
else if(sex == "Male" & age == "Adult" & class == "3rd" & survival == "No"){
subset(titanic, Sex == "Male" & Age == "Adult" & Class == "3rd" & Survived == "No")
}
...
}
I want to know if there is a more efficient way of doing this. Thank you ahead of time.
Upvotes: 0
Views: 514
Reputation: 270075
This assumes that the first argument is the data frame and that the remaining arguments are values for the columns, either unnamed and in the order in which the columns appear in the data frame, or named.
There can be fewer arguments than columns. Unnamed arguments are matched against the first columns of the data frame, as many as there are arguments; named arguments are matched against the columns with those names. The arguments after the data frame must be either all named or all unnamed. If only the data frame is passed, with no other arguments, then NULL is returned invisibly.
If there is a non-zero number of arguments after the data frame, we get their names or, if they are unnamed, use the first n column names, where n is the number of arguments after the data frame. We then remove rows containing NAs from dat, on the assumption that those rows cannot match. mapply compares successive columns to successive argument values, returning a logical matrix. apply then reduces that matrix to one logical value per row, and we subscript dat by the result.
We use the data frame shown reproducibly in the Note at the end in the test calls.
f1 <- function(dat, ...) {
if (n <- ...length()) {
if (is.null(nms <- ...names())) nms <- head(names(dat), n)
dat <- na.omit(dat)
dat[apply(mapply(`==`, dat[nms], list(...)), 1, all), ]
}
}
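To make the mapply/apply step easier to follow, here is a minimal sketch that runs the same two calls outside the function. The data frame toy and the variables vals, m and keep are made up for illustration and are not part of the answer's code.

```r
# Hypothetical toy data, not the asker's full data set
toy <- data.frame(Class = c("3rd", "3rd", "1st"),
                  Sex   = c("Male", "Female", "Male"))

vals <- list("3rd", "Male")              # values to match, one per column
nms  <- head(names(toy), length(vals))   # first n column names, as in f1

# mapply compares column i with value i: one logical column per criterion
m <- mapply(`==`, toy[nms], vals)
print(m)

# apply(m, 1, all) collapses each row of the matrix to a single TRUE/FALSE
keep <- apply(m, 1, all)
print(toy[keep, ])
```

Only rows where every comparison is TRUE survive the final subscript, which is why a row containing a single non-matching value is dropped.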
Now we run some tests:
f1(dat, "3rd", "Male", "Child", "No")
## Class Sex Age Survived
## 1 3rd Male Child No
## 2 3rd Male Child No
## 3 3rd Male Child No
## 4 3rd Male Child No
## 5 3rd Male Child No
## 6 3rd Male Child No
f1(dat, "3rd", "Female", "Child", "No")
## [1] Class Sex Age Survived
## <0 rows> (or 0-length row.names)
f1(dat, "3rd")
## Class Sex Age Survived
## 1 3rd Male Child No
## 2 3rd Male Child No
## 3 3rd Male Child No
## 4 3rd Male Child No
## 5 3rd Male Child No
## 6 3rd Male Child No
f1(BOD, 1, 8.3) # BOD is built into R
## Time demand
## 1 1 8.3
f1(BOD, demand = 8.3)
## Time demand
## 1 1 8.3
Note
Lines <- "
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No"
dat <- read.table(text = Lines)
Allow fewer arguments than columns and allow arguments to be named.
Upvotes: 2
Reputation: 79194
Update:
Store your columns and your conditions in one vector each, and then apply the function to the data frame:
library(dplyr)
library(stringr)
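f1 <- c("Female", "Child", "3rd", "Yes") # conditions (assumed here: the values from the question's example call)
f1 <- paste(f1, collapse = "|") # one regular expression that matches any of the conditions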
cols <- c("Sex", "Age", "Class", "Survived")
my_function <- function(df){
df %>%
select(all_of(cols)) %>%
filter(if_all(everything(), ~str_detect(.,f1))
)
}
my_function(df)
First answer:
Maybe another strategy could be:
library(dplyr)
library(stringr)
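f1 <- c("Female", "Child", "3rd", "Yes") # conditions (assumed here: the values from the question's example call)
f1 <- paste(f1, collapse = "|") # one regular expression that matches any of the conditions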
my_function <- function(df){
df %>%
select(Sex, Age, Class, Survived) %>%
filter(if_all(everything(), ~str_detect(.,f1))
)
}
my_function(df)
output:
Sex Age Class Survived
1534 Female Child 3rd Yes
1535 Female Child 3rd Yes
1536 Female Child 3rd Yes
1537 Female Child 3rd Yes
1538 Female Child 3rd Yes
Upvotes: 1
Reputation: 1389
#toy dataset
set.seed(1912)
titanic <- data.frame(class = sample(c("1st", "2nd", "3rd"), 100, replace = TRUE),
                      sex = sample(c("Male", "Female"), 100, replace = TRUE),
                      age = sample(c("Child", "Adult"), 100, replace = TRUE),
                      survival = sample(c("Yes", "No"), 100, replace = TRUE)
)
f1 <- function(sex, age, class, survival) {
  titanic[titanic$class == class & titanic$sex == sex & titanic$age == age & titanic$survival == survival, ]
}
f1("Female", "Child","3rd", "Yes")
class sex age survival
11 3rd Female Child Yes
15 3rd Female Child Yes
38 3rd Female Child Yes
71 3rd Female Child Yes
85 3rd Female Child Yes
94 3rd Female Child Yes
Upvotes: 1
Reputation: 16998
If you are using a data.frame like the one shown in your question, you could use
library(dplyr)
my_filter <- function(sex, age, class, survived) {
df %>%
filter(Sex == sex, Age == age, Class == class, Survived == survived)
}
Now my_filter("Female", "Child","3rd", "Yes")
returns
Class Sex Age Survived
7 3rd Female Child Yes
8 3rd Female Child Yes
9 3rd Female Child Yes
10 3rd Female Child Yes
11 3rd Female Child Yes
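The function above reads df from the calling environment and relies on the column names (Sex) differing from the argument names (sex). A possible variant, sketched here under the assumption that you would rather pass the data frame in explicitly, uses the .data and .env pronouns so that columns and arguments stay distinct even if their names coincide. The name my_filter2 and the data frame toy are hypothetical.

```r
library(dplyr)

# Hypothetical variant of my_filter: the data frame is an explicit argument
my_filter2 <- function(data, sex, age, class, survived) {
  data %>%
    filter(.data$Sex == .env$sex,            # .data$ refers to a column
           .data$Age == .env$age,            # .env$ refers to a function argument
           .data$Class == .env$class,
           .data$Survived == .env$survived)
}

# Small made-up data frame, for illustration only
toy <- data.frame(
  Class    = c("3rd", "3rd", "1st"),
  Sex      = c("Female", "Male", "Female"),
  Age      = c("Child", "Child", "Adult"),
  Survived = c("Yes", "No", "Yes")
)

res <- my_filter2(toy, "Female", "Child", "3rd", "Yes")
print(res)
```

Passing the data frame as an argument also makes the function easy to reuse on a subset or on a different table with the same columns.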
Upvotes: 1