Reputation: 245
I am trying to create a reusable function that calculates conversion, which will be applied to a data frame and return the value (or NA) based on a few conditions on other variables. This is my first attempt at creating a multi-conditional calculation in a function.
It first looks at a var called parentID, which is a categorical var. Only the value 377 is calculated differently. Then it looks at the values of two vars, leads and clicks, to check whether they are at least 1. If not, it returns NA. Then it decides whether leads or sales is greater and makes the calculation based on which is greater.
The calculation is simple: x$sales / x$clicks or x$leads / x$clicks
set_cr <- function(x) {
  if (x$parentID==377) {
    if (x$leads < 1 | x$clicks < 1) {
      return(NA)
    }
    else {
      if (x$leads > x$sales) {
        cr <- x$leads / x$clicks
        return(cr)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }
  else {
    if (x$parentID != 377) {
      if (x$sales < 1 | x$clicks < 1) {
        return(NA)
      }
      else {
        cr <- x$sales / x$clicks
        return(cr)
      }
    }
  }
  return(NA)
}
I am then applying this to a data frame using:
apply(df, 1, set_cr)
I expected to see the values printed in the console, but this has been throwing many errors, and after searching and checking multiple resources I have not been able to debug it. From here I would have used the result to create an x$cr var in the data frame.
A sample data set for this question:
structure(list(parentID = c(377, 377, 311, 322, 333), clicks = c(9078,
78404, 398443, 16142, 111715), sales = c(69, 95, 7191, 146, 33966
), leads = c(500, 0, 500, 0, 33966)), .Names = c("parentID", "clicks",
"sales", "leads"), row.names = c(NA, 5L), class = "data.frame")
parentID clicks sales leads
377 9078 69 500
377 78404 95 0
311 398443 7191 500
322 16142 146 0
333 111715 33966 33966
If there is a better way to share this data example please let me know and I can edit this. I recall a package but couldn't locate it in rseek or on crantastic for reusable data sets.
Thanks in advance.
Upvotes: 1
Views: 572
Reputation: 7337
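Try using x['var'] instead of x$var. apply passes each row to your function as an atomic vector, and the $ operator is not valid for atomic vectors. With that change your function should work: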
set_cr <- function(x) {
  if (x['parentID']==377) {
    if (x['leads'] < 1 || x['clicks'] < 1) {
      return(NA)
    }
    else {
      if (x['leads'] > x['sales']) {
        cr <- x['leads'] / x['clicks']
        return(cr)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  else {
    if (x['parentID'] != 377) {
      if (x['sales'] < 1 || x['clicks'] < 1) {
        return(NA)
      }
      else {
        cr <- x['sales'] / x['clicks']
        return(cr)
      }
    }
  }
  return(NA)
}
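To see why the $ version fails, it may help to look at what apply actually hands to the function. This is only a small sketch built on the sample data from the question; the names first_row and dollar_result are illustrative.

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

# apply() converts the data frame to a matrix, so every row arrives
# as a named atomic vector rather than a one-row data frame
rows_atomic <- apply(df, 1, is.atomic)

# The same kind of object apply() passes for the first row
first_row <- as.matrix(df)[1, ]

# `$` is not defined for atomic vectors, so this raises an error
dollar_result <- tryCatch(first_row$clicks, error = function(e) "error")

# Single-bracket indexing by name works
first_row["clicks"]
```

Here rows_atomic is TRUE for every row and dollar_result ends up as "error", which is consistent with the errors described in the question.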
Upvotes: 0
Reputation: 57696
apply, when used on a data frame, turns it into a matrix. If your data frame contains character or factor variables, the result will be a character matrix, and your code will fail.
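As a rough illustration of that conversion, here is a sketch with a hypothetical data frame (df_chr and its label column are made up for the example, not part of the question's data):

```r
# Hypothetical data frame with one character column
df_chr <- data.frame(
  parentID = c(377, 311),
  clicks   = c(9078, 398443),
  label    = c("a", "b"),
  stringsAsFactors = FALSE
)

# The conversion apply() performs before looping over rows
m <- as.matrix(df_chr)
matrix_type <- typeof(m)   # "character": the numbers have become strings

# Arithmetic on a row now fails, because the values are strings
division_result <- tryCatch(
  apply(df_chr, 1, function(x) x["clicks"] / 2),
  error = function(e) "error"
)
```

With a purely numeric data frame, such as the sample in the question, the matrix stays numeric and this particular problem does not arise.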
In this case, however, you don't need apply. You can vectorise your code with nested ifelse calls:
set_cr <- function(x)
{
    # use the vectorised | here: || does not work elementwise
    ifelse(x$parentID == 377,
        ifelse(x$leads < 1 | x$clicks < 1, NA, x$leads / x$clicks),
        ifelse(x$sales < 1 | x$clicks < 1, NA, x$sales / x$clicks))
}
set_cr(df)
(I assume you made a typo in the second else code block.)
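If the comparison between leads and sales for parentID 377 is intended after all, a vectorised version that keeps it might look like the sketch below. The names set_cr_full, is_377, numerator and gate are illustrative, and the logic is my reading of the function in the question rather than a confirmed specification.

```r
# Sample data from the question
df <- data.frame(
  parentID = c(377, 377, 311, 322, 333),
  clicks   = c(9078, 78404, 398443, 16142, 111715),
  sales    = c(69, 95, 7191, 146, 33966),
  leads    = c(500, 0, 500, 0, 33966)
)

set_cr_full <- function(x) {
  is_377 <- x$parentID == 377
  # Numerator: leads when parentID is 377 and leads exceed sales, otherwise sales
  numerator <- ifelse(is_377 & x$leads > x$sales, x$leads, x$sales)
  # Variable that must be at least 1: leads for 377, sales for everything else
  gate <- ifelse(is_377, x$leads, x$sales)
  # NA when the gate variable or clicks is below 1, otherwise the ratio
  ifelse(gate < 1 | x$clicks < 1, NA, numerator / x$clicks)
}

# Store the result as a new column, as the question intended
df$cr <- set_cr_full(df)
```

On the sample data the second row (parentID 377 with leads of 0) comes out as NA, and the other rows get a ratio.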
Upvotes: 2