Reputation: 111
I have a data.frame like this
home <- c("MANU","CHELSEA")
away <- c("SWANSEA", "LIVERPOO")
GH <- c(3,4)
GA <- c(2,1)
df <- data.frame(home, away, GH, GA)
I would like to add a POINTS column to df, filled in based on the result of each match:
calc <- function(df) {
df$POINTS <- 0
for(i in 1:nrow(df))
if(df$GA[i] > df$GH[i]) {
df$POINTS[i] <- 0.11
}
else {
df$POINTS[i] <- 0.22
print("a")
}
}
However, this gives me the following:
> df
home away GH GA POINTS
1 MANU SWANSEA 3 2 0.00
2 CHELSEA LIVERPOO 4 1 0.11
Why aren't the points for the first record 0.11?
Upvotes: 1
Views: 60
Reputation:
I would strongly recommend using data.table instead of data.frame. In my experience, data.table code is more readable, has better support for rule-based data manipulation, and is also much quicker should your datasets grow.
Here's how you could solve it:
library(data.table)
home <- c("MANU","CHELSEA")
away <- c("SWANSEA", "LIVERPOO")
GH <- c(3,1)
GA <- c(2,3)
dt <- data.table(home, away, GH, GA)
dt[, POINTS:=ifelse(GH>GA, 0.22, 0.11) ]
The data.table() call sets up the data table:
home away GH GA
1: MANU SWANSEA 3 2
2: CHELSEA LIVERPOO 1 3
And the := line adds your ruleset:
> dt
home away GH GA POINTS
1: MANU SWANSEA 3 2 0.22
2: CHELSEA LIVERPOO 1 3 0.11
I also corrected the bug of Chelsea actually winning a soccer game. Seems unlikely these days.
Cheers
UPDATE after comment
Aha. It's basically a matter of personal preference. As long as you can establish a clear ruleset, there are many ways to code it. Some people like compact code; I tend to prefer human readability.
Thus you could do it like this:
dt[GH>GA, comment := "home victory"]
dt[GH<GA, comment := "away victory"]
dt[GH==GA, comment := "draw"]
or like this:
dt[, home.points:=ifelse(GH>GA, 3, 0) + ifelse(GH==GA, 1, 0) + ifelse(GH<GA, 0, 0) ]
Check out any tutorial for data.table and you'll easily see how flexible it is for cases like this.
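Since R treats TRUE as 1 and FALSE as 0 in arithmetic, the points rule can also be written without ifelse. The following is only a sketch in base R on made-up scores that include a draw; away.points is a hypothetical companion column that is not part of the answer above.

```r
# Example scores, made up so that a home win, an away win and a draw all occur
GH <- c(3, 1, 2)   # goals scored by the home side
GA <- c(2, 3, 2)   # goals scored by the away side

# A comparison yields TRUE/FALSE, which counts as 1/0 when multiplied
home.points <- 3 * (GH > GA) + 1 * (GH == GA)   # 3 0 1
away.points <- 3 * (GH < GA) + 1 * (GH == GA)   # 0 3 1
```

The same right-hand sides can be used after := inside dt[, ...], since GH and GA are then looked up as columns of the table.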
Upvotes: 2
Reputation: 3678
If you really want to use a function and a for loop, you could do this:
calc <- function(df) {
  for (i in seq_len(nrow(df))) {   # braces after the for; seq_len is safe for 0 rows
    if (df$GA[i] > df$GH[i]) {     # no need to initialize POINTS
      df$POINTS[i] <- 0.11
    } else {
      df$POINTS[i] <- 0.22
      print("a")
    }
  }
  return(df)  # so that the function "returns" something
}
You can then do df <- calc(df), and df will have the new column with the correct values.
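To see why assigning the result back matters, here is a minimal sketch using the sample data from the question. R hands the function a copy of the data frame, so changes made inside it stay local unless the result is returned and reassigned. The name calc_points is made up for this sketch; it applies the same rule as calc, just vectorised.

```r
# Sample data from the question
df <- data.frame(home = c("MANU", "CHELSEA"),
                 away = c("SWANSEA", "LIVERPOO"),
                 GH = c(3, 4),
                 GA = c(2, 1))

# Hypothetical helper: same rule as calc(), without the loop
calc_points <- function(d) {
  d$POINTS <- ifelse(d$GA > d$GH, 0.11, 0.22)
  d  # the last evaluated value is what the function returns
}

calc_points(df)          # prints the modified copy; df itself is untouched
"POINTS" %in% names(df)  # FALSE

df <- calc_points(df)    # assign the result back to keep the new column
df$POINTS                # 0.22 0.22, since the home side scored more in both rows
```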
I would, however, recommend using ifelse:
df$POINTS <- ifelse(df$GA > df$GH, 0.11, 0.22)
You can of course combine multiple ifelse statements. The first argument is the test, the second is the value if the test is TRUE, and the last is the value if the test is FALSE.
Example of several ifelse calls:
ifelse(df$home=='MANU',0.3,ifelse(df$GA>df$GH,0.11,0.22))
# [1] 0.30 0.22 # as expected
Upvotes: 0
Reputation: 887118
We don't need a loop for this
df$POINTS <- c(0.22, 0.11)[(df$GA>df$GH)+1L]
Or we can use ifelse as well.
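To unpack the indexing one-liner, here is a minimal sketch on example values chosen so that both outcomes occur. The comparison gives a logical vector, adding 1L turns FALSE/TRUE into positions 1/2, and those positions select from c(0.22, 0.11). The intermediate names away_won and idx exist only for illustration.

```r
GH <- c(3, 1)   # goals for the home side (example values)
GA <- c(2, 3)   # goals for the away side (example values)

away_won <- GA > GH            # FALSE TRUE
idx <- away_won + 1L           # 1 2: FALSE becomes position 1, TRUE position 2
points <- c(0.22, 0.11)[idx]   # 0.22 0.11
```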
Upvotes: 1