MKWalsh

Reputation: 99

How to create a new column in data frame with for loop and if statements

I have a data frame with 102 rows, and I need to write a for loop with an if statement to populate a new column "Season" based on four indicator columns (Sp, Su, Fa, Wi). A 1 marks the season in which the sample was taken (see below).

Sp  Su  Fa  Wi
1   0   0   0
0   0   0   1

I tried handling just summer in a loop, but I get lots of errors. I just can't seem to grasp for loops and if statements. Any help would be appreciated.

for(i in 1:102) {  if(myData$Su==1) myData$Season=Summer}

Error:

In if (myData$Su == 1) myData$Season = Summer :
  the condition has length > 1 and only the first element will be used

Upvotes: 0

Views: 1165

Answers (5)

akrun

Reputation: 886938

You could also use (a variation of @Emer's approach)

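 # multiply each row by the column positions 1..4; the single 1 in a row picks out its column index
 transform(dat, Season=c('Spring', 'Summer', 'Fall',
             'Winter')[drop(as.matrix(dat) %*% seq_len(ncol(dat)))])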
 #  Sp Su Fa Wi Season
 #1  1  0  0  0 Spring
 #2  0  0  0  1 Winter

data

 dat <- structure(list(Sp = c(1, 0), Su = c(0, 0), Fa = c(0, 0), Wi = c(0, 
 1)), .Names = c("Sp", "Su", "Fa", "Wi"), row.names = c(NA, -2L
 ), class = "data.frame")

Upvotes: 0

digEmAll

Reputation: 57210

If you really want to use a loop, you should do it this way:

# recreating an example similar to your data
myData <- read.csv(text= 
"Sp,Su,Fa,Wi
1,0,0,0
0,1,0,0
0,0,1,0
1,0,0,0
0,0,0,1")

# before the loop, add a new "Season" column to myData filled with NAs
myData$Season <- NA

# don't use 102 but nrow(myData) so
# in case myData changes you don't have to modify the code
# (seq_len() also copes with a data frame that has zero rows)
for(i in seq_len(nrow(myData))){

  # here you are working row-by-row
  # so note the [i] indexing below

  if(myData$Sp[i] == 1){
    myData$Season[i] = "Spring"
  }else if(myData$Su[i] == 1){
    myData$Season[i] = "Summer"
  }else if(myData$Fa[i] == 1){
    myData$Season[i] = "Fall"
  }else if(myData$Wi[i] == 1){
    myData$Season[i] = "Winter"
  }
}

That said, as the other answers show, vectorized approaches are shorter and faster than a loop.
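
For what it's worth, the message in the question comes from if() expecting a single TRUE or FALSE, whereas myData$Su == 1 yields one value per row. (Since R 4.2.0 this is an error rather than a warning.) Here is a minimal sketch, using a made-up three-row data frame rather than your real data, of what the comparison returns and how logical subsetting could fill in one season without a loop:

```r
# Hypothetical three-row example with the same column layout as the question
myData <- data.frame(Sp = c(1, 0, 0),
                     Su = c(0, 1, 0),
                     Fa = c(0, 0, 0),
                     Wi = c(0, 0, 1))

# The comparison is vectorized: it gives one TRUE/FALSE per row,
# which is why if() complains about a condition of length > 1
is_summer <- myData$Su == 1
length(is_summer)

# Logical subsetting assigns only to the matching rows
myData$Season <- NA_character_
myData$Season[is_summer] <- "Summer"
myData
```

Repeating the last assignment for Sp, Fa and Wi would fill in the remaining rows.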

Upvotes: 0

Andrie

Reputation: 179398

Since R is a vector-based language, you don't need a for loop in this case.

dat <- data.frame(
  Sp = c(1, 0),
  Su = c(0, 0),
  Fa = c(0, 0),
  Wi = c(0, 1)
)

A naive, brute force way would be to use nested ifelse() functions:

dat$Season <- with(dat, 
                   ifelse(Sp == 1, "Spring", 
                          ifelse(Su == 1, "Summer", 
                                 ifelse(Fa == 1, "Fall", 
                                        "Winter"))))
dat

  Sp Su Fa Wi Season
1  1  0  0  0 Spring
2  0  0  0  1 Winter

But the R way of doing this would be to think about the structure of your data, then use indexing, for example:

dat$season <- apply(dat[c("Sp", "Su", "Fa", "Wi")], 1, function(x) c("Sp", "Su", "Fa", "Wi")[x==1])

  Sp Su Fa Wi season
1  1  0  0  0     Sp
2  0  0  0  1     Wi
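
If you want the full season names from the same indexing idea, one possible sketch uses a named lookup vector together with max.col(). The names season_names, cols and idx are just illustrative, and the code assumes every row has exactly one 1 (a row of all zeros would be labelled with the first season):

```r
dat <- data.frame(
  Sp = c(1, 0),
  Su = c(0, 0),
  Fa = c(0, 0),
  Wi = c(0, 1)
)

# Lookup from column name to full season name (illustrative helper)
season_names <- c(Sp = "Spring", Su = "Summer", Fa = "Fall", Wi = "Winter")
cols <- names(season_names)

# max.col() returns, for each row, the position of the largest value;
# with a single 1 per row that is the flagged season column
idx <- max.col(as.matrix(dat[cols]), ties.method = "first")

dat$Season <- unname(season_names[idx])
dat
```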

Upvotes: 1

Emer

Reputation: 3824

Try to identify which column contains a 1, then use that index to pick the desired season name from a character vector:

data <- c("Sp  Su  Fa  Wi
           1   0   0   0
           0   0   0   1")
data <- read.table(text=data,header=TRUE)

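idx <- which(data==1,arr.ind=TRUE)
# which() lists matches column by column, so assign via the row index to keep rows aligned
data$Season <- NA_character_
data$Season[idx[,"row"]] <- c("Spring","Summer","Fall","Winter")[idx[,"col"]]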

Result:

  Sp Su Fa Wi Season
1  1  0  0  0 Spring
2  0  0  0  1 Winter
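
Note that this kind of indexing assumes each row has exactly one 1. A quick way to check that before assigning, sketched here on the same two-row example (flags_per_row is just an illustrative name), might be:

```r
data <- read.table(text = "Sp Su Fa Wi
1 0 0 0
0 0 0 1", header = TRUE)

# Count how many season flags are set in each row
flags_per_row <- rowSums(data[c("Sp", "Su", "Fa", "Wi")] == 1)

# Stop early if any row has zero flags or more than one
stopifnot(all(flags_per_row == 1))
```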

Upvotes: 3

Alexey Ferapontov

Reputation: 5169

myData$Season <- ifelse(myData$Su==1, "Summer", "Not Summer")

or use a more complicated "no" argument (e.g. a nested ifelse: if Wi == 1, set to Winter, and so on)

Upvotes: 0
