A. Davidsson

Reputation: 27

How to do a simple operation that normally requires double for loops in [R]?

I'm quite new to R and I'm coming from a C++ background. I have a data frame with over 60 thousand rows and around 15 columns. The nested loops below count, for columns 7 onward, how many times a value equals the value directly below it in the same column, but they take forever to run. Is there a better way to do this? Help is greatly appreciated!

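counter <- 0

# Compare each cell in columns 7 onward with the cell directly below it.
for (j in 7:ncol(SeaStateData)) {
  # Stop at the second-to-last row so that i + 1 stays in range.
  for (i in 1:(nrow(SeaStateData) - 1)) {
    # && short-circuits, so == is only evaluated when neither value is NA.
    if (!is.na(SeaStateData[i, j]) && !is.na(SeaStateData[i + 1, j]) &&
        SeaStateData[i, j] == SeaStateData[i + 1, j]) {
      counter <- counter + 1
    }
  }
}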

Upvotes: 1

Views: 108

Answers (1)

MvG

Reputation: 60858

I'd try this:

nr <- nrow(SeaStateData)
nc <- ncol(SeaStateData)
counter <- sum(SeaStateData[1:(nr - 1), 7:nc] ==
               SeaStateData[2:nr, 7:nc],
               na.rm = TRUE)

The two subsets are submatrices of the same shape, offset from each other by one row. The == operator compares them element by element and yields a logical vector (in this case a matrix, which is just a vector with added dimension information) containing TRUE where two items match, FALSE where they differ, and NA where at least one of them is NA. The sum over a logical vector counts all TRUE values. The na.rm argument tells sum to drop NA values; otherwise the result would be NA as well. For a plain vector, sum(…, na.rm = TRUE) is roughly the same as sum(na.omit(…)).
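To see the offset comparison at work, here is a minimal sketch on a made-up data frame. The name toy and all of its values are assumptions for illustration only and are not taken from the real SeaStateData. It runs the vectorised expression and an explicit double loop side by side and checks that they agree.

```r
# Hypothetical stand-in for SeaStateData: six filler columns that are
# skipped, then two data columns (x, y) starting at column 7.
toy <- data.frame(
  id = 1:4, a = 0, b = 0, c = 0, d = 0, e = 0,
  x = c(1, 1, NA, 2),
  y = c(3, 3, 3, NA)
)
nr <- nrow(toy)
nc <- ncol(toy)

# Vectorised: rows 1..(nr - 1) compared element-wise with rows 2..nr.
vec_count <- sum(toy[1:(nr - 1), 7:nc] == toy[2:nr, 7:nc], na.rm = TRUE)

# Reference: the same count with an explicit double loop.
loop_count <- 0
for (j in 7:nc) {
  for (i in 1:(nr - 1)) {
    upper <- toy[i, j]
    lower <- toy[i + 1, j]
    if (!is.na(upper) && !is.na(lower) && upper == lower) {
      loop_count <- loop_count + 1
    }
  }
}

# Both approaches should report the same number of matching pairs.
stopifnot(vec_count == loop_count)
print(vec_count)
```

In this toy data, column x has one matching pair and column y has two; the pairs involving NA are skipped by both versions. The speed-up on a real data set comes from doing a single comparison over whole blocks instead of one data frame lookup per cell.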

Upvotes: 5
