Jenny We

Reputation: 105

R Error 'Subscript out of bounds' in if statement - explanation and code fix?

I have a data frame (Dataset_Events) with seven columns, two of them being eventInfo_ea and eventInfo_el. I'd like to remove the cell value of eventInfo_el in rows where eventInfo_ea = 'add to cart'. See the code below.

 Remove = function(Dataset_Events, eventInfo_ea, eventInfo_el){
  if(Dataset_Events[["eventInfo_ea"]]=="add to cart"){
    Dataset_Events[["eventInfo_el"]] <- NULL
  }
 }
 sapply(Dataset_Events, Remove)

Unfortunately, R gives me the following error message: "Error in Dataset_Events[["eventInfo_ea"]] : subscript out of bounds". The dimensions of the data frame are 713478 x 7. Can anybody explain why this happens and how to fix it?

If I run the if condition on its own, I get a proper TRUE/FALSE vector with one entry per row of the data frame:

Dataset_Events[["eventInfo_ea"]]=="add to cart"

Here is a sample of the two relevant columns (both columns have class factor):

eventInfo_ea                                           eventInfo_el
1                click                                              thumbnail
2                click                                            description
3                click                                             hero image
4                click                                     open size dropdown
5                click                                             hero image
6                click                                             hero image
7                click                                             hero image
8                click                                            description
9                click                                     open size dropdown
10               click                                             hero image
11               click                                             hero image
12               click                                             hero image
13               click                                             hero image
14               click                                            description
15               click                                           open reviews
16               click                                             hero image
17               click                                           open reviews
18               click                                            description
19     add to wishlist                                             hero image
20               click                                             hero image
21               click                                             hero image
22         add to cart                                             hero image

Upvotes: 1

Views: 1292

Answers (3)

Jenny We

Reputation: 105

I actually found a solution that works. I skipped defining a function altogether and simply used the following code:

Dataset_Events[ Dataset_Events["eventInfo_ea"]=="add to cart", ]["eventInfo_el"] <- NA

Still happy to hear though why the suggestions from all of you didn't seem to modify my dataset at all. Thanks a lot though!!!

Upvotes: 1

mickey

Reputation: 2188

Try this:

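Remove = function(Dataset_Events){
    # Logical vector: TRUE for rows where the event is "add to cart"
    ind = Dataset_Events[["eventInfo_ea"]] == "add to cart"
    # Blank eventInfo_el in those rows only
    Dataset_Events[["eventInfo_el"]][ind] = NA
    return (Dataset_Events)
}
# Assign the result back; the function returns a modified copy
Dataset_Events = Remove(Dataset_Events)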

I removed the second and third arguments from your function (you don't seem to be using them?). As you note, Dataset_Events[["eventInfo_ea"]]=="add to cart" gives you a vector of logicals, so it can be used to index the rows you want to set to NA. I changed NULL to NA because assigning NULL to a data frame column drops the whole column rather than blanking individual cells.
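As a rough illustration of why calling such a function on its own can appear to do nothing: R functions work on a copy of their argument, so the change is only kept if the returned value is assigned. The sketch below uses a made-up three-row data frame (events) and a hypothetical helper name (blank_cart_labels); neither comes from the original post.

```r
# Toy data standing in for Dataset_Events (values are made up for illustration)
events <- data.frame(
  eventInfo_ea = c("click", "add to cart", "click"),
  eventInfo_el = c("thumbnail", "hero image", "description"),
  stringsAsFactors = TRUE  # mimic the factor columns described in the question
)

# Hypothetical helper: same idea as Remove(), under a different name
blank_cart_labels <- function(df) {
  is_cart <- df[["eventInfo_ea"]] == "add to cart"  # logical vector, one entry per row
  df[["eventInfo_el"]][is_cart] <- NA               # NA is a valid value in a factor
  df                                                # return the modified copy
}

result <- blank_cart_labels(events)

# The argument itself is untouched; only the returned copy carries the NA
untouched <- !anyNA(events$eventInfo_el)
changed   <- is.na(result$eventInfo_el)

# Reassign to keep the change in the original variable
events <- blank_cart_labels(events)
```

Calling blank_cart_labels(events) without the assignment just prints the modified copy at the console and then discards it.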

Upvotes: 2

Manos Papadakis

Reputation: 593

I believe the problem is that Dataset_Events[["eventInfo_el"]] returns a factor. In this case it is better to convert the column with as.character and compare using identical.

Remove = function(Dataset_Events, eventInfo_ea, eventInfo_el){
    if(identical(as.character(Dataset_Events[["eventInfo_ea"]]),"add to cart")){
        Dataset_Events[["eventInfo_el"]] <- NULL
    }
}
sapply(Dataset_Events, Remove)
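For context, a small sketch of what sapply hands to the function may help when testing this. It assumes a made-up two-row data frame (events) in place of the real data, and it only inspects behaviour; it is not meant as a fix.

```r
# Toy data standing in for Dataset_Events (values are made up for illustration)
events <- data.frame(
  eventInfo_ea = c("click", "add to cart"),
  eventInfo_el = c("thumbnail", "hero image"),
  stringsAsFactors = TRUE
)

# sapply() iterates over the columns, so the function receives one factor
# at a time rather than the whole data frame
received <- sapply(events, function(x) class(x)[1])

# Looking up a column name on such an unnamed vector raises an error,
# which is captured here instead of stopping the script
lookup <- tryCatch(events$eventInfo_ea[["eventInfo_ea"]], error = function(e) e)

# identical() compares the whole column against one string: a single TRUE/FALSE
whole_column <- identical(as.character(events$eventInfo_ea), "add to cart")

# == compares element by element: one TRUE/FALSE per row
per_row <- as.character(events$eventInfo_ea) == "add to cart"
```

Here whole_column is a single value for the entire column, while per_row has one entry per row, so only the latter can be used to pick out individual rows.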

Upvotes: 1
