Reputation: 105
I have a data frame (Dataset_Events) with seven columns, two of them being eventInfo_ea and eventInfo_el. I'd like to remove the cell value of eventInfo_el in rows where eventInfo_ea = 'add to cart'. See the code below.
Remove = function(Dataset_Events, eventInfo_ea, eventInfo_el){
if(Dataset_Events[["eventInfo_ea"]]=="add to cart"){
Dataset_Events[["eventInfo_el"]] <- NULL
}
}
sapply(Dataset_Events, Remove)
Unfortunately, R gives me the following error message: "Error in Dataset_Events[["eventInfo_ea"]] : subscript out of bounds". The data frame has 713478 rows and 7 columns. Can anybody explain why this happens and how to fix it?
If I simply run the condition by itself, I get a proper TRUE/FALSE vector with the same length as the data frame has rows:
Dataset_Events[["eventInfo_ea"]]=="add to cart"
Here is a sample of the two relevant columns (both columns have class factor):
   eventInfo_ea     eventInfo_el
1  click            thumbnail
2  click            description
3  click            hero image
4  click            open size dropdown
5  click            hero image
6  click            hero image
7  click            hero image
8  click            description
9  click            open size dropdown
10 click            hero image
11 click            hero image
12 click            hero image
13 click            hero image
14 click            description
15 click            open reviews
16 click            hero image
17 click            open reviews
18 click            description
19 add to wishlist  hero image
20 click            hero image
21 click            hero image
22 add to cart      hero image
Upvotes: 1
Views: 1292
Reputation: 105
I actually found a solution that works. I skipped defining a function altogether and simply used the following code:
Dataset_Events[ Dataset_Events["eventInfo_ea"]=="add to cart", ]["eventInfo_el"] <- NA
Still happy to hear though why the suggestions from all of you didn't seem to modify my dataset at all. Thanks a lot though!!!
Upvotes: 1
Reputation: 2188
Try this:
Remove = function(Dataset_Events){
ind = Dataset_Events[["eventInfo_ea"]] == "add to cart"
Dataset_Events[["eventInfo_el"]][ind] = NA
return (Dataset_Events)
}
Dataset_Events <- Remove(Dataset_Events)  # assign the result back, otherwise the original stays unchanged
I removed the second and third arguments from your function (you don't seem to be using them). As you note, Dataset_Events[["eventInfo_ea"]] == "add to cart" gives you a vector of logicals, so it can be used to index the rows you want to set to NA. (I changed the value from NULL to NA, since assigning NULL was causing problems.)
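As a minimal sketch of the logical-indexing idea, here it is on a tiny made-up data frame (the name df and its three rows are assumptions for illustration, not the real dataset). It also shows that the function works on a copy of its argument, which may be why calling it without assigning the result appears to do nothing:

```r
# Toy stand-in for Dataset_Events (hypothetical data); both columns are
# factors, as in the question.
df <- data.frame(
  eventInfo_ea = factor(c("click", "add to wishlist", "add to cart")),
  eventInfo_el = factor(c("thumbnail", "hero image", "hero image"))
)

Remove <- function(d) {
  ind <- d[["eventInfo_ea"]] == "add to cart"  # logical vector, one value per row
  d[["eventInfo_el"]][ind] <- NA               # blank only the matching rows
  d                                            # return the modified copy
}

invisible(Remove(df))                    # result discarded: df itself is untouched
na_before_assign <- anyNA(df$eventInfo_el)

df <- Remove(df)                         # assigning the result back keeps the change
na_after_assign <- anyNA(df$eventInfo_el)
```

Only the "add to cart" row loses its eventInfo_el value; the other rows and the factor levels are left alone.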
Upvotes: 2
Reputation: 593
I believe the problem is that Dataset_Events[["eventInfo_el"]] returns a factor. In this case it is better to use identical():
Remove = function(Dataset_Events, eventInfo_ea, eventInfo_el){
if(identical(as.character(Dataset_Events[["eventInfo_ea"]]),"add to cart")){
Dataset_Events[["eventInfo_el"]] <- NULL
}
}
sapply(Dataset_Events, Remove)
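However the comparison is written, the sapply() call itself is a likely source of the error: sapply() treats a data frame as a list of columns and hands the function one column at a time, so inside the function there is no eventInfo_ea column left to look up. A small sketch on a made-up two-row data frame (df is hypothetical) that appears to reproduce the situation:

```r
# Hypothetical two-row stand-in for Dataset_Events.
df <- data.frame(
  eventInfo_ea = factor(c("click", "add to cart")),
  eventInfo_el = factor(c("thumbnail", "hero image"))
)

# sapply() iterates over the columns, so the function sees a factor,
# not the data frame.
classes <- sapply(df, function(column) class(column))

# Looking up a name inside a single unnamed factor column raises an error;
# tryCatch() captures it so that the script keeps running.
res <- tryCatch(df$eventInfo_ea[["eventInfo_ea"]], error = function(e) e)
lookup_failed <- inherits(res, "error")
```

Calling the function once on the whole data frame, rather than through sapply(), avoids this.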
Upvotes: 1