Reputation: 505
I need to remove specific rows of my dataframe, but I have troubles doing it. The dataset looks like this:
> head(mergedmalefemale)
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
3 1 W 3 3 NA
4 1 W 2 4 17.75
5 1 W 3 5 -28.40
6 2 W 1 1 42.30
What I would like to do is deleting all the records of a coupleid where at least one amount is NA or negative. In the example above, all rows with coupleid "1" should be deleted as there are rows with negative values and NA's.
I tried it with functions like na.omit(mergedmalefemale)
etc. but this deletes only the rows with NA's but not other rows with the same cupleid. As I am a beginner I'd be happy if someone could help me.
Upvotes: 0
Views: 155
Reputation: 93813
Another good opportunity to apply data.table
require(data.table)
mergedmalefemale <- as.data.table(mergedmalefemale)
mergedmalefemale[, if(!any(is.na(amount) | amount < 0)) .SD, by=coupleid]
# coupleid gender shop time amount
#1: 2 W 1 1 42.3
Upvotes: 1
Reputation: 16089
Here's a fairly dirty way
# identify the coupleids that need to stay/be removed
agg <- aggregate(amount ~ coupleid, data=mergedmalefemale, FUN=function(x) min(is.na(x)|(x>0)))
# insert a column alongside "amount.y" that puts a 0 next to rows to be deleted
df.1 <- merge(mergedmalefemale, agg, by="coupleid")
# delete the rows
df.1 <- df.1[df.1$amount.y == 1, ]
Upvotes: 0
Reputation: 22293
Since you do not want to only omit the amounts that are NA or negative, but want to omit all data with the same id, you have to first find the id's you want to remove and then remove them.
mergedmalefemale <- read.table(text="
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
3 1 W 3 3 NA
4 1 W 2 4 17.75
5 1 W 3 5 -28.40
6 2 W 1 1 42.30",
header=TRUE)
# Find NA and negative amounts
del <- is.na(mergedmalefemale[,"amount"]) | mergedmalefemale[,"amount"]<0
# Find coupleid with NA or negative amounts
ids <- unique(mergedmalefemale[del,"coupleid"])
# Remove data with coupleid such that amount is NA or negative
mergedmalefemale[!mergedmalefemale[,"coupleid"] %in% ids,]
Upvotes: 2
Reputation: 61154
Here's one alternative. Consider your data.frame is called df
> na.omit(df[ rowSums(df[, sapply(df, is.numeric)]< 0, na.rm=TRUE) ==0, ])
coupleid gender shop time amount
1 1 W 3 1 29.05
2 1 W 1 2 31.65
4 1 W 2 4 17.75
6 2 W 1 1 42.30
Upvotes: 1