Reputation: 23
From a dataframe like this
DF <- read.table(text = "String Found Count
0-025823 0 1
1-042055 1 1
1-018396 1 2
1-018396 1 2
1-002984 1 3
1-002984 1 3
1-002984 1 3", header = TRUE)
I would want to get the following output:
String Found Count Desired output
0-025823 0 1 1
1-042055 1 1 1
1-018396 1 2 1
1-018396 1 2 0
1-002984 1 3 1
1-002984 1 3 0
1-002984 1 3 0
The Desired output
column is showing the unique value starting from top to bottom. The very first unique value found will be tagged as 1 and rest (duplicates) will be all 0s.
I have used the following formula in excel to get the output in excel:
=IF(COUNTIF($A$2:A2,A2)>1,0,1)
where the sequenceof columns is same as above.
I have used loops, aggregate
and within
functions but these did not give the desired result.
Upvotes: 1
Views: 74
Reputation: 28825
Roland's solution is faster than using dplyr
but for the sake of presenting another solution:
library(dplyr)
DF %>% group_by(String) %>% mutate(unique = ifelse(row_number()==1,1,0))
# # A tibble: 7 x 4
# # Groups: String [4]
# String Found Count unique
# <fctr> <int> <int> <dbl>
# 1 0-025823 0 1 1
# 2 1-042055 1 1 1
# 3 1-018396 1 2 1
# 4 1-018396 1 2 0
# 5 1-002984 1 3 1
# 6 1-002984 1 3 0
# 7 1-002984 1 3 0
Upvotes: 1
Reputation: 41
Suppose your data frame is df
df[,"Desired output"]=0
for(i in (1:nrow(df)))
{
if(length(which(df[1:i,]$Count==df[i,"Count"]))==1)
{ df[i,"Desired output"]=1
}
else
{
df[i,"Desired output"]=0
}
}
Upvotes: 1
Reputation: 132676
You want to mark duplicated values as 0:
DF <- read.table(text = "String Found Count
0-025823 0 1
1-042055 1 1
1-018396 1 2
1-018396 1 2
1-002984 1 3
1-002984 1 3
1-002984 1 3", header = TRUE)
DF$unique <- 1 - duplicated(DF$String)
# String Found Count unique
#1 0-025823 0 1 1
#2 1-042055 1 1 1
#3 1-018396 1 2 1
#4 1-018396 1 2 0
#5 1-002984 1 3 1
#6 1-002984 1 3 0
#7 1-002984 1 3 0
duplicated
returns logical values and I use that TRUE
/FALSE
is coerced to 1
/0
when used in arithmetic.
Note that usually you should not coerce to integers. You could simply do !duplicated(DF$String)
instead.
Upvotes: 7