Reputation: 51
This is my first time posting here and I am new to R, so hopefully I am asking my question correctly. I am trying to create a new variable in my dataset based on two other variables (Name and Day). My data looks like this:
ID Name Day
1 Abbey 06-Jan-2009
2 Abbey 07-Jan-2009
3 Abbey 06-Jan-2009
4 Abbey 07-Jan-2009
5 Fred 09-Jan-2009
6 Fred 10-Jan-2009
7 Fred 09-Jan-2009
8 Fred 10-Jan-2009
And I want the new variable to look like this:
ID Name Day Time
1 Abbey 06-Jan-2009 1
2 Abbey 07-Jan-2009 2
3 Abbey 06-Jan-2009 1
4 Abbey 07-Jan-2009 2
5 Fred 09-Jan-2009 1
6 Fred 10-Jan-2009 2
7 Fred 09-Jan-2009 1
8 Fred 10-Jan-2009 2
I have tried:
dataset$Time<-as.numeric (as.factor(dataset$Name),(as.factor(dataset$Day)))
However, this doesn't restart Time at 1 for each Name. Thanks in advance!
Upvotes: 0
Views: 73
Reputation: 887128
An option using data.table
library(data.table)#data.table_1.9.5
setDT(df1)[, id:=frank(Day, ties.method='dense'),.(Name)][]
# ID Name Day id
#1: 1 Abbey 06-Jan-2009 1
#2: 2 Abbey 07-Jan-2009 2
#3: 3 Abbey 06-Jan-2009 1
#4: 4 Abbey 07-Jan-2009 2
#5: 5 Fred 09-Jan-2009 1
#6: 6 Fred 10-Jan-2009 2
#7: 7 Fred 09-Jan-2009 1
#8: 8 Fred 10-Jan-2009 2
Upvotes: 2
Reputation: 2252
My answer is similar to MrFlick's. Be careful about dates: they can be stored either as Date objects or as character strings, and the two sort differently.
# Uncomment if dplyr not installed
# install.packages("dplyr")
library(dplyr)
ID <- 1:8
Name <- rep(c("Abbey", "Fred"), times=c(4,4))
Day <- c("06-Jan-2009", "07-Jan-2009","06-Jan-2009", "07-Jan-2009",
"09-Jan-2009", "10-Jan-2009","09-Jan-2009", "10-Jan-2009")
dataset <- data.frame(ID, Name, Day)
dataset <- dataset %>%
  group_by(Name) %>%
  mutate(Time = as.numeric(as.factor(as.Date(Day, "%d-%b-%Y"))))
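To see why the date format matters, here is a minimal sketch using made-up days that span two months (these values are hypothetical, not from the question). Ranking the raw strings orders them alphabetically, while ranking parsed dates orders them chronologically. It assumes an English locale, since `%b` month abbreviations are locale-dependent.

```r
# Hypothetical days spanning two months (not from the question's data)
day_chr <- c("07-Jan-2009", "06-Feb-2009", "07-Jan-2009")

# Rank each value among the sorted unique values
dense_rank_of <- function(x) match(x, sort(unique(x)))

# Character strings sort alphabetically, so "06-Feb-2009" comes first
rank_chr <- dense_rank_of(day_chr)
rank_chr
# [1] 2 1 2

# Parsed dates sort chronologically, so January comes first
day_date <- as.Date(day_chr, format = "%d-%b-%Y")
rank_date <- dense_rank_of(day_date)
rank_date
# [1] 1 2 1
```

With the question's data every group stays within one month, so both approaches happen to agree there.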
Upvotes: 0
Reputation: 52637
Try:
transform(dataset, Time=c(ave(as.character(Day), Name, FUN=factor)))
You may or may not need the as.character depending on whether your data starts off as character or factor. Note we use c to drop factor attributes.
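If you would rather not depend on how factors are coerced, a variant of the same ave idea is sketched below. It is only an illustration, not the method above: it parses Day to a date first (assuming the "%d-%b-%Y" format and an English locale) and ranks the days within each Name using match. The helper name rank_within is made up for this example.

```r
# Rebuild the question's example data
dataset <- data.frame(
  ID   = 1:8,
  Name = rep(c("Abbey", "Fred"), each = 4),
  Day  = c("06-Jan-2009", "07-Jan-2009", "06-Jan-2009", "07-Jan-2009",
           "09-Jan-2009", "10-Jan-2009", "09-Jan-2009", "10-Jan-2009")
)

# Hypothetical helper: position of each value among the sorted unique values
rank_within <- function(v) match(v, sort(unique(v)))

# Convert to numeric days so ave() works on a plain numeric vector,
# then rank the days separately within each Name
day_num <- as.numeric(as.Date(as.character(dataset$Day), format = "%d-%b-%Y"))
dataset$Time <- ave(day_num, dataset$Name, FUN = rank_within)

dataset$Time
# [1] 1 2 1 2 1 2 1 2
```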
Upvotes: 2
Reputation: 206232
I think this would be fairly easy with the dplyr package:
dd<-data.frame(
ID=1:8,
Name=c("Abbey", "Abbey", "Abbey", "Abbey", "Fred", "Fred", "Fred", "Fred"),
Day=c("06-Jan-2009", "07-Jan-2009", "06-Jan-2009", "07-Jan-2009", "09-Jan-2009", "10-Jan-2009", "09-Jan-2009", "10-Jan-2009")
)
library(dplyr)
dd %>% group_by(Name) %>% mutate(id=as.numeric(factor(Day)))
# ID Name Day id
# 1 1 Abbey 06-Jan-2009 1
# 2 2 Abbey 07-Jan-2009 2
# 3 3 Abbey 06-Jan-2009 1
# 4 4 Abbey 07-Jan-2009 2
# 5 5 Fred 09-Jan-2009 1
# 6 6 Fred 10-Jan-2009 2
# 7 7 Fred 09-Jan-2009 1
# 8 8 Fred 10-Jan-2009 2
Upvotes: 0