Reputation: 23
Question:
I want to create a dummy variable named first in R that is 1 if the value of another dummy changed from 0 to 1, provided that it is not the first observation for an id number. The underlying problem is that I want to identify firms that entered a market during the observed time period in a panel setting.
As an example, I tried to create this with a small sample set:
id <- c(1,1,1,2,2,3,3,3)
dummy <- c(0,1,1,0,1,1,0,1)
df <- data.frame(id,dummy)
df[,"id"]
first.dum <- function(x)
c( x[-1,"id"] == x[,"id"]
& x[-1,"dummy"] != x[,"dummy"]
& x[,"dummy"] == "1")
df$first <- first.dum(df)
df
The result looks like this:
id dummy first
1 1 0 FALSE
2 1 1 FALSE
3 1 1 FALSE
4 2 0 FALSE
5 2 1 FALSE
6 3 1 TRUE
7 3 0 FALSE
8 3 1 FALSE
I think I did not understand how that dataframe manipulation really works.
Any help would be appreciated.
Upvotes: 2
Views: 1074
Reputation: 92302
Here's how I would approach this using the data.table package:
library(data.table)
setDT(df)[, first := c(0, diff(dummy)) == 1, id][]
# id dummy first
# 1: 1 0 FALSE
# 2: 1 1 TRUE
# 3: 1 1 FALSE
# 4: 2 0 FALSE
# 5: 2 1 TRUE
# 6: 3 1 FALSE
# 7: 3 0 FALSE
# 8: 3 1 TRUE
Basically, we check within each group whether dummy is larger by one than in the previous observation (starting from the second observation; the leading 0 means the first observation of a group can never be flagged).
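To make the per-group check concrete, here is a small sketch on the values of a single group (the three dummy values for id 3 in the sample data). The variable name change is introduced here only for illustration.

```r
# One group's values, taken from id 3 in the sample data
dummy <- c(1, 0, 1)

# diff() returns the change between consecutive elements; the leading 0
# pads the first observation so it can never count as an entry
change <- c(0, diff(dummy))
change        # 0 -1  1
change == 1   # FALSE FALSE  TRUE
```

Only the last observation is flagged, because it is the only one where dummy moved from 0 to 1.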
You can do it similarly using dplyr
library(dplyr)
df %>% group_by(id) %>% mutate(first = c(0, diff(dummy)) == 1)
Or using base R
unlist(tapply(df$dummy, df$id, function(x) c(0, diff(x)) == 1))
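Note that tapply returns its results grouped by id, so the unlisted vector lines up with the rows only when the data are already sorted by id. As a possible variation (not part of the approach above), ave keeps the original row order, so the result can be assigned straight back to the data frame. A minimal sketch on the sample data from the question:

```r
id <- c(1, 1, 1, 2, 2, 3, 3, 3)
dummy <- c(0, 1, 1, 0, 1, 1, 0, 1)
df <- data.frame(id, dummy)

# ave() applies the function within each id and returns the values
# in the original row order, so the result can be assigned directly
df$first <- ave(df$dummy, df$id, FUN = function(x) c(0, diff(x))) == 1
df
```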
Upvotes: 2
Reputation: 6784
Try something like
df$first <- df$id == c(NA, df$id[-nrow(df)]) &
df$dummy > c(1, df$dummy[-nrow(df)])
to give
> df
id dummy first
1 1 0 FALSE
2 1 1 TRUE
3 1 1 FALSE
4 2 0 FALSE
5 2 1 TRUE
6 3 1 FALSE
7 3 0 FALSE
8 3 1 TRUE
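The one-liner works by comparing each row with the row before it: both columns are shifted down by one position and then compared element by element. The sketch below spells out the shifted vectors. It is a variation rather than the exact code above: the names n, prev_id, prev_dummy and entered are introduced here for illustration, and the first row is padded with NA and guarded explicitly instead of being padded with 1.

```r
id <- c(1, 1, 1, 2, 2, 3, 3, 3)
dummy <- c(0, 1, 1, 0, 1, 1, 0, 1)
df <- data.frame(id, dummy)

n <- nrow(df)
# Previous row's values; NA marks "no previous row"
prev_id <- c(NA, df$id[-n])
prev_dummy <- c(NA, df$dummy[-n])

# TRUE only when the previous row exists, has the same id, and dummy rose
entered <- !is.na(prev_id) & df$id == prev_id & df$dummy > prev_dummy
entered
```

Like the one-liner, this assumes the rows of each id are adjacent and in time order.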
If you want something like your function, consider
first.dum <- function(x) {
  y <- rbind(c(NA, 1), x[-nrow(x), ])
  x[, "id"] == y[, "id"] & x[, "dummy"] > y[, "dummy"]
}
Upvotes: 2