Reputation: 2744
Sorry if this title is confusing, but I can't quite figure out how to word this question, which might be why I can't seem to find the right function.
Basically, say I have this:
articles <- c(237, 278, 354, 600)
ind <- seq(1, length(articles))
DF <- data.frame(x = c(237, 237, 278, 278, 278, 354, 600, 600, 600),
                 y = rnorm(9))
I want to replace every value in DF$x with the index of that value in the articles vector. That is, I would like 237 to become 1, 278 to become 2, and so on.
I have built a for loop that does it, but my real data.frame is much much larger and I feel like there must be a more efficient way to accomplish this. Here is my for loop, so that you can see the end result that I want:
for (i in seq_along(articles)) {
  DF[DF$x == articles[i], 1] <- ind[i]
}
I looked at the replace function, but that doesn't seem to do it. Also, in reality, this is a data.table (from the {data.table} package), not a data.frame. I can obviously convert it to a data.frame if necessary, but if there is a more efficient way to do this within the data.table package, that would be awesome.
Thanks so much. Seth
Upvotes: 1
Views: 2178
Reputation: 9618
You could try:
DF$x <- as.numeric(as.factor(DF$x))
DF
  x           y
1 1  0.10610802
2 1  1.71933883
3 2  0.01788855
4 2  0.83659415
5 2  0.43162867
6 3  0.68937628
7 4 -1.47557905
8 4 -0.24103146
9 4  0.14286818
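One caveat worth checking before relying on this: factor codes are the ranks of the sorted unique values that actually occur in the column, so they only coincide with positions in articles when articles is sorted and every article appears in DF$x. A minimal sketch of the difference, using made-up data in which articles is deliberately unsorted (the names by_factor and by_match are only for illustration):

```r
# Hypothetical data: 'articles' is unsorted and 354 never occurs in x.
articles <- c(600, 237, 354, 278)
x <- c(237, 237, 278, 600)

# Factor codes rank the sorted unique values present in x:
# 237 -> 1, 278 -> 2, 600 -> 3.
by_factor <- as.numeric(as.factor(x))

# match() returns each value's position in 'articles':
# 237 -> 2, 278 -> 4, 600 -> 1.
by_match <- match(x, articles)

print(by_factor)  # 1 1 2 3
print(by_match)   # 2 2 4 1
```

With the sorted, fully represented articles vector from the question, both approaches give the same result.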
Upvotes: 2
Reputation: 172
I would do:
articles <- c(237, 278, 354, 600)
DF <- data.frame(x = c(237, 237, 278, 278, 278, 354, 600, 600, 600),
                 y = rnorm(9))
DF$x <- match(DF$x, articles)
This works because, in this case, ind is exactly what match returns: the position of each DF$x value in articles.
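Since the question mentions that the real object is a data.table, here is a sketch of how the same match call might look there. It assumes the data.table package is installed; DT is a hypothetical name for the table, built from the question's example data.

```r
library(data.table)

articles <- c(237, 278, 354, 600)
DT <- data.table(x = c(237, 237, 278, 278, 278, 354, 600, 600, 600),
                 y = rnorm(9))

# := updates the column by reference, so the table itself is not copied.
# match() is vectorised, so no loop over 'articles' is needed.
DT[, x := match(x, articles)]

print(DT$x)  # 1 1 2 2 2 3 4 4 4
```

Any value of x that does not appear in articles becomes NA, which is also a quick way to spot unexpected values.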
Upvotes: 1