seth127

Reputation: 2744

In R: replace values in a data frame with indices from a matching vector

Sorry if this title is confusing, but I can't quite figure out how to word this question, which might be why I can't seem to find the right function.

Basically, say I have this:

articles <- c(237, 278, 354, 600)
ind <- seq(1, length(articles))
DF <- data.frame(x=c(237, 237, 278, 278, 278, 354, 600, 600, 600),
             y=rnorm(9))

I want to replace each value in DF$x with the index of that value in the articles vector. That is, I would like 237 to become 1, 278 to become 2, and so on.

I have built a for loop that does it, but my real data.frame is much larger, and I feel like there must be a more efficient way to accomplish this. Here is my for loop, so that you can see the end result I want:

for (i in 1:length(articles)) {
  DF[DF$x==articles[i], 1] <- ind[i]
}

I looked at the replace function, but that doesn't seem to do it. Also, in reality, this is a data.table (from the {data.table} package), not a data.frame. I can obviously convert it to a data.frame if necessary, but if there is a more efficient way to do this within the data.table package that would be awesome.

Thanks so much. Seth

Upvotes: 1

Views: 2178

Answers (2)

DatamineR

Reputation: 9618

You could try the following, which works here because articles is sorted and every article appears in DF$x:

DF$x <- as.numeric(as.factor(DF$x))
DF
  x           y
1 1  0.10610802
2 1  1.71933883
3 2  0.01788855
4 2  0.83659415
5 2  0.43162867
6 3  0.68937628
7 4 -1.47557905
8 4 -0.24103146
9 4  0.14286818
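
The factor approach numbers values by their sorted order within the column, not by their position in articles. The sketch below uses a hypothetical, deliberately unsorted articles vector (not the one from the question) to show the difference, and how passing levels = articles to factor() ties the codes back to positions in articles.

```r
# Hypothetical data: articles is deliberately unsorted, and 354 never appears in x
articles <- c(600, 237, 354, 278)
x <- c(237, 237, 278, 600)

# Default levels are the sorted unique values of x, so codes ignore articles
by_default <- as.integer(as.factor(x))

# Explicit levels make each code the position of the value in articles
by_levels <- as.integer(factor(x, levels = articles))

# match() gives the same positions directly
by_match <- match(x, articles)
```

With data like this, by_levels and by_match agree with each other, while by_default does not.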

Upvotes: 2

questing

Reputation: 172

I would do:

articles <- c(237, 278, 354, 600)
DF <- data.frame(x=c(237, 237, 278, 278, 278, 354, 600, 600, 600),
         y=rnorm(9))
DF$x <- match(DF$x, articles)

This works because ind is just 1, 2, ..., length(articles), so the position that match() returns for each value is exactly the index you want.
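
Since the question mentions that the real data is a data.table, here is a minimal sketch of the same match() call used with data.table's := operator, which updates the column by reference instead of copying the table. The name DT is only for illustration, and this assumes the {data.table} package is installed.

```r
library(data.table)

articles <- c(237, 278, 354, 600)
DT <- data.table(x = c(237, 237, 278, 278, 278, 354, 600, 600, 600),
                 y = rnorm(9))

# Replace x in place with the position of each value in articles
DT[, x := match(x, articles)]
```

Any value of x that does not occur in articles becomes NA, so it may be worth checking anyNA(DT$x) afterwards on real data.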

Upvotes: 1
