Vinterwoo

Reputation: 3941

Create a new column based on values from other variables

I have data that looks like this:

A set of 10 character variables

Char<-c("A","B","C","D","E","F","G","H","I","J")

And a data frame that looks like this

Col1<-1:25
Col2<-c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
DF<-data.frame(Col1,Col2)

What I would like to do is add a third column to the data frame, with the logic that 1=A, 2=B, 3=C, and so on. The end result would be:

Col3<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D","D","D","E","E","E","E","E")
DF<-data.frame(Col1,Col2,Col3)

For this simple example I could go with a simple substitution like this question: Create new column based on 4 values in another column

But my actual data set is much bigger with a lot more variables than this simple example, so writing out the equivalents as in the above answer is not a possibility.

So I would like to have a bit of code that can be applied to a much larger data frame. Perhaps something that looped through all the values of Col2 and matched them to the location of Char.

1=Char[1]  2=Char[2] 3=Char[3]...... for the entire length of Col2

Or any other way that could scale up to a long monstrous data frame

Upvotes: 4

Views: 3315

Answers (4)

Jared Gossett

Reputation: 81

# Values that Col2 might have taken
levels = c(1, 2, 3, 4, 5)

# Labels for the levels in same order as levels
labels = c('A', 'B', 'C', 'D', 'E')

DF$Col3 <- factor(DF$Col2, levels = levels, labels = labels)
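
For a larger data set, typing out the levels and labels by hand may not be practical. A minimal sketch of one way around that, assuming every value in Col2 is a positive whole number no larger than length(Char), is to derive the levels from the data and use them to index Char (the name lv is only illustrative):

```r
# Rebuild the example data from the question
Char <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Derive the levels from the data instead of typing them out
lv <- sort(unique(DF$Col2))

# Use each level as an index into Char to get its label
# (assumes every value of Col2 is a valid position in Char)
DF$Col3 <- factor(DF$Col2, levels = lv, labels = Char[lv])

# as.character() turns the factor back into plain strings if needed
head(as.character(DF$Col3))
```

The result is a factor; wrap it in as.character() if a character column is preferred.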

Upvotes: 5

bramtayl

Reputation: 4024

Why not make a key and join?

library(dplyr)

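# tibble() replaces the deprecated data_frame()
letter_key = tibble(letter__ID = 1:26,
                    letter = letters)

DF %>%
  rename(letter__ID = Col2) %>%
  # Name the join column explicitly rather than relying on a guess
  left_join(letter_key, by = "letter__ID")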

This kind of thing can also be done with factors
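
For comparison, the same key-and-lookup idea can be sketched in base R with match(), without loading any packages. The key table and its column names (ID, Label) are hypothetical and only for illustration:

```r
# Rebuild the example data from the question
Char <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Hypothetical key table: one row per ID with its label
key <- data.frame(ID = seq_along(Char), Label = Char,
                  stringsAsFactors = FALSE)

# match() gives, for each Col2 value, the row of key with the same ID;
# values with no matching ID would come back as NA
DF$Col3 <- key$Label[match(DF$Col2, key$ID)]

head(DF)
```

Because the lookup goes through the ID column, this also works when the IDs are not simply 1, 2, 3 and so on.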

Upvotes: 3

asshah4

Reputation: 184

I know it may be taboo to use for loops in R, but I tried this out and it worked well.

# Iterate over every row index; length() alone would visit only the last row
for (i in seq_along(DF$Col2)) {
    DF$Col3[i] <- Char[DF$Col2[i]]
}

Would that be sufficient? I think you could also use unique(DF$Col2) or levels(factor(DF$Col2)) to get the distinct values of Col2.
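
Since indexing in R is vectorised, the element-by-element lookup in the loop can also be sketched as a single assignment. This assumes every value in Col2 is a valid position in Char:

```r
# Rebuild the example data from the question
Char <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Index Char by the whole Col2 vector at once; no loop needed
DF$Col3 <- Char[DF$Col2]

head(DF)
```

This should scale to long data frames, since the lookup happens in one vectorised call rather than once per row.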

Perhaps though I'm misunderstanding your question.

Upvotes: 3

josliber

Reputation: 44299

If you wanted to use each column as an index into some vector (I'll use letters so I can index up to 25), returning a data frame with the same dimensions as DF, you could use:

transformed <- as.data.frame(lapply(DF, function(x) letters[x]))
head(transformed)
#   Col1 Col2
# 1    a    a
# 2    b    a
# 3    c    a
# 4    d    a
# 5    e    a
# 6    f    b

You could then combine this with your original data frame with cbind(DF, transformed).
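
One thing to watch with cbind() here is that both data frames share the column names Col1 and Col2. A small sketch of the combination step that renames the transformed columns first (the "_letter" suffix is an arbitrary choice for illustration):

```r
# Rebuild the example data from the question
DF <- data.frame(Col1 = 1:25, Col2 = rep(1:5, each = 5))

# Use every column as an index into letters
transformed <- as.data.frame(lapply(DF, function(x) letters[x]))

# Rename so the combined data frame has no duplicate column names
names(transformed) <- paste0(names(DF), "_letter")

combined <- cbind(DF, transformed)
head(combined)
```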

Upvotes: 3
