Reputation: 13
I'm quite new to R and am currently stuck with my data.frame. I have a character column containing groups of varying sizes: for example, the first seven rows are "A", the next five rows are "B", and so on. I also have a vector whose length equals the total number of groups. My goal is to create a new column in which all "A" rows get the first vector value, all "B" rows get the second value, and so on.
I already tried:
values <- c("G", "H", "J", "K")
dat$col2 <- values[dat$col1]
from an earlier entry (Create new column based on 4 values in another column), and it worked. After updating R, however, it no longer does: the new column "col2" is still created, but its values are all NA instead of the corresponding vector values.
Can anyone help me out with that?
edit: example as reproducible code:
first_column <- c(rep("value_1", 6),rep("value_2",7))
df <- data.frame(first_column)
df$second_column <- c("A","B")[df$first_column]
Upvotes: 1
Views: 716
Reputation: 389325
You have character values in first_column. You cannot use character values to index the vector here: character indexing looks up element names, and c("A","B") has none, so every lookup returns NA. Use match to create the index.
df$second_column <- c("A","B")[match(df$first_column, unique(df$first_column))]
df
# first_column second_column
#1 value_1 A
#2 value_1 A
#3 value_1 A
#4 value_1 A
#5 value_1 A
#6 value_1 A
#7 value_2 B
#8 value_2 B
#9 value_2 B
#10 value_2 B
#11 value_2 B
#12 value_2 B
#13 value_2 B
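A likely explanation for why the original code worked before the update, offered as an assumption rather than a confirmed diagnosis: up to R 3.6, data.frame() converted strings to factors by default, and indexing with a factor uses its integer codes. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the column stays character. A minimal sketch of two alternatives to match (the name lookup is made up for illustration):

```r
first_column <- c(rep("value_1", 6), rep("value_2", 7))
df <- data.frame(first_column)

# Character indexing looks up names; c("A", "B") has none, so this is all NA
na_result <- c("A", "B")[df$first_column]

# Alternative 1: convert to a factor so indexing uses the integer codes.
# Note: factor levels are sorted alphabetically, whereas
# match(x, unique(x)) follows the order of first appearance.
df$second_column <- c("A", "B")[factor(df$first_column)]

# Alternative 2: name the lookup vector and index it by name
# (`lookup` is a hypothetical name, not from the question)
lookup <- c(value_1 = "A", value_2 = "B")
df$third_column <- unname(lookup[df$first_column])
```

The named-vector variant makes the mapping explicit, so it does not depend on the order in which the groups appear or sort.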
Upvotes: 0
Reputation: 1159
I think you are simply looking for ifelse.
# Named vector: one value per group label
group.sizes <- c(10, 20, 30, 40)
names(group.sizes) <- c("G", "H", "J", "K")
# Nested ifelse: check each label in turn; rows matching none get NA
df$new.column <- ifelse(df$column == "G",
                        group.sizes["G"],
                        ifelse(df$column == "H",
                               group.sizes["H"],
                               ifelse(df$column == "J",
                                      group.sizes["J"],
                                      ifelse(df$column == "K",
                                             group.sizes["K"],
                                             NA))))
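Since group.sizes is already a named vector, the same mapping can probably be done with a single lookup by name instead of nesting ifelse calls. This is a sketch with made-up example data (the data frame contents below are hypothetical, including the unmatched label "Z"):

```r
group.sizes <- c(G = 10, H = 20, J = 30, K = 40)

# Hypothetical example data; "Z" has no entry in group.sizes
df <- data.frame(column = c("G", "G", "H", "K", "J", "Z"))

# Index the named vector by label; as.character() guards against the
# column being a factor, and labels without a match yield NA
df$new.column <- unname(group.sizes[as.character(df$column)])
```

This scales to any number of groups without adding another level of nesting.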
Upvotes: 1