amisos55

Reputation: 1979

Manipulating a character vector by considering a grouping sequence in R

I asked a similar question [here][1], but this one is slightly different.

I have an id vector ids, a grouping variable group, and a factor variable factor, which holds the leading numbers before _ in the ids variable.

ids <- c("54_a", "54_b", "44_a", "44_c")
group <- c(1, 2)
factor <- c(54, 44)

The rules for the output:

  1. The row that has fixed[0] should always equal 1.
  2. For the first factor, the row that has fixed[1] should equal 1 and the row that has fixed[2] should equal 0.
  3. For the second factor, the row that has fixed[1] should equal 0 and the row that has fixed[2] should equal 1.
  4. So the number in fixed[#] is the factor number: when an id belongs to that factor, the row should equal 1, otherwise 0.
  5. The procedure needs to be repeated for both groups (G1, G2).
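
The rules above can be read as an indicator matrix with one row per id and one column per fixed[k]. Here is a minimal sketch in base R, assuming the prefix before _ always matches an entry of factor. The names prefix, pos, k, and ind are illustrative and not part of the original code. The same matrix applies to every group.

```r
ids <- c("54_a", "54_b", "44_a", "44_c")
factor <- c(54, 44)

# numeric prefix before the underscore, e.g. "54_a" -> "54"
prefix <- sub("_.*$", "", ids)

# position of each prefix within `factor` (1 = first factor, 2 = second factor)
pos <- match(prefix, as.character(factor))

# columns fixed[0], fixed[1], fixed[2]:
# 1 when k is 0 (rule 1) or k equals the id's factor position (rules 2-4)
k <- 0:length(factor)
ind <- 1 * outer(pos, k, function(p, kk) kk == 0 | kk == p)
dimnames(ind) <- list(ids, paste0("fixed[", k, "]"))
ind
```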

My desired output is below:

#for the first factor first group
(G1, 54_a, fixed[0]) = 1.0; # this is always 1
(G1, 54_a, fixed[1]) = 1.0; # 1 for factor 1
(G1, 54_a, fixed[2]) = 0.0; # 0 for factor 2

(G1, 54_b, fixed[0]) = 1.0; # this is always 1
(G1, 54_b, fixed[1]) = 1.0; # 1 for factor 1
(G1, 54_b, fixed[2]) = 0.0; # 0 for factor 2


#for the second factor
(G1, 44_a, fixed[0]) = 1.0; # this is always 1
(G1, 44_a, fixed[1]) = 0.0; # 0 for factor 1
(G1, 44_a, fixed[2]) = 1.0; # 1 for factor 2

(G1, 44_c, fixed[0]) = 1.0; # this is always 1
(G1, 44_c, fixed[1]) = 0.0; # 0 for factor 1
(G1, 44_c, fixed[2]) = 1.0; # 1 for factor 2


#for the first factor second group
(G2, 54_a, fixed[0]) = 1.0; # this is always 1
(G2, 54_a, fixed[1]) = 1.0; # 1 for factor 1
(G2, 54_a, fixed[2]) = 0.0; # 0 for factor 2

(G2, 54_b, fixed[0]) = 1.0; # this is always 1
(G2, 54_b, fixed[1]) = 1.0; # 1 for factor 1
(G2, 54_b, fixed[2]) = 0.0; # 0 for factor 2


#for the second factor
(G2, 44_a, fixed[0]) = 1.0; # this is always 1
(G2, 44_a, fixed[1]) = 0.0; # 0 for factor 1
(G2, 44_a, fixed[2]) = 1.0; # 1 for factor 2

(G2, 44_c, fixed[0]) = 1.0; # this is always 1
(G2, 44_c, fixed[1]) = 0.0; # 0 for factor 1
(G2, 44_c, fixed[2]) = 1.0; # 1 for factor 2

I was able to produce the first row for each chunk of output:

Fixed.Set.1 <- c()
for (g in seq_along(group)) {
  fixed.set.1 <- paste0(paste("(", "G", g, ", ", ids, ",", " fixed[0]) = 1", collapse = "; ", sep = ""), "; ")
  Fixed.Set.1 <- c(Fixed.Set.1, fixed.set.1)
}

> Fixed.Set.1
[1] "(G1, 54_a, fixed[0]) = 1; (G1, 54_b, fixed[0]) = 1; (G1, 44_a, fixed[0]) = 1; (G1, 44_c, fixed[0]) = 1; "
[2] "(G2, 54_a, fixed[0]) = 1; (G2, 54_b, fixed[0]) = 1; (G2, 44_a, fixed[0]) = 1; (G2, 44_c, fixed[0]) = 1; "

Any ideas on how to deal with the rest? Thanks

[1]: r manipulation a character vector for a sequence

Upvotes: 0

Views: 70

Answers (1)

tamtam

Reputation: 3671

First attempt:

library(stringr)

# define df for all ids and group combinations
group_g <- paste("G", 1:length(group), sep ="")
df <- data.frame(ids, group = rep(group_g, each = length(ids)))

# empty vector
vec <- NULL


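for (i in seq_len(nrow(df))) {
  # position of this id's numeric prefix within factor (1 for 54, 2 for 44)
  res <- which(str_extract(df[i, "ids"], "[0-9]{2,}") == factor)

  # one string per fixed[k], k = 0..length(factor):
  # "1.0" when k is 0 or k equals the factor position, otherwise "0.0"
  text <- paste("(", df[i, "group"], ", ", df[i, "ids"], ", fixed[", 0:length(factor), "]) = ",
                ifelse(res == 0:length(factor) | 0 == 0:length(factor), "1.0", "0.0"), ";", sep = "")

  vec <- c(vec, text)
}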

vec
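
A loop-free variant of the same idea, as a sketch in base R only (no stringr). It assumes every id has the form number_suffix and that the number appears in factor. The names grid, pos, val, and out are illustrative and not part of the code above.

```r
ids <- c("54_a", "54_b", "44_a", "44_c")
group <- c(1, 2)
factor <- c(54, 44)

# all (k, id, group) combinations; the first argument varies fastest,
# so rows come out ordered by group, then id, then k
grid <- expand.grid(k = 0:length(factor),
                    ids = ids,
                    group = paste0("G", seq_along(group)),
                    stringsAsFactors = FALSE)

# position of each id's numeric prefix within `factor`
pos <- match(sub("_.*$", "", grid$ids), as.character(factor))

# "1.0" for fixed[0] and for the fixed[k] matching the id's factor, else "0.0"
val <- ifelse(grid$k == 0 | grid$k == pos, "1.0", "0.0")

out <- paste0("(", grid$group, ", ", grid$ids, ", fixed[", grid$k, "]) = ", val, ";")
out
```

This builds the whole grid at once instead of growing a vector inside a loop, which tends to matter only when ids or group become long.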

Upvotes: 1
