Reputation: 441
I'm looking for a function that counts how many times each string appears in a row and returns that count in a new column named after the string. Let's take an example:
df <- data.frame(
Year = rnorm(3),
hour = rnorm(3),
LOT = rnorm(3),
S123_AA = c('ABF4576','AG4633','AWW07954'),
S135_AA = c('ABF5403','ABF4576','A64ED56'),
S1763_BB = c('BF50343','BGF4761','B76WW56'),
S173_BB = c('BF50343','BDZ4641','B917656')
)
So, in the first row we observe `BF50343` twice, and I'd like to build new columns in order to get:
df <- data.frame(
Year = rnorm(3),
hour = rnorm(3),
LOT = rnorm(3),
S123_AA = c('ABF4576','AG4633','AWW07954'),
S135_AA = c('ABF5403','ABF4576','A64ED56'),
S1763_BB = c('BF50343','BGF4761','B76WW56'),
S173_BB = c('BF50343','BDZ4641','B917656'),
ABF4576 = c(1,1,0),
AG4633 = c(0,1,0),
AWW07954 = c(0,0,1),
ABF5403 = c(1,0,0),
A64ED56 = c(0,0,1),
BF50343 = c(2,0,0),
BGF4761 = c(0,1,0),
B76WW56 = c(0,0,1),
BDZ4641 = c(0,1,0),
B917656 = c(0,0,1)
)
If you have any ideas, thanks for your time.
Upvotes: 0
Views: 87
Reputation: 51
You can use lapply to loop over the unique values of your character variables:
cols <- !(colnames(df) %in% c("Year", "hour", "LOT")) ## variables of interest
vals <- as.character(unique(unlist(df[cols]))) ## unique values
res <- do.call("cbind", lapply(vals, function(x) rowSums(df[cols] == x)))
colnames(res) <- vals
df <- cbind(df, res)
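For reference, here is a self-contained sketch of the same idea that can be pasted into a fresh session. It is a variation on the code above, not a different method: `sapply` replaces the `do.call`/`lapply` pair and names the columns itself. The names `id_cols`, `counts` and `out`, the `set.seed()` call and `stringsAsFactors = FALSE` are assumptions added for illustration and reproducibility; they are not part of the question.

```r
# Rebuild the example data from the question.
set.seed(1)  # assumption: only here to make the rnorm() columns reproducible
df <- data.frame(
  Year = rnorm(3),
  hour = rnorm(3),
  LOT = rnorm(3),
  S123_AA = c("ABF4576", "AG4633", "AWW07954"),
  S135_AA = c("ABF5403", "ABF4576", "A64ED56"),
  S1763_BB = c("BF50343", "BGF4761", "B76WW56"),
  S173_BB = c("BF50343", "BDZ4641", "B917656"),
  stringsAsFactors = FALSE  # assumption: keep the IDs as plain character
)

# Columns holding the ID strings (everything except the numeric ones).
id_cols <- setdiff(colnames(df), c("Year", "hour", "LOT"))

# Distinct ID strings, in order of first appearance (column by column).
vals <- unique(unlist(df[id_cols], use.names = FALSE))

# For each ID, count per row how many ID columns equal it.
# sapply() returns a rows-by-IDs matrix whose column names are the IDs.
counts <- sapply(vals, function(x) rowSums(df[id_cols] == x))

out <- cbind(df, counts)

# BF50343 appears twice in row 1; ABF4576 once each in rows 1 and 2.
out[, c("ABF4576", "BF50343")]
```

Note that with a single-row data frame `sapply` would simplify to a vector rather than a matrix, so the `do.call("cbind", lapply(...))` form above is the safer choice in that case.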
Upvotes: 1