Create column of each cluster

Question

Take a look at the example data:

> dput(data)
structure(list(mpg = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710", 
"Hornet 4 Drive", "Hornet Sportabout", "Valiant", "Duster 360", 
"Merc 240D", "Merc 230", "Merc 280", "Merc 280C", "Merc 450SE", 
"Merc 450SL", "Merc 450SLC", "Cadillac Fleetwood", "Lincoln Continental", 
"Chrysler Imperial", "Fiat 128", "Honda Civic", "Toyota Corolla", 
"Toyota Corona", "Dodge Challenger", "AMC Javelin", "Camaro Z28", 
"Pontiac Firebird", "Fiat X1-9", "Porsche 914-2", "Lotus Europa", 
"Ford Pantera L", "Ferrari Dino", "Maserati Bora", "Volvo 142E"
), cyl = c(6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 
4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4)), .Names = c("mpg", 
"cyl"), row.names = c(NA, -32L), class = "data.frame")

Let's assume that the second column is the cluster number. Note that in my original data the second column contains strings rather than numbers; rows with the same value belong to the same cluster. I would like to create one column per cluster and put in it the cars that belong to that cluster. The name of each column should be the same as the cluster name.

akrun · Accepted Answer

We can use table, convert the result to a data.frame, loop over the columns, and use ifelse to replace each 1 with the corresponding 'mpg' value and each 0 with NA.

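 # one row per observation, one column per cluster; cells hold 0/1 counts
 d1 <- as.data.frame.matrix(table(seq_len(nrow(data)), data$cyl))
 # replace each 1 with the car name from that row and each 0 with NA
 d1[] <- lapply(d1, function(x) ifelse(x != 0, as.character(data$mpg), NA))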
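
To make the result concrete, here is a minimal sketch of the same two steps on a made-up three-row data frame. The `toy` object and its string cluster labels are assumptions for illustration only; they are not part of the question's data.

```r
# Hypothetical toy data: three cars, two string-valued clusters
toy <- data.frame(mpg = c("Mazda RX4", "Datsun 710", "Valiant"),
                  cyl = c("six", "four", "six"),
                  stringsAsFactors = FALSE)

# One row per observation, one column per cluster; cells are 0/1 counts
d1 <- as.data.frame.matrix(table(seq_len(nrow(toy)), toy$cyl))

# Swap each 1 for the car name in that row, and each 0 for NA
d1[] <- lapply(d1, function(x) ifelse(x != 0, toy$mpg, NA))
d1
```

Each column is named after a cluster and has as many rows as the input, so a car appears in its own row under its cluster and every other cell in that row is NA.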

If we are only interested in the elements marked 1, keep the result in a list instead of a 'data.frame', since the clusters will have different lengths.

 lapply(d1, function(x) {as.character(data$mpg)[x!=0]})
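
As a possible alternative (not part of the approach above), base R's split can build the same kind of named list directly from the two columns, and the list can then be padded with NA if a rectangular result is wanted. This is only a sketch: the `cars` data frame, its labels, and the names `by_cluster`, `n_max` and `padded` are made up for illustration.

```r
# Hypothetical stand-in for the question's data, with string cluster labels
cars <- data.frame(mpg = c("Mazda RX4", "Datsun 710", "Valiant", "Duster 360"),
                   cyl = c("six", "four", "six", "eight"),
                   stringsAsFactors = FALSE)

# split() groups the car names by cluster label, giving a named list
by_cluster <- split(cars$mpg, cars$cyl)

# Pad every element with NA up to the longest length, then bind as columns
n_max <- max(lengths(by_cluster))
padded <- as.data.frame(lapply(by_cluster, `length<-`, n_max),
                        stringsAsFactors = FALSE)
padded
```

Unlike the table-based result, the padded version stacks the cars at the top of each column instead of keeping them in their original rows. Cluster names that are not syntactic R names may be altered when the list is converted to a data.frame.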
