ccamara

Reputation: 1225

Split multiple values from a single variable within a data frame

I have the following data frame, in which a single variable (Problemas.habituales) can hold several comma-separated values per row:

> read.csv("http://pastebin.com/raw.php?i=gnWRqJnY")
  Nombre.barrio                             Problemas.habituales
1         Actur Robos con violencia, Agresiones, Otros problemas
2         Actur                                  Ningún problema
3        Centro                  Robos con violencia, Agresiones
4     San Pablo                                  Ningún problema
5     San Pablo                                  Ningún problema
6      Delicias                     Hurtos o robos sin violencia

The data looks like this because it comes from an online questionnaire that accepts multiple answers to the same question. Stored this way, I cannot draw a barplot of the common problems in each neighborhood without reshaping the data frame first.

I do not know how to reshape it so that every row contains a single value of "Problemas.habituales". The result has to remain a data frame, since I need ggplot2 later on and I understand it does not accept data tables.

Upvotes: 0

Views: 142

Answers (2)

Veerendra Gadekar

Reputation: 4472

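You can do this with cSplit() from the splitstackshape package. Here DF is assumed to keep the original column names with spaces, as in the output below; if you read the file with read.csv defaults, use "Problemas.habituales" instead. The result is a data.table, so wrap it in as.data.frame() if you need a plain data frame.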

library(splitstackshape)
cSplit(DF, "Problemas habituales", ",", direction = "long")

#   Nombre barrio         Problemas habituales
#1:         Actur          Robos con violencia
#2:         Actur                   Agresiones
#3:         Actur              Otros problemas
#4:         Actur              Ningún problema
#5:        Centro          Robos con violencia
#6:        Centro                   Agresiones
#7:     San Pablo              Ningún problema
#8:     San Pablo              Ningún problema
#9:      Delicias Hurtos o robos sin violencia
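
For comparison, the same reshaping can be sketched in base R without extra packages. This is only a minimal sketch, not part of the original answer: the sample rows are typed in from the question, the names `parts` and `long` are made up for illustration, and the column names assume the dotted form that read.csv produces.

```r
# Sample rows copied by hand from the question (assumption: dotted column names)
DF <- data.frame(
  Nombre.barrio = c("Actur", "Actur", "Centro", "San Pablo", "San Pablo", "Delicias"),
  Problemas.habituales = c(
    "Robos con violencia, Agresiones, Otros problemas",
    "Ningún problema",
    "Robos con violencia, Agresiones",
    "Ningún problema",
    "Ningún problema",
    "Hurtos o robos sin violencia"
  ),
  stringsAsFactors = FALSE
)

# Split each cell on a comma followed by optional whitespace
parts <- strsplit(DF$Problemas.habituales, ",\\s*")

# Repeat each neighbourhood once per extracted value
long <- data.frame(
  Nombre.barrio = rep(DF$Nombre.barrio, lengths(parts)),
  Problemas.habituales = unlist(parts),
  stringsAsFactors = FALSE
)
```

Because the result is built with data.frame(), it is already a plain data frame and needs no conversion before plotting.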

Upvotes: 2

Roland

Reputation: 132706

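library(data.table)
# Read the data; fread keeps the original column names (with spaces)
DF <- fread("http://pastebin.com/raw.php?i=gnWRqJnY")
# Make the names syntactic, e.g. "Problemas habituales" -> "Problemas.habituales"
setnames(DF, make.names(names(DF)))
# One row per value; split on ", " so the pieces carry no leading space
DF <- DF[, .(Problemas.habituales = unlist(strsplit(Problemas.habituales, ", ",
                                                    fixed = TRUE))), by = Nombre.barrio]
# Convert back to a plain data.frame by reference
setDF(DF)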

(I assume that you don't see encoding problems with your locale.)
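
Once every row holds a single value, counting the problems per neighborhood for the barplot could look roughly like the sketch below. It is an illustration under assumptions, not part of the original answer: the long-format rows are typed in by hand, the names `long`, `counts` and `p` are hypothetical, and the ggplot2 part only runs if that package is installed.

```r
# Long-format rows, typed in by hand to keep the example self-contained
long <- data.frame(
  Nombre.barrio = c("Actur", "Actur", "Actur", "Actur", "Centro", "Centro",
                    "San Pablo", "San Pablo", "Delicias"),
  Problemas.habituales = c("Robos con violencia", "Agresiones", "Otros problemas",
                           "Ningún problema", "Robos con violencia", "Agresiones",
                           "Ningún problema", "Ningún problema",
                           "Hurtos o robos sin violencia"),
  stringsAsFactors = FALSE
)

# Cross-tabulate neighbourhood against problem, then flatten to a data frame
counts <- as.data.frame(table(long), stringsAsFactors = FALSE)

# Drop combinations that never occur
counts <- counts[counts$Freq > 0, ]

# Optional: one bar panel per neighbourhood (skipped if ggplot2 is missing)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(counts,
                       ggplot2::aes(x = Problemas.habituales, y = Freq)) +
    ggplot2::geom_col() +
    ggplot2::facet_wrap(~ Nombre.barrio) +
    ggplot2::coord_flip()
}
```

Counting up front keeps the plotting call simple; alternatively, geom_bar() can do the counting itself when given the long data frame directly.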

Upvotes: 3
