Reputation: 760
My data looks something like this:
13 EDHEC Business School
14 Columbia U and IZA
15 Yale U and Abdul Latif Jameel Poverty Action Lab
16 Carnegie Mellon U
17 Columbia U
As you can see, some of the entries contain multiple entities, which I don't want. Since the separate_rows function can't handle delimiters consisting of multiple characters (or so I gather), I plan to use gsub to turn all instances of "and" into the letter "ö" (this letter is unlikely to appear naturally in the material). I will then be able to use "ö" as the separator in separate_rows.
I start by typing:
distinctAF <- gsub("and", "ö", distinctAF)
This seems to work, but it has turned my data frame into a character vector. I tried to change it back with as.data.frame, but to no avail:
distinctAF <- as.data.frame(distinctAF)
distinctAF
1 c("MIT", "NBER", "U MI", "Cornell U", "U VA", "Harvard....
I've tried transforming the vector to a matrix as a first step, but this doesn't seem to work either:
distinctAF <- matrix(distinctAF, ncol = 1, byrow = TRUE)
I've also tried to cbind the character vector with a numerical vector with the same length, in the hope of producing a matrix. Strangely, this creates a matrix with one copy of the character vector per number in the numeric vector.
How do I turn my character vector back into a data frame (with one value per row) so that I can separate my rows as intended?
I feel like I've tried everything; this shouldn't be that hard ^^
link to file:
https://www.dropbox.com/s/d4z58w6xvmkyepy/affiliations.csv?dl=0
Upvotes: 0
Views: 332
Reputation: 370
Maybe using stringr can help.
library(data.table) # I prefer data.table to data.frame
library(stringr)    # Used for string ops
# Read the data
data <- fread("affiliations.csv", skip = 1)
colnames(data) <- c("id", "aff")
# Replace `and`s with `ö`s
data[, mod_aff := str_replace_all(aff, " and ", " ö ")]
# Check that it worked
head(data[str_detect(mod_aff, "ö")])
# id aff mod_aff
# 1: 14 Columbia U and IZA Columbia U ö IZA
# 2: 15 Yale U and Abdul Latif Jameel Poverty Action Lab Yale U ö Abdul Latif Jameel Poverty Action Lab
# 3: 21 ETH Zurich and CESifo ETH Zurich ö CESifo
# 4: 22 U Copenhagen and CESifo U Copenhagen ö CESifo
# 5: 26 U Chicago and IZA U Chicago ö IZA
# 6: 28 Bocconi U and IGIER Bocconi U ö IGIER
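As for why the data frame collapsed in the question: gsub coerces its input with as.character, so passing a whole data frame returns one deparsed string per column. Applying it to a single column avoids that. The placeholder letter may not be needed at all, since a multi-character separator such as " and " can be split on directly. Below is a minimal base R sketch of both points. The data frame df and its column names id and aff are assumptions that mirror the sample rows in the question, not the real file.

```r
# Hypothetical stand-in for the rows shown in the question
df <- data.frame(
  id  = 13:17,
  aff = c("EDHEC Business School",
          "Columbia U and IZA",
          "Yale U and Abdul Latif Jameel Poverty Action Lab",
          "Carnegie Mellon U",
          "Columbia U"),
  stringsAsFactors = FALSE
)

# gsub on one column (not on the whole data frame) keeps df a data frame
df$mod_aff <- gsub(" and ", " ö ", df$aff, fixed = TRUE)

# Split directly on the literal " and "; each element of parts is a character vector
parts <- strsplit(df$aff, " and ", fixed = TRUE)

# Repeat each id once per piece so every affiliation gets its own row
long <- data.frame(
  id  = rep(df$id, lengths(parts)),
  aff = unlist(parts),
  stringsAsFactors = FALSE
)

print(long)
```

Note that this also splits any institution whose own name contains " and ", so it is worth checking the result by eye. If you would rather stay with tidyr, separate_rows takes a regular expression in its sep argument, so something like separate_rows(df, aff, sep = " and ") should work without the "ö" detour.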
Upvotes: 0