How to make a certain part of the row a column and change its name for multiple rows?

Question

I converted an array to a data frame and added the column names. A sample of the data frame is shown below. I would like the class labels (e.g. "Class: Negative") to be in a column rather than in the row names, and to change "Class: Negative1" to "Negative", "Class: Neutral1" to "Neutral", and so on.

I am trying to aggregate the data, and that is difficult without these changes. How can I make the alterations outlined above in R?

results <- do.call(rbind.data.frame,result2)
colnames(results) = c("Sensitivity", "Specificity")
results

Current output:

                  Sensitivity Specificity
Class: Negative    0.86051081   0.8934176
Class: Neutral     0.51345486   0.8739516
Class: Positive    0.79404812   0.8982959
Class: Negative1   0.64734774   0.9644023
Class: Neutral1    0.78298611   0.6420487
Class: Positive1   0.59282436   0.9338653

I would like to achieve this as an output:

   Class       Sensitivity Specificity
   Negative    0.86051081   0.8934176
   Neutral     0.51345486   0.8739516
   Positive    0.79404812   0.8982959
   Negative    0.64734774   0.9644023
   Neutral     0.78298611   0.6420487
   Positive    0.59282436   0.9338653

Brendan A. · Accepted Answer

@RAB's comment is a neat and efficient way to get at most of the solution, but I think there are two additional steps needed, so here's an alternative:

results <- cbind(sub(".*?: (.*?)\\d*$", "\\1", rownames(results)), results)
names(results)[1] <- "Class"
rownames(results) <- c()

The first line prepends a new column to the data frame. Its values come from a regex replacement on the row names that gets rid of "Class: " and any trailing number. I opted for sub instead of gsub since your example suggests that there is only one substitution per line, but the two should perform identically here.
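To see what the replacement does in isolation, here is a minimal sketch that applies the same pattern to a plain character vector. The vector x is made up for illustration, and perl = TRUE is my addition (it is not in the answer's code) so that the lazy quantifiers follow PCRE rules.

```r
# Hypothetical row names in the same shape as the question's sample
x <- c("Class: Negative", "Class: Neutral1", "Class: Positive1")

# ".*?: "   consumes the prefix up to and including "Class: "
# "(.*?)"   captures the label lazily
# "\\d*$"   matches any trailing digits at the end of the string
# "\\1"     replaces the whole match with just the captured label
# perl = TRUE is an assumption added here, not part of the original answer
labels <- sub(".*?: (.*?)\\d*$", "\\1", x, perl = TRUE)
print(labels)
```

Because the pattern is anchored with $ and spans the whole string, there is only one match per element, which is why sub and gsub behave the same here.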

The second line then replaces the name of your new column with the desired label "Class". Note that the first two lines could be combined like this: results <- cbind(data.frame(Class = sub(".*?: (.*?)\\d*$", "\\1", rownames(results))), results). It's just a question of style/readability.

The final line gets rid of the original rownames by replacing them with an empty vector. Doing this will clean up the output if you print the dataframe but has no effect on any further analysis.
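Putting the three steps together, the following is a rough end-to-end sketch rather than a definitive implementation. It rebuilds the sample from the question by hand (the original result2 object is not available), passes perl = TRUE to sub (my addition), and finishes with an aggregate call. Taking the mean per class is only an assumption, since the question does not say which aggregation is wanted.

```r
# Rebuild the sample data frame from the question (values copied from the post)
results <- data.frame(
  Sensitivity = c(0.86051081, 0.51345486, 0.79404812,
                  0.64734774, 0.78298611, 0.59282436),
  Specificity = c(0.8934176, 0.8739516, 0.8982959,
                  0.9644023, 0.6420487, 0.9338653),
  row.names = c("Class: Negative", "Class: Neutral", "Class: Positive",
                "Class: Negative1", "Class: Neutral1", "Class: Positive1")
)

# Step 1: prepend the cleaned row names as a new first column
# (perl = TRUE is an assumption added here, not part of the original answer)
results <- cbind(sub(".*?: (.*?)\\d*$", "\\1", rownames(results), perl = TRUE),
                 results)

# Step 2: name the new column
names(results)[1] <- "Class"

# Step 3: drop the old row names so default row numbers are printed instead
rownames(results) <- NULL
print(results)

# Assumed aggregation: mean of each metric per class
means <- aggregate(cbind(Sensitivity, Specificity) ~ Class,
                   data = results, FUN = mean)
print(means)
```

With Class as an ordinary column, any grouping function can use it directly. Swap mean for whatever summary you actually need.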
