Stricken1

Reputation: 63

How to convert factors in a data frame to prespecified colour names in R?

I'm trying to generate heatmaps for gene expression data which also have columns representing clinical data as can be seen here: https://www.biostars.org/p/18211/

My problem is, both heatmap.plus and heatmap.3 require that the matrix specifying the colours in the heatmap actually be written as colour names. They won't take normal factors in a column and convert them to colours.

This is what my data frame "df" looks like:

sample_name     gender     chemotherapy     clinical_subtype
sample_01       M          alk              1
sample_02       F          tmz              2
sample_03       M          rad              2
sample_04       M          rad.tmz          4
sample_05       F          tmz              3

I can generate a data frame of the levels of each factor matched to a specific colour using the following code:

library(RColorBrewer)  # provides brewer.pal()
gender_colors <- with(df, data.frame(row.names = levels(gender), color = brewer.pal(nlevels(gender), name = 'Set2')))

This gives the following (three colours rather than two, because samples with missing gender produce an empty level):

  color
  #66C2A5
F #FC8D62
M #8DA0CB

Is there any way to use this data frame to index the original factors, replacing them with their corresponding colour so that my original frame would look something like the following?

sample_name     gender     chemotherapy     clinical_subtype
sample_01       #8DA0CB    #FC8D62          #FDAE61
sample_02       #FC8D62    #66C2A5          #FFFFBF
sample_03       #8DA0CB    #C2294A          #FFFFBF
sample_04       #8DA0CB    #FA9856          #A6D96A
sample_05       #FC8D62    #66C2A5          #1A9641

If there is an easier way that doesn't require building an intermediate data frame matching each factor level to a colour, I would prefer that, but any help with this issue would be greatly appreciated!

      color
1     #D7191C
2     #FDAE61
3     #FFFFBF
4     #A6D96A
5     #1A9641

Upvotes: 1

Views: 1523

Answers (1)

Tim Biegeleisen

Reputation: 522741

I would recommend keeping it simple and using merge() to join the various color data frames onto your main df data frame. For example, to join the gender color data frame to df you can use this:

# Name the colour column color.gender so it is identifiable after the join
color.gender <- data.frame(gender=c("", "F", "M"),
                           color.gender=c("#66C2A5", "#FC8D62", "#8DA0CB"))
df <- merge(df, color.gender, by="gender")

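This would leave df with the following content (merge() sorts rows by the key and moves the gender column to the front by default, so the actual row and column order will differ):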

sample_name     gender     chemotherapy     clinical_subtype     color.gender
sample_01       M          alk              1                    #8DA0CB
sample_02       F          tmz              2                    #FC8D62
sample_03       M          rad              2                    #8DA0CB
sample_04       M          rad.tmz          4                    #8DA0CB
sample_05       F          tmz              3                    #FC8D62
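
For anyone who wants to try this, here is a self-contained sketch using the sample rows from the question. The name `merged` and the `match()` step that restores the original sample order are my own additions for illustration, not part of the answer above.

```r
# Sample data copied from the question.
df <- data.frame(
  sample_name      = sprintf("sample_%02d", 1:5),
  gender           = c("M", "F", "M", "M", "F"),
  chemotherapy     = c("alk", "tmz", "rad", "rad.tmz", "tmz"),
  clinical_subtype = c(1, 2, 2, 4, 3)
)

# Lookup table: one row per gender level, including the empty level.
color.gender <- data.frame(
  gender       = c("", "F", "M"),
  color.gender = c("#66C2A5", "#FC8D62", "#8DA0CB")
)

merged <- merge(df, color.gender, by = "gender")

# merge() sorts by the key and puts the key column first, so restore
# the original row order and column order explicitly.
merged <- merged[match(df$sample_name, merged$sample_name),
                 c(names(df), "color.gender")]
rownames(merged) <- NULL
```

Keeping the row order matters here because the colour rows have to line up with the samples in the expression matrix.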

If the heatmap function needs only the colour columns, or expects particular column names, you can subset or rename the columns of the merged data frame afterwards.
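
As one possible way to do that reworking without the intermediate lookup data frames, the sketch below indexes a palette by each factor's integer codes and collects the results into a character matrix. The names `factor_cols` and `color_matrix` are hypothetical, and the palette (base R's `hcl.colors()`, available from R 3.6) is an assumption chosen to avoid extra packages; swap in `brewer.pal()` or hand-picked colours as needed.

```r
# Sample data copied from the question, with the clinical columns as factors.
df <- data.frame(
  sample_name      = sprintf("sample_%02d", 1:5),
  gender           = factor(c("M", "F", "M", "M", "F")),
  chemotherapy     = factor(c("alk", "tmz", "rad", "rad.tmz", "tmz")),
  clinical_subtype = factor(c(1, 2, 2, 4, 3))
)

factor_cols <- c("gender", "chemotherapy", "clinical_subtype")

# For each column, build one colour per level and pick the colour for each
# sample by the factor's integer code. Indexing by code (rather than by
# level name) also works when a level is the empty string.
color_matrix <- sapply(df[factor_cols], function(f) {
  f <- as.factor(f)
  palette <- hcl.colors(nlevels(f), palette = "Set 2")  # assumed palette
  palette[as.integer(f)]
})
rownames(color_matrix) <- as.character(df$sample_name)
```

The result has one row per sample and one column per clinical variable, with every cell holding a colour string, which is the shape the question says heatmap.plus and heatmap.3 require.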

Upvotes: 1
