Annie Wilt
Reputation: 1

Missing categorical annotations in R pheatmap() despite no missing values in data

I am trying to make a clustered heatmap of cytokine values with pheatmap in R. I also want to add row annotations based on a categorical variable with 4 levels. Even though I made sure there is no missing data, multiple rows have no colored annotation, as seen in the picture:

[image: heatmap in which multiple rows have no colored annotation]

My original data set had 380 samples, but I cleaned the data to include only rows with no missing values, which brought me down to 154 samples with 41 variables. Here are the variable names:

[1] "tbm_category" "tnfa"         "il6"          "il33"         "il3"          "il8"          "il7"          "ip10"         "il10"  
[10] "egf"          "vegf"         "grob"         "il1b"         "ifng"         "il1ra"        "mip3a"        "il12"         "mip1a"  
[19] "il31"         "mip1b"        "il1a"         "il4"          "mip3b"        "il2"          "groa"         "fractalkine"  "fgfbasic"  
[28] "eotaxin"      "il15"         "il5"          "gcsf"         "pdgfaa"       "mcp1"         "ifna"         "il21"         "trail"  
[37] "tnfsf5"       "il23"         "flt3ligand"   "il18"         "granzymeb"

The tbm_category is a categorical variable with 4 different groups and these are the options and their counts:

Definite TBM      Not TBM Possible TBM Probable TBM
          26           57           52           19

The data is called cleaned_cyto_ordered and I ordered it based on tbm_category.

I also grouped the cytokines together for ease of coding:

numeric_variables <- names(cleaned_cyto_ordered)[names(cleaned_cyto_ordered) != "tbm_category"]

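Here is my check to ensure there are no missing values:

any_missing <- anyNA(cleaned_cyto_ordered)  # definition not shown in the original post; anyNA() is an assumption
if (any_missing) {print("There are missing values in the dataset.")} else {print("There are no missing values in the dataset.")}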
[1] "There are no missing values in the dataset."

My goal is to cluster the cytokine data and then add colored annotations based on tbm_category. Here is the code I used to create the attached heatmap (I log10-transformed the cytokine values because they tend to be extremely small and are hard to analyze meaningfully otherwise):

pheatmap(log10(cleaned_cyto_ordered[, numeric_variables]),
         annotation_row = data.frame(Category= cleaned_cyto_ordered$tbm_category),
         scale = "row",
         cluster_rows = FALSE,
         show_rownames = FALSE, clustering_distance_cols = "correlation")

I have been playing around with this all day and have given up hope of fixing the missing categorical annotations. Any help is greatly appreciated.
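
One possible cause, offered only as a sketch: pheatmap matches annotation_row to the heatmap rows by row name, and data.frame(Category = ...) gets default row names "1", "2", ... If cleaned_cyto_ordered kept its row names from the original 380-sample data after filtering and ordering, only the rows whose names happen to fall in 1..154 would receive a color. The toy data frame below (toy, with made-up row names and values) is an assumption used to illustrate that mismatch; it is not the real data.

```r
# Toy stand-in for cleaned_cyto_ordered (hypothetical values and row names).
# Filtering a larger data set keeps the ORIGINAL row names, so they are
# no longer "1".."n".
set.seed(1)
toy <- data.frame(
  tbm_category = rep(c("Definite TBM", "Not TBM"), each = 3),
  tnfa = runif(6, 0.001, 1),
  il6  = runif(6, 0.001, 1),
  row.names = c("2", "5", "9", "140", "201", "377")
)
numeric_variables <- setdiff(names(toy), "tbm_category")

# Annotation built as in the question: row names default to "1".."6".
ann_default <- data.frame(Category = toy$tbm_category)

# Annotation whose row names are copied from the data.
ann_named <- data.frame(Category = toy$tbm_category,
                        row.names = rownames(toy))

# Count how many heatmap rows can find their annotation by row name.
matched_default <- sum(rownames(toy) %in% rownames(ann_default))
matched_named   <- sum(rownames(toy) %in% rownames(ann_named))
print(c(default = matched_default, named = matched_named))

# Draw the heatmap with the named annotation, if pheatmap is installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(log10(toy[, numeric_variables]),
                     annotation_row = ann_named,
                     cluster_rows = FALSE,
                     show_rownames = FALSE)
}
```

If the same mismatch applies to the real data, passing row.names = rownames(cleaned_cyto_ordered) when building the annotation data frame should color every row. Comparing rownames(cleaned_cyto_ordered) against 1:154 is a quick way to check whether this is actually what is happening.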

Upvotes: 0

Views: 63

Answers (0)