Axel Fischer

Reputation: 167

ggplot: Heat map from 2D frequency histogram

My input data looks like this:

AA  36C     37T   38T   
36C 17935   3349  16843 
37T 3349    4     5690  
38T 16843   5690  11    

I would like to visualize the data as n x n tiles, where the colour of each tile reflects the count for that pair of contacts: tile (0,0) is coloured by the count for 36C-36C (17935 here), tile (0,1) by the count for 36C-37T, and so on. I suppose geom_tile should do the job, but I don't know how to use it.

When I read in the data I get

data = read.table("test.tbl", header = TRUE)

> str(data)
'data.frame':   3 obs. of  4 variables:
$ AA  : Factor w/ 3 levels "36C","37T","38T": 1 2 3
$ X36C: int  17935 3349 16843
$ X37T: int  3349 4 5690
$ X38T: int  16843 5690 11

After that I don't know how to proceed and tell ggplot to plot the matrix. Thanks for any help.

Upvotes: 1

Views: 1394

Answers (1)

Axel Fischer

Reputation: 167

Sorry, should have done more research before asking.

It worked after melting the data frame into long format, with one row per pair of contacts.

> library(reshape2)
> data_new <- melt(data)
Using AA as id variables
> data_new
   AA variable value
1 36C     X36C 17935
2 37T     X36C  3349
3 38T     X36C 16843
4 36C     X37T  3349
5 37T     X37T     4
6 38T     X37T  5690
7 36C     X38T 16843
8 37T     X38T  5690
9 38T     X38T    11

With the melted data frame stored as data_new (data_new <- melt(data)), ggplot(data_new, aes(x = variable, y = AA)) + geom_tile(aes(fill = value)) then gives the desired result.
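For anyone who wants to try this without the test.tbl file, here is a minimal self-contained sketch of the same approach. It is not exactly what I ran: the table is pasted inline through the text argument of read.table, and check.names = FALSE is my own addition so that the column names stay "36C" rather than "X36C". The names txt and p are just placeholders for this example.

```r
library(reshape2)
library(ggplot2)

# Inline copy of the table from the question, so no test.tbl file is needed.
# check.names = FALSE (an assumption, not in the original call) keeps the
# headers as "36C" instead of the syntactically valid "X36C".
txt <- "AA  36C     37T   38T
36C 17935   3349  16843
37T 3349    4     5690
38T 16843   5690  11"
data <- read.table(text = txt, header = TRUE, check.names = FALSE)

# Long format: one row per (AA, variable) pair, with the count in `value`.
data_new <- melt(data, id.vars = "AA")

# One tile per pair, filled according to the count.
p <- ggplot(data_new, aes(x = variable, y = AA, fill = value)) +
  geom_tile()
print(p)
```

Since the counts range from 4 to 17935, a log fill scale such as scale_fill_gradient(trans = "log10") might make the small values easier to tell apart, but that is a matter of taste.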

Upvotes: 2
