Reputation: 5576
I'm using the melt function from the R package reshape, and producing a dual-bar chart (side by side) to present values for two distinct types of genetic conservation for a few dozen species.
I can order this list while it is "wide", e.g. with arrange(species.table, desc(miR), species):
... species miR snoR
1 Cow 1.0000000 0.9925373
2 Sheep 1.0000000 0.9925373
3 Cat 0.9967914 1.0000000
4 Dog 0.9967914 1.0000000
5 Panda 0.9967914 1.0000000
6 White_rhinoceros 0.9967914 1.0000000
7 Alpaca 0.9775401 0.9626866
8 Guinea_Pig 0.9775401 0.9776119
9 Pika 0.9775401 0.9626866
10 Rat 0.9775401 0.9776119
11 Mouse 0.9358289 0.9701493
12 Horse 0.9294118 0.9726368
13 Pig 0.9294118 0.9726368
14 Chinese_Hamster 0.9155080 0.9527363
...
But the melted (long) data comes out with the two conservation types on different lines, separating the species. How can I get the species 'paired' in the list, rather than e.g.:
... species variable value
1 Cat snoR 1.0000000
2 Cow miR 1.0000000
3 Dog snoR 1.0000000
4 Panda snoR 1.0000000
5 Sheep miR 1.0000000
6 White_rhinoceros snoR 1.0000000
7 Cat miR 0.9967914
8 Dog miR 0.9967914
9 Panda miR 0.9967914
10 White_rhinoceros miR 0.9967914
11 Cow snoR 0.9925373
12 Sheep snoR 0.9925373
13 Elephant snoR 0.9875622
14 Rabbit snoR 0.9875622
15 Shrew snoR 0.9875622
16 Tenrec snoR 0.9875622
17 Guinea_Pig snoR 0.9776119
18 Rat snoR 0.9776119
...
My intuition is that I would have to melt the data row by row to achieve this and concatenate the resulting row pairs with rbind (or some more efficient non-base R equivalent). Is there a more legitimate built-in way to do that, i.e. to make the melted data aware that I want a species-by-species list and keep the same species adjacent?
e.g. something more like:
... species variable value
1 Cow miR 1.0000000
2 Cow snoR 0.9925373
3 Dog snoR 1.0000000
4 Dog miR 0.9967914
5 Panda snoR 1.0000000
6 Panda miR 0.9967914
7 Sheep miR 1.0000000
8 Sheep snoR 0.9925373
9 White_rhinoceros miR 0.9967914
10 White_rhinoceros snoR 1.0000000
...
Upvotes: 0
Views: 40
Reputation: 24955
Starting from your wide data (called dat here), I think you want to sort by the sum of the two expression values for each species, then by species name so that the two rows for each species stay adjacent:
library(dplyr)
library(tidyr)
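dat %>%
  mutate(new = miR + snoR) %>%                  # per-species sort key
  gather(type, expression, -species, -new) %>%  # wide to long, keeping the key
  arrange(desc(new), species, type) %>%         # key first, then pair up by species
  select(-new)                                  # drop the helper column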
species type expression
1 Cat miR 0.9967914
2 Cat snoR 1.0000000
3 Dog miR 0.9967914
4 Dog snoR 1.0000000
5 Panda miR 0.9967914
6 Panda snoR 1.0000000
7 White_rhinoceros miR 0.9967914
8 White_rhinoceros snoR 1.0000000
9 Cow miR 1.0000000
10 Cow snoR 0.9925373
11 Sheep miR 1.0000000
12 Sheep snoR 0.9925373
13 Guinea_Pig miR 0.9775401
14 Guinea_Pig snoR 0.9776119
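If you would rather keep the exact wide ordering from the question (desc(miR), then species) instead of sorting by the sum, one option is to record each species' position before reshaping and sort on that afterwards. This is only a minimal sketch: it assumes tidyr 1.0.0 or later for pivot_longer(), the four-row species.table below is a hypothetical subset of the question's data, and pos is a helper column name made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset of the wide table from the question
species.table <- data.frame(
  species = c("Cat", "Cow", "Dog", "Sheep"),
  miR     = c(0.9967914, 1.0000000, 0.9967914, 1.0000000),
  snoR    = c(1.0000000, 0.9925373, 1.0000000, 0.9925373),
  stringsAsFactors = FALSE
)

paired <- species.table %>%
  arrange(desc(miR), species) %>%   # the wide ordering used in the question
  mutate(pos = row_number()) %>%    # remember each species' position (helper column)
  pivot_longer(c(miR, snoR), names_to = "variable", values_to = "value") %>%
  arrange(pos, variable) %>%        # species stay adjacent, miR before snoR
  select(-pos)                      # drop the helper column

print(paired)
```

Because the sort key is the row position rather than the values, ties in miR or snoR cannot pull the two rows of a species apart.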
Upvotes: 1