spiral01

Reputation: 545

Phylogenetics in R: collapsing descendant tips of an internal node

I have several thousand gene trees that I am trying to prepare for analysis with codeml. The tree below is a typical example. I want to automate the collapsing of tips or nodes that appear to be duplicates. For instance, the descendants of node 56 are tips 26, 27, 28 and so on, all the way to 36. All of these other than tip 28 appear to be duplicates. How can I collapse them into a single tip, leaving just tip 28 and one representative of the other tips as the descendants of node 56?

I know how to do this manually, one by one, but I am trying to automate the process so that a function can identify which tips need to be collapsed and then reduce them to a single representative tip. So far I have been looking at the cophenetic function, which calculates the pairwise distances between the tips. However, I am not sure how to use that information to collapse tips.

Here is the Newick string for the tree below:

((((11:0.00201426,12:5e-08,(9:1e-08,10:1e-08,8:1e-08)40:0.00403036)41:0.00099978,7:5e-08)42:0.01717066,(3:0.00191517,(4:0.00196859,(5:1e-08,6:1e-08)71:0.00205168)70:0.00112995)69:0.01796015)43:0.042592645,((1:0.00136179,2:0.00267375)44:0.05586907,(((13:0.00093161,14:0.00532243)47:0.01252989,((15:1e-08,16:1e-08)49:0.00123243,(17:0.00272478,(18:0.00085725,19:0.00113572)51:0.01307761)50:0.00847373)48:0.01103656)46:0.00843782,((20:0.0020268,(21:0.00099593,22:1e-08)54:0.00099081)53:0.00297097,(23:0.00200672,(25:1e-08,(36:1e-08,37:1e-08,35:1e-08,34:1e-08,33:1e-08,32:1e-08,31:1e-08,30:1e-08,29:1e-08,28:0.00099682,27:1e-08,26:1e-08)58:0.00200056,24:1e-08)56:0.00100953)55:0.00210137)52:0.01233888)45:0.01906982)73:0.003562205)38;

[Image: plot of the example tree]

Upvotes: 3

Views: 1061

Answers (1)

C_Z_

Reputation: 7816

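One option is to drop every tip whose terminal branch length is below a threshold. Note that this drops all such tips, so a clade made up entirely of near-zero-length tips is removed as a whole rather than reduced to one representative.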

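library(ape)

# Drop every tip whose terminal branch is shorter than `thres`.
drop_dupes <- function(tree, thres = 1e-5) {
  tip_rows <- which(tree$edge[, 2] <= Ntip(tree))  # edge rows that end in a tip
  short <- tree$edge.length[tip_rows] < thres
  # Index by tip number, so the result does not depend on edge-row order
  drop.tip(tree, tree$edge[tip_rows, 2][short])
}

# `tree` is a phylo object, e.g. as returned by read.tree()
plot(drop_dupes(tree))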

[Image: plot of the tree returned by drop_dupes]
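
If the goal is to keep one representative per group instead of dropping every short tip, a variation along the following lines might work. This is only a sketch under an assumption of mine: "duplicates" are taken to be tips that share the same parent node and all have a terminal branch shorter than the threshold. The name collapse_dupes and the toy tree are made up for illustration and are not part of ape.

```r
library(ape)

# Hypothetical helper (not part of ape): under each internal node, keep the
# first tip with a near-zero terminal branch and drop the other such tips.
collapse_dupes <- function(tree, thres = 1e-5) {
  n_tip <- Ntip(tree)
  # Edge rows that lead to a tip and are shorter than the threshold
  is_short_tip <- tree$edge[, 2] <= n_tip & tree$edge.length < thres
  parents <- tree$edge[is_short_tip, 1]
  tips <- tree$edge[is_short_tip, 2]
  # duplicated() flags every short tip after the first one under each parent
  to_drop <- tips[duplicated(parents)]
  if (length(to_drop) == 0) return(tree)
  drop.tip(tree, to_drop)
}

# Toy tree: a and b are near-identical siblings; e is short but has no
# short sibling, so it is kept.
toy <- read.tree(text = "((a:1e-08,b:1e-08,c:0.001):0.002,(d:0.003,e:1e-08):0.001);")
collapsed <- collapse_dupes(toy)
print(collapsed$tip.label)
```

This only merges tips that are direct siblings. Tips separated by a near-zero internal branch would need an extra step, for example calling di2multi() with a tolerance first, or grouping tips by their cophenetic() distances.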

Upvotes: 3
