branch.lizard
branch.lizard

Reputation: 595

Loop or Vectorize this function for Phylogenetic Tree Clade Dropping

I have a process which I would like to repeat a known number of times, but with a catch. The first iteration should be with the original dataset, then the next should be with the result of the first, the next with the result of the second, ......

Some background: the dataset is of type phylo, so the append function inside a for loop doesn't make sense to me. Below is the actual code:

library(ape)
library(geiger)

clade.dropper <- function(phy, drop.tips) {
new.phy <- drop.tip(phy, tips(phy, drop.tips[1]))
new.phy <- drop.tip(new.phy, tips(new.phy, drop.tips[2]))
new.phy <- drop.tip(new.phy, tips(new.phy, drop.tips[3]))
new.phy <- drop.tip(new.phy, tips(new.phy, drop.tips[4]))
new.phy <- drop.tip(new.phy, tips(new.phy, drop.tips[5]))
new.phy
}

I would like to be able to prevent the hard coding of the above and somehow, loop it for a given list containing tip names of a phylogenetic tree to drop.

Thanks!

Upvotes: 0

Views: 367

Answers (2)

jeremycg
jeremycg

Reputation: 24945

There are a couple of troubles with your approach - Trees nodes are redrawn every time you subset it - if you are subsetting by node numbers, each time you remove a node, on the next trim, your next node might not be the one you wanted (I have made this painful mistake before).

If you are subsetting by names, you might be ok, but most trees won't have node names.

What we can do is make a list of every tip you want to remove at once, and then trim all at once.

Using geiger and ape:

library(geiger)
library(ape)

first load a tree:

geo <- get(data(geospiza))

the tips function isnt properly vectorized, let's fix that:

vtips <- Vectorize(tips, "node") 

Now we can drop them all at once:

todrop <- c(18,20)
drop.tip(geo$phy, unlist(vtips(geo$phy, todrop)))

For your example:

drop.tip(phy, unlist(vtips(phy, cladenum)))

Upvotes: 3

bgoldst
bgoldst

Reputation: 35324

Well, I'm not exactly familiar with all this phylogeny stuff, but I believe Reduce() is what you're looking for. Demo on the simple vector example:

A <- c(1,2,3,4,5);
Reduce(function(current,operand) current+operand,rep(1,6),A);
## [1]  7  8  9 10 11

And here's how it would work on phylogenetic data, stealing from the example code on the help page ?phylo.clades:

library(ape);
library(geiger);
sal <- get(data(caudata));
tax <- cbind(sal$tax[,c('subfamily','family','suborder')],order='Caudata');
tphy <- phylo.lookup(tax,ncores=2);
clade.num <- 1:5;
Reduce(function(phy,clade.num) drop.tip(phy,tips(phy,clade.num)),clade.num,tphy);
##
## Phylogenetic tree with 613 tips and 19 internal nodes.
##
## Tip labels:
##   Batrachoseps_attenuatus, Batrachoseps_diabolicus, Batrachoseps_gavilanensis, Batrachoseps_incognitus, Batrachoseps_luciae, Batrachoseps_major, ...
## Node labels:
##   Caudata, Salamandroidea, Plethodontidae, Bolitoglossinae, Plethodontinae, Spelerpinae, ...
##
## Unrooted; no branch lengths.

Upvotes: 0

Related Questions