Reputation: 474
I am using match.phylo.data() in picante to match an OTU table, taxonomy table, and tree tip labels in R. I can save the output as a list with no errors or warnings (typically I get a warning about dropped tips if there are any), but when I use the output for diversity metrics, I only get warnings and errors saying that no tip names match.
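library(picante)
# Match the tree tips to the OTU table (OTU IDs are the row names)
match.phylo.otu <- match.phylo.data(tree, otu)
# Faith's PD; this is the call that produces the warnings below
PD <- pd(samp = match.phylo.otu$data, tree = match.phylo.otu$phy,
         include.root = FALSE)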
But after trying to calculate Faith's PD, I get the output below, and the PD column is all 0s and nulls:
There were 50 or more warnings (use warnings() to see the first 50)
...
50: In drop.tip.phylo(tree, treeabsent) :
drop all tips of the tree: returning NULL
I've manually re-created the tree twice, once using the R package seqinr and again using plain old Unix tools; both produce the problem above. I can also manually copy and paste tip labels across datasets and find the corresponding information.
Here I've generated a subset of each dataset. (The fact that I can subset by tip labels seems to contradict the problem I'm having with match.phylo.data().)
ex.otu <- otu[1:5,1:5]
ex.tax <- tax[rownames(tax) %in% rownames(ex.otu),]
library(castor)
ex.tree <- get_subtree_with_tips(tree,only_tips = rownames(ex.otu))
ex.tree <- ex.tree$subtree
> dput(ex.otu)
structure(list(NASQAN2015.147.348 = c(0L, 87L, 0L, 105L, 0L),
NASQAN2015.148.348 = c(0L, 57L, 0L, 23L, 21L), NASQAN2015.161.348 = c(17L,
77L, 0L, 146L, 0L), NASQAN2015.162.348 = c(0L, 38L, 0L, 95L,
0L), NASQAN2015.163.348 = c(0L, 39L, 0L, 7L, 0L)), row.names = c("ee866b92e722c35819b112aadc4ac885",
"afc0eeec83a181be740331928d883362", "294c12d6881a2ed67aa1557cda9889ff",
"466c2c4cb06ba39543c40a74c027008a", "fd0b270adb11f2781450c2e057e50f07"
), class = "data.frame")
> dput(ex.tax)
structure(list(Kingdom = c("d__Bacteria", "d__Bacteria", "d__Bacteria",
"d__Bacteria", "d__Bacteria"), Phylum = c("p__Chloroflexi", "p__Actinobacteriota",
"p__Actinobacteriota", "p__Planctomycetota", "p__Cyanobacteria"
), Class = c("c__Anaerolineae", "c__Actinobacteria", "c__Acidimicrobiia",
"c__Planctomycetes", "c__Cyanobacteriia"), Order = c("o__Anaerolineales",
"o__Frankiales", "o__Microtrichales", "o__Planctomycetales",
"o__Chloroplast"), Family = c("f__Anaerolineaceae", "f__Sporichthyaceae",
"f__Ilumatobacteraceae", "f__Rubinisphaeraceae", "f__Chloroplast"
), Genus = c("g__uncultured", "g__Candidatus_Planktophila", "g__CL500-29_marine_group",
"g__uncultured", "g__Chloroplast"), Species = c("s__unclassified_g__uncultured",
"s__unclassified_g__Candidatus_Planktophila", "s__bacterium_enrichment",
"s__unclassified_g__uncultured", "s__Guillardia_theta")), row.names = c("294c12d6881a2ed67aa1557cda9889ff",
"466c2c4cb06ba39543c40a74c027008a", "afc0eeec83a181be740331928d883362",
"ee866b92e722c35819b112aadc4ac885", "fd0b270adb11f2781450c2e057e50f07"
), class = "data.frame")
> dput(ex.tree)
structure(list(Nnode = 4, tip.label = c("afc0eeec83a181be740331928d883362",
"466c2c4cb06ba39543c40a74c027008a", "fd0b270adb11f2781450c2e057e50f07",
"294c12d6881a2ed67aa1557cda9889ff", "ee866b92e722c35819b112aadc4ac885"
), node.label = c("0.954", "0.678", "0.813", "0.942"), edge = structure(c(6L,
6L, 7L, 7L, 8L, 8L, 9L, 9L, 5L, 7L, 8L, 9L, 4L, 3L, 2L, 1L), dim = c(8L,
2L)), edge.length = c(0.328657635, 0.016960962, 0.053311576,
0.123171583, 0.177567591, 0.273360134, 0.177939166, 0.119027034
), root = 6, root.edge = 0.020729803), class = "phylo")
Update: I can even make a "new" tree by subsetting my original tree with the row names of otu. This "new" tree is exactly the same as my original tree, but when it is used in match.phylo.data() and pd(), the same error occurs.
Upvotes: 1
Views: 66
Reputation: 474
Heyo!
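Turns out I needed to transpose the match.phylo.otu$data object. match.phylo.data() returns the table with taxa in rows, whereas pd() expects samples in rows and taxa in columns, so it was looking for tip names among the sample names and found none. Below is an example that calculates Faith's PD.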
riverPD <- pd(samp = t(match.phylo.otu$data), tree = match.phylo.otu$phy,
              include.root = TRUE)
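For anyone hitting the same warnings, here is a small base-R sketch of how to check which margin of the table actually carries the tip labels before calling pd(). It reuses the example OTU table and tip labels from the question; tip_overlap() is a hypothetical helper I made up for illustration, not part of picante, and the check assumes (as I understand picante's conventions) that pd() compares the column names of samp against the tree's tip labels.

```r
# Tip labels copied from the dput(ex.tree) output in the question
tip_labels <- c("afc0eeec83a181be740331928d883362",
                "466c2c4cb06ba39543c40a74c027008a",
                "fd0b270adb11f2781450c2e057e50f07",
                "294c12d6881a2ed67aa1557cda9889ff",
                "ee866b92e722c35819b112aadc4ac885")

# Example OTU table copied from the dput(ex.otu) output (OTUs in rows)
ex.otu <- data.frame(
  NASQAN2015.147.348 = c(0L, 87L, 0L, 105L, 0L),
  NASQAN2015.148.348 = c(0L, 57L, 0L, 23L, 21L),
  NASQAN2015.161.348 = c(17L, 77L, 0L, 146L, 0L),
  NASQAN2015.162.348 = c(0L, 38L, 0L, 95L, 0L),
  NASQAN2015.163.348 = c(0L, 39L, 0L, 7L, 0L),
  row.names = c("ee866b92e722c35819b112aadc4ac885",
                "afc0eeec83a181be740331928d883362",
                "294c12d6881a2ed67aa1557cda9889ff",
                "466c2c4cb06ba39543c40a74c027008a",
                "fd0b270adb11f2781450c2e057e50f07")
)

# Hypothetical helper: fraction of row names and of column names
# that appear among the tree's tip labels
tip_overlap <- function(x, tips) {
  c(rows = mean(rownames(x) %in% tips),
    cols = mean(colnames(x) %in% tips))
}

overlap_raw <- tip_overlap(ex.otu, tip_labels)     # table as stored
overlap_t   <- tip_overlap(t(ex.otu), tip_labels)  # transposed table

print(overlap_raw)
print(overlap_t)
```

If the overlap is on the rows, the table is in the orientation match.phylo.data() wants but needs t() before it goes into pd(); if the overlap is on the columns, it can go into pd() as is.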
Upvotes: 0