I ran poppr::poppr.amova with dist = NULL (so the Euclidean distance is calculated by the function itself) and the pegas implementation (see code below). My dataset contains 3000 SNPs, and my hierarchical structure consists of 2 clusters with populations nested within each cluster. So I have 3 levels: variance among clusters, variance among populations within clusters, and variance among individuals (within populations, or across the whole sample?).
Eucl_popprAMOVAres = poppr::poppr.amova(
  myGenlight,
  hier = ~clusters/pop,
  clonecorrect = FALSE,
  within = FALSE,
  dist = NULL,      # let the function compute the distance itself
  squared = FALSE,  # FALSE, otherwise it gets square-rooted! Is this correct for Euclidean distance?
  freq = TRUE,
  correction = NULL,
  sep = "_",
  filter = FALSE,
  threshold = 0,
  algorithm = "farthest_neighbor",
  threads = 1L,
  missing = "loci",
  cutoff = 0.05,
  quiet = FALSE,
  method = "pegas",
  nperm = 100000)
Results:
Warning message:
In poppr::poppr.amova(myGenlight, :
  Missing data are not filtered from genlight data

Analysis of Molecular Variance

Call: pegas::amova(formula = hier, data = hierdf, nperm = nperm, is.squared = FALSE)

                SSD       MSD  df
clusters   1500.402 1500.4016   1
pop        3329.225  832.3063   4
Error    142431.371  847.8058 168
Total    147260.997  851.2196 173

Variance components:
            sigma2 P.value
clusters   8.63952  0.0000
pop       -0.53478  0.8762
Error    847.80578

Phi-statistics:
clusters.in.GLOBAL (Phi_CT)      pop.in.GLOBAL (Phi_ST)    pop.in.clusters (Phi_SC)
               0.0100939536                0.0094691422               -0.0006311825

Variance coefficients:
       a        b        c
28.98276 29.03448 77.33333

I would like to know the following:
Is it correct to set squared = FALSE (which appears as is.squared = FALSE in the pegas call) when I let the function calculate the genetic distance itself with dist(), which computes the Euclidean distance?
Why does the function warn "Missing data are not filtered from genlight data" even though I set missing = "loci"? I expected that setting to remove all loci with missing information (perhaps only within each pairwise distance?).
Is it correct that sigma^2 for clusters is the variation among clusters, and sigma^2 for pop is the variation among populations within clusters?
What does Error correspond to in my case? Is it the remaining variation, i.e. variation among individuals within populations, or variation among individuals across the whole sample (all populations)?
With the ade4 implementation in poppr::poppr.amova, I also obtain the % variance explained by each hierarchical level. How do I obtain that from the pegas results? Is it correct to sum all sigma^2 values (clusters + pop + Error) and then divide each one by that sum? I.e. % variance explained by clusters = sigma^2 for clusters / sum of sigma^2 * 100, and % variance due to variation among individuals = sigma^2 for Error / sum of sigma^2 * 100?
What do the coefficients a, b, c mean? My understanding is that they represent the slope/effect of each treatment (in this case cluster/population/individual) on the genotype, but how should I interpret them alongside the variances?
Finally, I get the opposite result with ade4 (variation among populations is more significant than variation among clusters), using the same number of permutations in randtest. Is there a reason for this, and which result should I trust?
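To make the percentage calculation I am asking about concrete, here is a minimal sketch in R. It assumes that the percentage for each level is simply its sigma^2 divided by the sum of all sigma^2 (I am not sure this is the intended way for pegas), and it uses the variance components copied by hand from the pegas output above. The names sigma2, pct and phi_* are my own and are not part of poppr or pegas.

```r
# Variance components copied by hand from the pegas output above
sigma2 <- c(clusters = 8.63952, pop = -0.53478, Error = 847.80578)

# Assumed formula: percentage of total variance = sigma2 / sum(sigma2) * 100
pct <- 100 * sigma2 / sum(sigma2)
print(round(pct, 4))

# Cross-check: recompute the Phi statistics from the same components
# using the usual three-level AMOVA definitions
phi_CT <- sigma2[["clusters"]] / sum(sigma2)
phi_ST <- (sigma2[["clusters"]] + sigma2[["pop"]]) / sum(sigma2)
phi_SC <- sigma2[["pop"]] / (sigma2[["pop"]] + sigma2[["Error"]])
print(c(Phi_CT = phi_CT, Phi_ST = phi_ST, Phi_SC = phi_SC))
```

If this approach is right, the recomputed Phi values should agree with the Phi-statistics printed by pegas, and the negative sigma^2 for pop would simply show up as a negative percentage.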
I would really appreciate some feedback and help interpreting these results!
Thanks!
Gabriella
I tried running poppr::poppr.amova with both the pegas and the ade4 implementations and 3 hierarchical levels (clusters, populations, individuals). I set within = FALSE when using pegas because I understand that there is a bug in the within-individual variance computation for this package.
My ade4 code and results are below for comparison:
Eucl_popprAMOVAres_ade4 = poppr::poppr.amova(
  myGenlight,
  hier = ~clusters/pop,
  clonecorrect = FALSE,
  within = FALSE,
  dist = NULL,      # let the function compute the distance itself
  squared = FALSE,  # FALSE, otherwise it gets square-rooted!
  freq = TRUE,
  correction = NULL,
  sep = "_",
  filter = FALSE,
  threshold = 0,
  algorithm = "farthest_neighbor",
  threads = 1L,
  missing = "loci",
  cutoff = 0.05,
  quiet = FALSE,
  method = "ade4",
  nperm = 100000)
Warning message:
In poppr::poppr.amova(myGenlight, :
Missing data are not filtered from genlight data
$call
ade4::amova(samples = xtab, distances = xdist, structures = xstruct)

$results
                                 Df      Sum Sq  Mean Sq
Between clusters                  1    984.7106 984.7106
Between samples Within clusters   4   3844.9162 961.2291
Within samples                  168 142431.3705 847.8058
Total                           173 147260.9974 851.2196

$componentsofcovariance
                                                        Sigma            %
Variations Between clusters                         0.2675142   0.03139829
Variations Between samples(*pops) Within clusters   3.9289742   0.46114598
Variations Within samples (*pops)                 847.8057771  99.50745573
Total variations                                  852.0022654 100.00000000

$statphi
                              Phi
Phi-samples-total    0.0049254427
Phi-samples-clusters 0.0046129082
Phi-clusters-total   0.0003139829

Eucl_popprAMOVAres_ade4_amova.test = randtest(Eucl_popprAMOVAres_ade4, nrepet = 100000)
Eucl_popprAMOVAres_ade4_amova.test

class: krandtest lightkrandtest
Monte-Carlo tests
Call: randtest.amova(xtest = Scallop_Rep98_Loc95_Eucl_popprAMOVAres_ade4, nrepet = 1e+05)

Number of tests: 3

Adjustment method for multiple comparisons: none
Permutation number: 100000
                         Test         Obs    Std.Obs   Alter      Pvalue
1   Variations within samples 847.8057771 -9.8469609    less 9.99990e-06
2  Variations between samples   3.9289742  8.3893135 greater 9.99990e-06
3 Variations between clusters   0.2675142  0.1030489 greater 4.00706e-01

In ade4, the p-value for clusters is much larger than the one for pop, and pop is highly significant. In pegas it is the other way round: clusters is significant, while populations within clusters is not.