Excerpt from the genepop documentation section 6, author Francis Rousset.
This section is only intended as a quick reference guide. The primary literature should be consulted for further information about the methods implemented in Genepop.
When apparent null homozygotes are observed, one may wonder whether these are truly null homozygotes, or whether some technical failure independent of genotype
has occurred. In both cases, maximum likelihood estimates can be obtained by the EM algorithm (Dempster et al., 1977; Hartl & Clark, 1989; Kalinowski & Taper, 2006).
Brookfield (1996) considered simpler estimators for the case where apparent null homozygotes are true null homozygotes, which he also described as maximum likelihood estimators. However, there are some (often small) differences with the ML estimates derived by the EM algorithm as implemented in this and previous versions of Genepop. These may be due to the fact that Brookfield wrote a likelihood formula for the number of apparent homozygotes and heterozygotes, while the EM implementation is based on a likelihood formula where apparent homozygotes and heterozygotes for different alleles are distinguished.
For the case where one is unsure whether apparent null homozygotes are true null homozygotes, Chakraborty et al. (1992) described a method to estimate the null allele frequency from the other data, excluding any apparent null homozygote. Beyond its relatively low efficiency, the behavior of this estimator is sometimes puzzling (for example, where there is no obvious heterozygote in a sample, the estimated null allele frequency is always 1, whatever the number of alleles obviously present and even if only non-null genotypes are present). Actually, even if apparent null homozygotes are not true null homozygotes, their number brings some information, and it is more logical to estimate the null allele frequency jointly with the nonspecific genotyping failure rate by maximum likelihood (Kalinowski & Taper, 2006). This analysis is possible when at least three alleles are obviously present in the sample, again by the EM algorithm, and is implemented in Genepop. In conclusion, the different methods mentioned above are available through menu option 8.1, except that of Chakraborty et al.
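
To make the EM approach concrete, here is a minimal Python sketch for the simpler case in which apparent null homozygotes are taken to be true null homozygotes. It assumes Hardy-Weinberg proportions, so that an apparent homozygote for allele i is either a true homozygote (probability proportional to p_i^2) or a carrier of the null allele (proportional to 2 p_i r). The function name, arguments and starting values are hypothetical and are not taken from the Genepop source code.

```python
def em_null_allele(hom, het, n_null, n_iter=200):
    """EM estimate of visible allele frequencies p and null frequency r.

    hom:    dict allele -> number of apparent homozygotes
    het:    dict (allele_a, allele_b) -> number of heterozygotes
    n_null: number of apparent null homozygotes (assumed to be true nulls)
    """
    alleles = sorted(set(hom) | {a for pair in het for a in pair})
    n = sum(hom.values()) + sum(het.values()) + n_null
    # Arbitrary starting point (an assumption of this sketch).
    p = {a: 0.9 / len(alleles) for a in alleles}
    r = 0.1
    for _ in range(n_iter):
        counts = {a: 0.0 for a in alleles}
        null_count = 2.0 * n_null  # each null homozygote carries two nulls
        for (a, b), c in het.items():
            counts[a] += c
            counts[b] += c
        for a, c in hom.items():
            # E-step: probability that an apparent a/a is really a/null.
            w = 2.0 * r / (p[a] + 2.0 * r)
            counts[a] += c * (2.0 - w)
            null_count += c * w
        # M-step: expected gene counts divided by the 2n genes sampled.
        p = {a: counts[a] / (2.0 * n) for a in alleles}
        r = null_count / (2.0 * n)
    return p, r
```

For example, `em_null_allele({"A": 30, "B": 30}, {("A", "B"): 20}, 5)` returns frequency estimates that sum to 1 together with the null allele frequency. The joint estimation of a nonspecific failure rate described above would need an extra parameter and is not covered by this sketch.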
The probability of a sample of genotypes depends on allele frequencies at one or more loci. In the tests of Hardy-Weinberg equilibrium, population differentiation and pairwise independence between loci ('linkage equilibrium') implemented in Genepop, one is not interested in the allele frequencies themselves and, given that they are unknown, the aim is to derive valid conclusions whatever their values. In these different cases, this can be achieved by considering only the probability of samples conditional on observed allelic (e.g. for HW tests) or genotypic counts (e.g. for tests of population differentiation not assuming HW equilibrium). Because exact probabilities are computed, these conditional tests are also known as exact tests. See Cox & Hinkley (1974) and Lehmann (1994) for the underlying theory; a much more elementary introduction to the tests implemented in Genepop is Rousset & Raymond (1997).
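
As an illustration of conditioning on allelic counts, the following Python sketch enumerates all samples compatible with given counts of two alleles and computes a probability-test P-value for Hardy-Weinberg proportions. It uses the standard conditional probability n! 2^n_AB n_A! n_B! / (n_AA! n_AB! n_BB! (2n)!), which does not involve the unknown allele frequencies. The two-allele restriction and the function names are assumptions of this sketch, not a description of how Genepop organizes the computation.

```python
from math import comb, factorial

def hw_conditional_probs(n_a, n_b):
    """Probability of each heterozygote count given allele counts n_a, n_b."""
    total_genes = n_a + n_b
    assert total_genes % 2 == 0, "allele counts must sum to an even number"
    n = total_genes // 2
    probs = {}
    # The heterozygote count has the same parity as n_a.
    for n_ab in range(n_a % 2, min(n_a, n_b) + 1, 2):
        n_aa = (n_a - n_ab) // 2
        n_bb = (n_b - n_ab) // 2
        ways = factorial(n) // (factorial(n_aa) * factorial(n_ab) * factorial(n_bb))
        probs[n_ab] = ways * 2 ** n_ab / comb(total_genes, n_a)
    return probs

def hw_exact_pvalue(n_aa, n_ab, n_bb):
    """Sum the probabilities of samples no more likely than the observed one."""
    probs = hw_conditional_probs(2 * n_aa + n_ab, 2 * n_bb + n_ab)
    p_obs = probs[n_ab]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
```

With two individuals and allele counts (2, 2), the only possible samples are {AA, BB} with probability 1/3 and {AB, AB} with probability 2/3, so `hw_exact_pvalue(1, 0, 1)` is 1/3.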
The Mantel test is one of the exact tests implemented in Genepop, but partial Mantel tests are not implemented. The latter have been used to test for effects of a variable Y on a response variable Z, while removing spatial autocorrelation effects on Z. Both standard theory of exact tests and simulation show that the permutation procedure of the Mantel test is not appropriate for the partial Mantel test when the Y variable itself presents spatial gradients (Oden & Sokal, 1992; Raufaste & Rousset, 2001; Rousset, 2002b). Asymptotic arguments have also been proposed to support the use of permutation tests (e.g. Anderson, 2001) but they fail in the same conditions.
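
The permutation procedure of the simple (not partial) Mantel test can be sketched as follows in Python. Rows and columns of one distance matrix are permuted together, and the statistic is recomputed for each permutation. The cross-product statistic, the one-sided P-value and the random sampling of permutations are choices made for this sketch; the function and parameter names are hypothetical.

```python
import random

def mantel_pvalue(x, y, n_perm=999, seed=0):
    """One-sided permutation P-value for association between two
    symmetric distance matrices x and y (lists of lists)."""
    n = len(x)
    rng = random.Random(seed)

    def stat(order):
        # Sum of cross-products over pairs i < j, with x relabelled by `order`.
        return sum(x[order[i]][order[j]] * y[i][j]
                   for i in range(n) for j in range(i + 1, n))

    order = list(range(n))
    observed = stat(order)
    hits = 1  # the observed labelling counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)
        if stat(order) >= observed - 1e-12:
            hits += 1
    return hits / (n_perm + 1)
```

As a usage example, with `pos = [0, 1, 3, 7, 15]` and `d = [[abs(a - b) for b in pos] for a in pos]`, calling `mantel_pvalue(d, d)` compares a matrix with itself, so the observed statistic is the largest attainable and the P-value is small.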
Conditional tests require in principle the complete enumeration of all possible samples satisfying the given condition. In many cases this is not practical, and the P-value may be computed by simple permutation algorithms or by more elaborate Markov chain algorithms, in particular the Metropolis-Hastings algorithm (Hastings, 1970). The latter algorithm explores the universe of samples satisfying the given condition in a 'random walk' fashion. For HW testing, Guo & Thompson (1992) found a Metropolis-Hastings algorithm to be efficient compared to permutations. A slight modification of their algorithm is implemented in Genepop. Guo & Thompson also considered tests for contingency tables (Technical report No. 187, Department of Statistics, University of Washington, Seattle, USA, 1989) and again a slightly modified algorithm is implemented in Genepop (Raymond & Rousset, 1995a).
A run of the Markov chain (MC) algorithms starts with a dememorization step; if this step is long enough, the state of the chain at the end of the dememorization is independent of the initial state. Then, further simulation of the MC is divided into batches. In each batch a P-value estimate is derived by counting the proportion of time the MC spends visiting sample configurations more extreme (according to the given test statistic) than the observed sample. If the batches are long enough, the P-value estimates from successive batches are essentially independent from each other and a standard error for the P-value can be derived from the variance of per-batch P-values (Hastings, 1970). As could be expected, the longer the runs, the lower the standard error.
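
The dememorization, batch and standard error logic can be illustrated with a toy Metropolis chain for a two-allele HW probability test. This is not the Guo & Thompson algorithm nor Genepop's modification of it: the move used here (turning two AB heterozygotes into one AA and one BB, or the reverse) and all names and default values are assumptions chosen to keep the sketch short.

```python
import random
from math import exp, lgamma, log, sqrt
from statistics import mean, variance

def log_weight(n_aa, n_ab, n_bb):
    """Log of the unnormalised conditional probability of a two-allele sample."""
    return (n_ab * log(2.0)
            - lgamma(n_aa + 1) - lgamma(n_ab + 1) - lgamma(n_bb + 1))

def mc_hw_pvalue(n_aa, n_ab, n_bb, dememorization=1000,
                 batches=20, batch_length=2000, seed=1):
    rng = random.Random(seed)
    observed = log_weight(n_aa, n_ab, n_bb)
    state, current = (n_aa, n_ab, n_bb), observed

    def step():
        """One Metropolis step; returns True if the configuration changed."""
        nonlocal state, current
        d = rng.choice((-1, 1))  # symmetric proposal
        aa, ab, bb = state
        proposal = (aa + d, ab - 2 * d, bb + d)  # allele counts are preserved
        if min(proposal) < 0:
            return False
        lw = log_weight(*proposal)
        if rng.random() < exp(min(0.0, lw - current)):
            state, current = proposal, lw
            return True
        return False

    for _ in range(dememorization):  # forget the initial state
        step()
    switches = 0
    estimates = []
    for _batch in range(batches):
        extreme = 0
        for _step in range(batch_length):
            switches += step()
            # Count configurations no more likely than the observed one.
            extreme += current <= observed + 1e-9
        estimates.append(extreme / batch_length)
    p_value = mean(estimates)
    std_error = sqrt(variance(estimates) / batches)
    return p_value, std_error, switches
```

For the sample with one AA and one BB individual, the exact P-value is 1/3, and `mc_hw_pvalue(1, 0, 1)` should return an estimate close to that value together with its standard error and the number of switches.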
For most data sets the MC 'mixes well' so that the default values of the dememorization length and batch length implemented in Genepop appear quite sufficient (in many other applications of MC algorithms, things are not so simple; e.g. Brooks & Gelman, 1998). Nevertheless, inaccurate P-values can be detected when the standard error is large, or else when the number of switches (the number of times the sample configuration changes in the MC run) is low (this may occur when the P-value estimate is close to 0 or 1). Therefore, it is wise to increase the number of batches if the standard error is too large, in particular if it is of the order of P (the P-value) for small P or of the order of 1-P for large P, or else if the number of switches is low (< 1000).
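
These rules of thumb can be written as a small check on the output of a run. In the Python sketch below, the threshold of 1000 switches comes from the text above, whereas the `ratio` parameter is a hypothetical way of quantifying "of the order of" and has no counterpart in Genepop.

```python
def needs_longer_run(p, se, switches, min_switches=1000, ratio=0.5):
    """Return the reasons (possibly none) for increasing the number of batches."""
    reasons = []
    # Standard error comparable to P for small P, or to 1-P for large P.
    if se >= ratio * min(p, 1.0 - p):
        reasons.append("standard error of the order of min(P, 1-P)")
    if switches < min_switches:
        reasons.append("low number of switches")
    return reasons
```

For example, `needs_longer_run(0.01, 0.008, 5000)` flags the standard error, while `needs_longer_run(0.3, 0.005, 5000)` returns an empty list.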
The Markov chain algorithms were first implemented for probability tests, i.e. tests where the rejection zone is defined out of the least likely samples under the null hypothesis. Such tests also had Fisher's preference (e.g. Fisher, 1935); in particular the probability test for independence in contingency tables is known as Fisher's exact test. However, probability tests are not necessarily the most powerful. Depending on the alternative hypothesis of importance, other test statistics are often preferable (see again Cox & Hinkley, 1974 or Lehmann, 1994 for textbook accounts). Efficient tests for detecting heterozygote excesses and deficits (Rousset & Raymond, 1995) were introduced in Genepop from the start (see option 1), and log likelihood ratio (G) tests were introduced with the implementation of the genotypic tests for population differentiation (Goudet et al., 1996). The allelic weighting implicit in the G statistic is indeed optimal for detecting differentiation under an island model (Rousset, 2007) and use of the G statistic has been generalized to all contingency table tests in Genepop 4.0, though probability tests performed in earlier versions of Genepop are still available.
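
For reference, the log likelihood ratio statistic for a contingency table is G = 2 Σ O ln(O/E), where O is an observed count and E = (row total × column total) / grand total is its expectation under independence. The Python sketch below only computes the statistic; in an exact test it would serve to rank sample configurations, in place of their probability, when deciding which ones are at least as extreme as the observed table. The function name is hypothetical.

```python
from math import log

def g_statistic(table):
    """Log likelihood ratio (G) statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing
                g += observed * log(observed * total
                                    / (row_totals[i] * col_totals[j]))
    return 2.0 * g
```

A table with proportional rows such as `[[10, 20], [20, 40]]` gives G = 0, whereas `[[10, 0], [0, 10]]` gives G = 40 ln 2.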