OPTION 1 Hardy-Weinberg Exact Tests
(adapted from the original Genepop 4.0 documentation)
Three distinct tests are available, all concerned with the same null hypothesis H0 (random union of gametes). They differ in how the rejection zone is constructed. For the probability test (sub-option 3), the probability of the observed sample defines the rejection zone, and the P-value of the test is the sum of the probabilities of all tables (with the same allelic counts) whose probability is the same as or lower than that of the observed table. This is the "exact HW test" of Haldane (1954), Weir (1990b), Guo and Thompson (1992) and others. When the alternative hypothesis (H1) of interest is heterozygote excess or deficiency, more powerful tests than the probability test can be used (see Rousset and Raymond, 1995). One of them, the score test (U test), is available here, either for H1 = heterozygote deficiency (sub-option 1) or H1 = heterozygote excess (sub-option 2). The multi-sample versions of these two tests are accessible through sub-options 4 and 5.
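As an illustration of the probability test, the following Python sketch computes the exact P-value by complete enumeration for a single locus with two alleles. It is not Genepop's code: the function names `table_probability` and `hw_probability_test` are made up for this example, the restriction to two alleles is a simplifying assumption, and the table probability is the standard conditional probability of the genotype counts given the allele counts under random union of gametes.

```python
from fractions import Fraction
from math import factorial


def table_probability(n_aa, n_ab, n_bb):
    """Probability of a genotype table, given its allele counts, under H0."""
    n = n_aa + n_ab + n_bb          # number of individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n_bb + n_ab           # copies of allele B
    numerator = factorial(n) * 2 ** n_ab * factorial(n_a) * factorial(n_b)
    denominator = (factorial(n_aa) * factorial(n_ab) * factorial(n_bb)
                   * factorial(2 * n))
    return Fraction(numerator, denominator)


def hw_probability_test(n_aa, n_ab, n_bb):
    """Sum the probabilities of all tables no more probable than the observed one."""
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab
    p_observed = table_probability(n_aa, n_ab, n_bb)
    p_value = Fraction(0)
    # With fixed allele counts, the heterozygote count has the parity of n_a
    # and cannot exceed the count of the rarer allele.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        p = table_probability((n_a - het) // 2, het, (n_b - het) // 2)
        if p <= p_observed:         # exact comparison thanks to Fraction
            p_value += p
    return p_value


# Hypothetical sample: 2 AA, 0 AB, 2 BB individuals.
print(float(hw_probability_test(2, 0, 2)))  # 3/35, i.e. about 0.086
```

Exact fractions are used so that ties between table probabilities are detected reliably; with floating-point numbers a tolerance would be needed in the comparison.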
Two distinct algorithms are available: complete enumeration and a Markov chain (MC) method.
NB. Much higher values for the MC parameters are allowed for the PC version of Genepop. For greater control, download the software to a local machine. Visit http://kimura.univ-montp2.fr/%7Erousset/Genepop.htm.
For all tests concerned with sub-options 1-3, there are three possible cases. The number of distinct alleles at each locus in each sample is
no more than 4: Genepop will give you the choice between complete enumeration and the MC method. If you have fewer than 1000 individuals per sample, complete enumeration is recommended; otherwise, the MC method could be much faster. There are no general rules, however: results are highly variable and also depend on allele frequencies.
always 5 or more: Genepop will automatically perform only the MC method.
sometimes higher than 4, sometimes not: For cases where the number of alleles is 4 or lower, Genepop will give you the choice between both methods. For the other situations (5 alleles or more in some samples), the MC method will be automatically performed.
Whether one wants enumeration or MC methods to be performed can be specified on the input form.
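The three cases above reduce to a per-locus, per-sample rule, which can be sketched as follows. The function names and the string labels are hypothetical, and the suggestion for large samples only encodes the loose recommendation given above, not a fixed Genepop behaviour.

```python
def available_methods(n_alleles):
    """Methods offered for one locus in one sample, by number of distinct alleles."""
    if n_alleles <= 4:
        return ["enumeration", "MC"]   # the user may choose
    return ["MC"]                      # 5 alleles or more: MC only


def suggested_method(n_alleles, n_individuals):
    """Rule of thumb only: results vary with allele frequencies."""
    if n_alleles <= 4 and n_individuals < 1000:
        return "enumeration"
    return "MC"


print(available_methods(3), suggested_method(3, 250))
print(available_methods(7), suggested_method(7, 250))
```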
Several important results are provided for each test by this option:
If the S.E. is too large (say S.E. > 0.01), it is wise to run the analysis again with more batches (if you used 100 batches for the first trial, use 200 next, keeping C = 1000 iterations per batch). How close the estimate is to the true value depends on the product B x C (number of batches times iterations per batch): the larger, the better.
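The effect of the number of batches on the S.E. can be illustrated with a small simulation. The sketch below is not Genepop's algorithm, and several choices are assumptions made for illustration: it draws independent Monte Carlo samples by shuffling gametes instead of running a Markov chain, it handles two alleles only, it uses a simple one-tailed criterion (no more heterozygotes than observed) as a stand-in for the U test, and it estimates the S.E. from the spread of the batch means. The name `batched_pvalue` is hypothetical.

```python
import random
from statistics import mean, stdev


def het_count(gametes):
    """Pair consecutive gametes into individuals and count heterozygotes."""
    return sum(1 for i in range(0, len(gametes), 2)
               if gametes[i] != gametes[i + 1])


def batched_pvalue(n_aa, n_ab, n_bb, batches=100, per_batch=1000, seed=1):
    """Estimate a one-tailed P-value and its S.E. from B batches of C samples."""
    rng = random.Random(seed)
    gametes = ["A"] * (2 * n_aa + n_ab) + ["B"] * (2 * n_bb + n_ab)
    batch_means = []
    for _ in range(batches):
        hits = 0
        for _ in range(per_batch):
            rng.shuffle(gametes)               # random union of gametes
            if het_count(gametes) <= n_ab:     # as few heterozygotes as observed
                hits += 1
        batch_means.append(hits / per_batch)
    p_hat = mean(batch_means)
    se = stdev(batch_means) / batches ** 0.5   # batch-means standard error
    return p_hat, se


# Hypothetical sample (2 AA, 0 AB, 2 BB), with B = 100 and then B = 200, C = 1000.
for b in (100, 200):
    print(b, batched_pvalue(2, 0, 2, batches=b, per_batch=1000))
```

Because the samples here are independent, the S.E. shrinks roughly with the square root of B x C; in a Markov chain successive steps are correlated, so the gain from a longer chain is typically smaller.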
For sub-option 3, a global test across loci or across samples is constructed using Fisher's method. This method (sometimes conservative because discrete probabilities are analyzed) is performed only for convenience, and its relevance should first be established (e.g. statistical independence of loci).
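Fisher's method combines k independent P-values through the statistic -2 * sum(ln P), which is compared with a chi-square distribution with 2k degrees of freedom. A minimal Python sketch, using the closed form of the chi-square tail for even degrees of freedom, is given below; the function name `fisher_combined_pvalue` is hypothetical, and the sketch assumes every P-value is strictly positive.

```python
from math import exp, log


def fisher_combined_pvalue(p_values):
    """Combine independent P-values with Fisher's method."""
    k = len(p_values)
    if k == 0 or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("need at least one P-value, each in (0, 1]")
    half_stat = -sum(log(p) for p in p_values)   # half of the chi-square statistic
    # Upper tail of a chi-square with 2k d.f.:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return exp(-half_stat) * total


# Hypothetical per-locus P-values.
print(fisher_combined_pvalue([0.04, 0.20, 0.35]))
```

An estimated P-value of exactly 0, which the MC method can return, cannot be log-transformed; this sketch rejects it and leaves the handling of that case open.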
General statistical theory shows that there is no uniformly better way to combine P-values of different tests. When an alternative model is specified, it is possible to find a better way of combining results from different data sets than Fisher's method, and usually not by combining P-values. In the present context one such method is the multisample score test of Rousset & Raymond (1995), which defines a global test across loci and/or across samples generalizing the tests of sub-options 1 and 2. The global tests are performed by sub-options 4 and 5, only by the MC algorithm. Independence of loci is also assumed for these global tests. The output file reports global P-value estimates and standard errors per population, per locus, and over all loci and populations. For each global P-value, the average number of switches per test combined is also reported. Since it is tempting to reduce the chain length parameters in this option, special care is needed in checking this accuracy diagnostic (see appendix).
This option generates several large temporary files. The space used temporarily by Genepop can be estimated as: (# of loci + # of pops + 1) * batches * (iterations per batch) * 8 bytes. For example, it will require about 240 MB of temporary hard disk space if you have 10 loci, 50 samples and use a chain of 500,000 steps (100 batches of 5000 iterations).
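The space estimate above (8 bytes, or octets, per stored value) is easy to check before launching a long run. The helper below simply evaluates that formula; its name `temp_space_bytes` is hypothetical.

```python
def temp_space_bytes(n_loci, n_pops, batches, iterations_per_batch):
    """Temporary disk space estimate: (loci + pops + 1) * B * C * 8 bytes."""
    return (n_loci + n_pops + 1) * batches * iterations_per_batch * 8


# The worked example from the text: 10 loci, 50 samples, 100 batches of 5000.
size = temp_space_bytes(10, 50, 100, 5000)
print(size, "bytes =", size / 1e6, "MB")  # 244000000 bytes = 244.0 MB
```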
Results are returned via your web browser, from which you can save them to your local machine. You may also choose to have them emailed to you.
Please report any problems or bugs that you encounter with the web version of this option to Eleanor Morgan
Last modified on September 24, 2009 by Eleanor Morgan