HELP WITH GENEPOP

OPTION 3  Population differentiation
(adapted from the original Genepop v3.1b documentation)

Sub-options 1 and 2 (genic differentiation) test the distribution of alleles across the various populations. The null hypothesis tested is Ho: "the allelic distribution is identical across populations". For each locus, the test is performed on a contingency table like this one:


          Sub-Pop.  Alleles       
                    1    2   Total
                    _______       
           1        14   46   60  
           2        6    76   82  
           3        10   74   84  
           4        4    58   62  
                    _______       
          Total     34   254  288 

For each locus, an unbiased estimate of the P-value of the probability test (or Fisher exact test) is computed, as described by Raymond and Rousset (1995). Sub-option 2 uses the same test, but runs it automatically for every pair of populations at every locus.
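As a rough illustration of the probability test on the table above, the sketch below estimates the P-value by randomly reshuffling alleles among populations. This is a simpler Monte Carlo scheme than the Markov chain method Genepop uses, swapped in here only to show what is being estimated. The function names, the number of permutations and the seed are assumptions made for this example.

```python
import math
import random

# Genic table from the example above: rows are sub-populations, columns are alleles.
TABLE = [[14, 46], [6, 76], [10, 74], [4, 58]]

def log_table_prob(table):
    """Log-probability of a contingency table given its row and column totals."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)

    def log_fact(k):
        return math.lgamma(k + 1)  # log(k!)

    return (sum(log_fact(r) for r in rows) + sum(log_fact(c) for c in cols)
            - log_fact(n) - sum(log_fact(x) for row in table for x in row))

def permutation_p_value(table, n_perm=2000, seed=1):
    """Fraction of reshuffled tables that are no more probable than the observed one."""
    rng = random.Random(seed)
    row_sizes = [sum(r) for r in table]
    n_cols = len(table[0])
    # One label per gene copy: the index of its allele (column).
    labels = [j for row in table for j, x in enumerate(row) for _ in range(x)]
    observed = log_table_prob(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        start, simulated = 0, []
        for size in row_sizes:  # deal the shuffled alleles back to the populations
            chunk = labels[start:start + size]
            simulated.append([chunk.count(j) for j in range(n_cols)])
            start += size
        # Small tolerance so that ties with the observed table are counted.
        if log_table_prob(simulated) <= observed + 1e-9:
            hits += 1
    return hits / n_perm

if __name__ == "__main__":
    print("estimated P-value:", permutation_p_value(TABLE))
```

Because every reshuffled table keeps the observed row and column totals, the fraction returned estimates the sum of probabilities that defines the P-value of the probability test.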

Sub-options 3 and 4 (genotypic differentiation) test the distribution of genotypes across the various populations. The null hypothesis tested is Ho: "the genotypic distribution is identical across populations". For each locus, the test is performed on a contingency table like this one:

                  Genotypes:
                  -------------------------
                  1    1   2   1   2   3
         Pop:     1    2   2   3   3   3   All
         ----
         Pop1     142  27  0   13  1   0   183
         Pop2     149  20  0   11  0   4   184
         Pop3     131  12  0   9   0   1   153
         Pop4     119  22  1   10  0   0   152
         Pop5     120  17  1   10  1   0   149
         Pop6     134  18  2   15  0   0   169
         Pop7     116  15  1   10  1   1   144
         Pop8     214  41  3   14  2   1   275
         Pop9     84   17  0   7   2   0   110
         Pop10    107  18  0   15  3   0   143
         Pop11    134  32  1   21  4   0   192
         Pop12    105  26  1   11  1   4   148
         Pop13    97   19  2   23  4   0   145
         Pop14    95   28  3   19  3   1   149
                                                
         All:     1747 312 15  188 22  12  2296

An unbiased estimate of the P-value of a log-likelihood (G) based exact test is computed (Goudet et al. 1996). The principle of this test is the same as for the probability test (or Fisher exact test); only the statistic defining the rejection zone differs. For the probability test, the P-value is the sum of the probabilities of all tables (with the same marginal values as the observed one) whose probability is lower than or equal to that of the observed table. For the G-based test, the statistic defining the rejection zone is the G value computed on the genic table derived from the genotypic one, so the P-value is the sum of the probabilities of all tables (with the same marginal values as the observed one) having a G value higher than or equal to the observed one. The Markov chain explores the genotypic tables having the same marginal values as the observed one, and the statistic is computed on the genic table associated with each genotypic table. See Goudet et al. (1996) for the choice of this statistic.
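To make the statistic concrete, the sketch below derives a genic table from a genotypic one and computes the usual log-likelihood ratio G on it. It covers only the statistic, not the exact test itself. The function names are assumptions for this example, and the genotype order (1/1, 1/2, 2/2, 1/3, 2/3, 3/3) is read from the header of the table above.

```python
import math

# Genotype columns in the order of the table above; each pair lists the two alleles.
GENOTYPES = [(1, 1), (1, 2), (2, 2), (1, 3), (2, 3), (3, 3)]

# First three populations of the genotypic table above.
GENOTYPIC_ROWS = [
    [142, 27, 0, 13, 1, 0],
    [149, 20, 0, 11, 0, 4],
    [131, 12, 0, 9, 0, 1],
]

def genic_table(genotypic_table, genotypes=GENOTYPES):
    """Count allele copies per population from genotype counts."""
    alleles = sorted({a for g in genotypes for a in g})
    table = []
    for row in genotypic_table:
        counts = dict.fromkeys(alleles, 0)
        for (a, b), n in zip(genotypes, row):
            counts[a] += n  # a homozygote contributes two copies of the same allele
            counts[b] += n
        table.append([counts[a] for a in alleles])
    return table

def g_statistic(table):
    """G = 2 * sum(O * ln(O / E)), with E computed from the table margins."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing
                g += obs * math.log(obs * total / (rows[i] * cols[j]))
    return 2.0 * g

if __name__ == "__main__":
    genic = genic_table(GENOTYPIC_ROWS)
    print(genic)
    print("G =", g_statistic(genic))
```

In the exact test, this G value would be recomputed for each genotypic table visited by the Markov chain and compared with the value obtained for the observed table.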

For sub-option 4, the test is the same but is performed automatically for all pairs of populations for all loci.

Running sub-options 1-4

For all sub-options, the unbiased estimate of the P-value is computed using a Markov chain method that differs slightly from the one described by Guo and Thompson in a 1989 unpublished report (Technical report No. 187, Department of Statistics, University of Washington, Seattle, USA).

You will be prompted to enter three numbers to feed the Markov chain: the dememorization number, the number of batches (B), and the number of iterations per batch (C).

The product of the last two numbers defines the length of the chain: the longer, the better. Dividing the chain into batches makes it possible to compute a standard error for the overall estimate.

The program provides the P-value and the standard error (S.E.) associated with the estimate. As the chain length B x C (number of batches times iterations per batch) increases, the S.E. decreases. If the S.E. is too large (say, S.E. > 0.01), rerun the analysis with more batches (for example, if you used 50 batches in the first run, use 100 in the next, keeping C = 1000).
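The role of the three numbers can be sketched with a generic Metropolis chain over tables with fixed margins. This is not Genepop's exact algorithm, only a minimal example under stated assumptions: the parameter names (dememorization, batches, iterations), the move (one count shifted along each diagonal of a random 2x2 sub-table) and the batch-means standard error are all choices made for this illustration.

```python
import math
import random
import statistics

# Genic table from the first example above.
TABLE = [[14, 46], [6, 76], [10, 74], [4, 58]]

def log_weight(table):
    """Log-probability of a table up to a constant (the margins never change)."""
    return -sum(math.lgamma(x + 1) for row in table for x in row)

def markov_chain_p_value(table, dememorization=1000, batches=50,
                         iterations=1000, seed=1):
    """Return (P-value estimate, standard error) from a batched Metropolis chain."""
    rng = random.Random(seed)
    current = [list(row) for row in table]
    n_rows, n_cols = len(current), len(current[0])
    observed = log_weight(current)
    current_lw = observed

    def step():
        nonlocal current_lw
        i1, i2 = rng.sample(range(n_rows), 2)
        j1, j2 = rng.sample(range(n_cols), 2)
        if current[i1][j2] == 0 or current[i2][j1] == 0:
            return  # the move would create a negative cell: stay in place
        # Shifting one count along each diagonal keeps every margin unchanged.
        current[i1][j1] += 1
        current[i2][j2] += 1
        current[i1][j2] -= 1
        current[i2][j1] -= 1
        new_lw = log_weight(current)
        if rng.random() < math.exp(min(0.0, new_lw - current_lw)):
            current_lw = new_lw  # accept the move
        else:  # reject the move: undo it
            current[i1][j1] -= 1
            current[i2][j2] -= 1
            current[i1][j2] += 1
            current[i2][j1] += 1

    for _ in range(dememorization):  # steps discarded before counting starts
        step()

    batch_estimates = []
    for _ in range(batches):
        hits = 0
        for _ in range(iterations):
            step()
            if current_lw <= observed + 1e-9:  # no more probable than observed
                hits += 1
        batch_estimates.append(hits / iterations)

    p_value = statistics.mean(batch_estimates)
    std_error = statistics.stdev(batch_estimates) / math.sqrt(batches)
    return p_value, std_error

if __name__ == "__main__":
    print(markov_chain_p_value(TABLE))
```

In this sketch each batch yields its own estimate, and the spread of those estimates gives the standard error, which is why a longer chain split into more batches tends to report a smaller S.E.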

OUTPUT. Results are returned via your web browser if you select the HTML option for Output Delivery. You can also have the results emailed to you, which is advisable for data files with (1) numerous loci, (2) numerous unique alleles per locus, or (3) increased Markov chain parameters. All contingency tables are saved in the output file. Estimates of P-values are reported, together with (for sub-options 1 and 3) a combination of all test results across loci (Fisher's method), which assumes statistical independence across loci.
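For reference, Fisher's method combines k independent P-values through the statistic -2 * sum(ln P), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below is a minimal standalone version; the function name is an assumption, and it requires every P-value to be strictly positive, so it says nothing about how Genepop treats an estimate of zero.

```python
import math

def fisher_combined_p(p_values):
    """Combined P-value from Fisher's method for independent tests."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2, with X = -2 * sum(ln p)
    # Chi-square upper tail with 2k degrees of freedom has a closed form:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, tail = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        tail += term
    return math.exp(-half_x) * tail

if __name__ == "__main__":
    # Hypothetical per-locus P-values, for illustration only.
    print(fisher_combined_p([0.04, 0.20, 0.11]))
```

With a single locus the combined value equals the input P-value, which is a quick sanity check on the formula.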

Please report any problems or bugs that you encounter with the web version of this option to Eleanor Morgan.


Last Modified on January 21, 1999 by Eleanor Morgan