Enter genotype counts, allele counts, or allele frequencies for two populations to calculate the fixation index (FST) for population genetics analysis.

FST Calculator

Calculate pairwise fixation index from genotype counts or allele data for two populations.

Genotype Counts

Enter diploid genotype counts for one biallelic locus in each population. The calculator derives allele frequencies, expected heterozygosity, HS, HT, and FST.

Allele Counts / Frequencies

Use raw allele counts when you know the number of A and a alleles. Use frequencies when you already know allele A frequency in each population. When using frequencies, the calculator assumes equal weighting between the two populations and computes q as 1 - p.

Fixation Index Formula

The fixation index can be calculated using one of several formulations depending on available data.

Heterozygosity Form

F_{ST} = \frac{H_T - H_S}{H_T}
  • H_T is the expected heterozygosity of the total population (computed from overall allele frequencies)
  • H_S is the mean expected heterozygosity across all subpopulations
  • FST is the fixation index, ranging from 0 (no differentiation) to 1 (complete fixation of different alleles)
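
The heterozygosity form can be sketched for two populations at one biallelic locus as follows. This is a minimal illustration, not the calculator's actual code: the function names and the example genotype counts are hypothetical, both populations are weighted equally, and no sample-size correction is applied.

```python
def allele_freq_from_genotypes(n_AA, n_Aa, n_aa):
    """Frequency of allele A from diploid genotype counts at a biallelic locus."""
    n_individuals = n_AA + n_Aa + n_aa
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    # Each AA individual carries two A alleles, each Aa individual carries one.
    return (2 * n_AA + n_Aa) / (2 * n_individuals)


def expected_het(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic locus."""
    return 2 * p * (1 - p)


def fst_two_populations(p1, p2):
    """FST = (HT - HS) / HT, with both populations weighted equally (assumption)."""
    h_s = (expected_het(p1) + expected_het(p2)) / 2   # mean within-population value
    p_bar = (p1 + p2) / 2                             # overall allele frequency
    h_t = expected_het(p_bar)                         # total-population value
    if h_t == 0:
        raise ValueError("FST is undefined when the locus is monomorphic overall")
    return (h_t - h_s) / h_t


# Hypothetical genotype counts (AA, Aa, aa) for two populations.
p1 = allele_freq_from_genotypes(50, 40, 10)
p2 = allele_freq_from_genotypes(10, 40, 50)
print(p1, p2, fst_two_populations(p1, p2))
```

With these counts the allele frequencies are 0.7 and 0.3, HS is 0.42, HT is 0.50, and FST comes out at about 0.16.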

Variance Form

F_{ST} = \frac{\sigma^2_S}{\sigma^2_T}

Where the numerator is the variance in allele frequency among subpopulations and the denominator is the total variance of allelic state across the entire population. This can also be expressed as the among-subpopulation variance divided by p(1-p), where p is the overall allele frequency.
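
For a biallelic locus with equally weighted subpopulations, the variance form and the heterozygosity form give the same number, because HT - HS equals twice the among-subpopulation variance in p and HT equals 2p(1-p). The sketch below checks this numerically. The function names and frequencies are made up for illustration, and the variance is the population variance (divided by the number of subpopulations, not by that number minus one).

```python
def fst_variance_form(freqs):
    """Among-subpopulation variance of p divided by p_bar * (1 - p_bar)."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    var_among = sum((p - p_bar) ** 2 for p in freqs) / k  # population variance
    total_var = p_bar * (1 - p_bar)
    if total_var == 0:
        raise ValueError("FST is undefined when the locus is monomorphic overall")
    return var_among / total_var


def fst_heterozygosity_form(freqs):
    """(HT - HS) / HT with equal weights, for comparison."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    if h_t == 0:
        raise ValueError("FST is undefined when the locus is monomorphic overall")
    return (h_t - h_s) / h_t


freqs = [0.7, 0.3, 0.5]  # hypothetical allele A frequencies in three subpopulations
print(fst_variance_form(freqs), fst_heterozygosity_form(freqs))
```

Both functions return roughly 0.107 for these frequencies.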

Island Model (Migration-Drift Equilibrium)

F_{ST} = \frac{1}{1 + 4N_e m}

This form relates FST to effective population size (Ne) and migration rate (m) under Wright's island model. The product of Ne and m, usually abbreviated Nm, is the effective number of migrants entering each subpopulation per generation. When Nm is large, gene flow homogenizes populations and FST approaches 0. When Nm is small, drift dominates and FST approaches 1.

What Is the Fixation Index?

The fixation index (FST) is a measure of population differentiation due to genetic structure. Introduced by Sewall Wright in 1951 as part of his F-statistics framework, FST quantifies how much of the total genetic variation in a species exists between populations rather than within them. It is one of the most widely used statistics in population genetics, conservation biology, and evolutionary research.

FST belongs to a family of three hierarchical statistics Wright defined: FIS measures inbreeding within individuals relative to their subpopulation, FIT measures inbreeding of individuals relative to the total population, and FST measures differentiation of subpopulations relative to the total. These three statistics are related by the equation (1 - FIT) = (1 - FIS)(1 - FST).
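
The identity linking the three statistics can be checked numerically. The sketch below uses the common heterozygosity-based definitions, where HI (a symbol not used elsewhere on this page) is the mean observed heterozygosity of individuals, and HS and HT are the expected heterozygosities within subpopulations and in the total population. The input values are invented for illustration.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FIT, FST) from heterozygosities.

    h_i: mean observed heterozygosity of individuals (assumed symbol HI)
    h_s: mean expected heterozygosity within subpopulations
    h_t: expected heterozygosity of the total population
    """
    f_is = (h_s - h_i) / h_s
    f_it = (h_t - h_i) / h_t
    f_st = (h_t - h_s) / h_t
    return f_is, f_it, f_st


f_is, f_it, f_st = f_statistics(h_i=0.36, h_s=0.42, h_t=0.50)
print(f_is, f_it, f_st)

# (1 - FIT) = (1 - FIS)(1 - FST): both sides reduce to HI / HT.
print(1 - f_it, (1 - f_is) * (1 - f_st))
```

With these inputs FIS is about 0.143, FIT is 0.28, and FST is 0.16, and both sides of the identity equal 0.72.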

At a single biallelic locus with allele frequency p, the expected heterozygosity is 2p(1-p). FST compares this quantity calculated from overall allele frequencies (HT) against the average calculated separately within each subpopulation (HS). The difference between these two values, normalized by HT, gives the proportion of genetic variance attributable to population subdivision.

Wright's Interpretation Scale

Wright proposed qualitative guidelines for interpreting FST values that remain the standard reference in the field.

FST = 0 to 0.05: Little genetic differentiation. Populations exchange enough migrants to remain genetically cohesive. Most allele frequency differences are due to random sampling rather than true structure. Typical of continuously distributed species with high dispersal capacity, or populations separated recently.

FST = 0.05 to 0.15: Moderate differentiation. Some population structure exists but populations still share most genetic variation. Gene flow is present but insufficient to fully homogenize allele frequencies. Common in species with patchy habitat or moderate dispersal barriers.

FST = 0.15 to 0.25: Great differentiation. Populations have diverged substantially, with meaningful differences in allele frequencies. Gene flow is restricted. Populations may be on separate evolutionary trajectories if isolation persists.

FST > 0.25: Very great differentiation. Populations are highly isolated with minimal gene flow. Different alleles may be approaching fixation in different populations. Often observed between subspecies or across major biogeographic barriers.
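
The scale can be applied mechanically with a small lookup like the one below. The function name is hypothetical, and because the published ranges share their endpoints, the handling of exact boundary values (0.05 and 0.15 go to the higher category, 0.25 stays in "great") is an assumption of this sketch. Negative estimates are treated as little differentiation.

```python
def wright_category(fst):
    """Map an FST estimate to Wright's qualitative category."""
    if fst > 1:
        raise ValueError("FST cannot exceed 1")
    if fst < 0.05:          # includes negative estimates
        return "little"
    if fst < 0.15:
        return "moderate"
    if fst <= 0.25:         # the text reserves "very great" for FST > 0.25
        return "great"
    return "very great"


for value in (0.02, 0.10, 0.20, 0.40):
    print(value, wright_category(value))
```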

Reference FST Values Across Species

Published FST values across species provide context for interpreting results from any new study. These values reflect real genetic structure shaped by each organism's ecology, dispersal ability, and demographic history.

Humans (continental scale): Global FST across major continental groups is approximately 0.09 to 0.12 using neutral markers. Comparisons restricted to non-African populations yield roughly 0.047. Within-continent FST values range from 0.01 to 0.04. European populations show FST values below 0.01 between closely related groups (e.g., Danes vs. Dutch) and up to 0.07 between the most divergent groups (e.g., Sami vs. Sardinians).

Cheetahs: FST values between cheetah subspecies and subpopulations range from 0.22 to 0.50, reflecting severe historical bottlenecks and extreme habitat fragmentation. These are among the highest FST values documented in large mammals and directly inform conservation translocation decisions.

Atlantic cod: Marine fish with high dispersal capacity typically show low FST. Atlantic cod populations across the North Atlantic have FST values of 0.01 to 0.04, consistent with substantial larval dispersal and adult movement connecting populations. However, specific loci under selection (e.g., the Pan I locus) can show much higher differentiation.

Arabidopsis thaliana: As a predominantly self-fertilizing plant with limited seed dispersal, Arabidopsis shows FST values of 0.30 to 0.60 between regional populations, reflecting the combined effects of selfing (which reduces effective migration) and low physical dispersal. This makes it a model organism for studying the extremes of population structure in plants.

Drosophila melanogaster: Cosmopolitan fruit fly populations show FST values of 0.02 to 0.10 across continents, reflecting both human-mediated dispersal and the species' high effective population size, which slows drift-driven differentiation.

FST Estimators: Wright vs. Weir-Cockerham

The original Wright FST is a population parameter defined from true allele frequencies. In practice, allele frequencies must be estimated from finite samples, which introduces bias. Different estimators handle this sampling problem differently.

Wright's original FST: Calculated directly from observed allele frequencies as (HT - HS)/HT. This estimator is simple to compute but is biased upward when sample sizes are small, because sampling variance inflates the apparent among-population variance.

Nei's GST (1973): A generalization of FST to multiple alleles at a locus, defined from gene diversities (expected heterozygosities). The original 1973 formulation does not correct for finite sample size; that correction was added later by Nei and Chesser (1983). GST remains widely used but can underestimate differentiation when within-population heterozygosity is high, as shown by Hedrick (2005) and Jost (2008).

Weir and Cockerham's theta (1984): Uses an analysis of variance framework to partition genetic variation into among-population, among-individual-within-population, and within-individual components. This estimator is unbiased with respect to sample size and does not assume equal sample sizes across populations. It has become the standard estimator in most modern population genetics software (e.g., PLINK, VCFtools, Arlequin).

For genome-wide data with many SNPs, the ratio-of-averages estimator (averaging variance components across loci before taking the ratio) performs better than the average-of-ratios approach (computing FST per locus then averaging). The ratio-of-averages gives more weight to informative loci and is less sensitive to loci with low heterozygosity.
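
The difference between the two ways of combining loci can be sketched as follows. For simplicity the per-locus numerator and denominator here are the uncorrected HT - HS and HT for two equally weighted populations, which is an assumption of this example. In practice the same combining logic would be applied to the variance components of whichever estimator is used. The allele frequencies are invented.

```python
def per_locus_components(p1, p2):
    """Return (numerator, denominator) = (HT - HS, HT) for one biallelic locus."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return h_t - h_s, h_t


def ratio_of_averages(loci):
    """Sum numerators and denominators across loci, then take one ratio."""
    components = [per_locus_components(p1, p2) for p1, p2 in loci]
    total_num = sum(num for num, _ in components)
    total_den = sum(den for _, den in components)
    return total_num / total_den


def average_of_ratios(loci):
    """Compute FST per locus, then average; monomorphic loci are skipped."""
    components = [per_locus_components(p1, p2) for p1, p2 in loci]
    ratios = [num / den for num, den in components if den > 0]
    return sum(ratios) / len(ratios)


# Hypothetical (p1, p2) pairs; the second locus has very low heterozygosity.
loci = [(0.50, 0.40), (0.02, 0.00), (0.60, 0.45)]
print(ratio_of_averages(loci), average_of_ratios(loci))
```

In the ratio-of-averages, the low-heterozygosity locus contributes almost nothing to either sum, whereas in the average-of-ratios it counts as much as the two informative loci.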

Relationship Between FST and Gene Flow

Under Wright's island model, which assumes an infinite number of populations of equal size exchanging migrants at equal rates, FST reaches an equilibrium determined entirely by the product of effective population size and migration rate (Nm). The equilibrium relationship FST = 1/(1 + 4Nm) provides a way to estimate the number of migrants per generation from observed FST values: Nm = (1/FST - 1)/4.

Key thresholds from this relationship: when Nm = 0.25 (one migrant every four generations), equilibrium FST is 0.50. When Nm = 1 (one migrant per generation), FST is 0.20. When Nm = 5, FST drops to 0.048. When Nm = 10, FST is 0.024. This demonstrates that even very low levels of gene flow (just one migrant per generation) substantially reduce population differentiation.
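
The thresholds above follow directly from the equilibrium formula and can be reproduced with a few lines of code. The function names are hypothetical, and the conversion is only as good as the island-model assumptions behind it.

```python
def fst_from_nm(nm):
    """Equilibrium FST = 1 / (1 + 4Nm) under the island model."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1 / (1 + 4 * nm)


def nm_from_fst(fst):
    """Invert the island-model relationship: Nm = (1/FST - 1) / 4."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be in (0, 1] to convert to Nm")
    return (1 / fst - 1) / 4


for nm in (0.25, 1, 5, 10):
    print(nm, round(fst_from_nm(nm), 3))

print(nm_from_fst(0.20))  # about one migrant per generation
```

The loop prints 0.5, 0.2, 0.048, and 0.024, matching the figures quoted above.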

This translation from FST to Nm has important caveats. The island model assumes equal population sizes, symmetric migration, selective neutrality, and migration-drift equilibrium. Real populations violate all of these assumptions to varying degrees. Non-equilibrium conditions (recent bottlenecks, range expansions, secondary contact) can produce FST values that do not reflect current gene flow rates. Despite these limitations, the Nm estimate provides a useful first approximation and remains a standard reporting metric in conservation genetics.

Applications in Conservation Biology

Conservation biologists use FST to identify management units, prioritize populations for protection, and design genetic rescue programs. A population with high FST relative to other conspecific populations harbors unique genetic variation and may warrant independent management. Low FST between populations suggests they function as a single genetic unit and can be managed jointly.

Identifying conservation units: Populations with FST greater than 0.20 relative to others often qualify as evolutionarily significant units (ESUs) under many conservation frameworks. The U.S. Endangered Species Act considers distinct population segments partly on the basis of genetic differentiation metrics including FST.

Detecting isolation: Increasing FST over time in monitored populations signals declining connectivity, which may precede local extinction through inbreeding depression and loss of adaptive potential. Mountain gorilla populations, Florida panthers, and island fox populations on the California Channel Islands have all been assessed using FST to quantify their isolation and genetic health.

Genetic rescue planning: When isolated populations show high FST and signs of inbreeding depression, translocating individuals from genetically distinct populations can restore heterozygosity. The optimal source population balances genetic distinctiveness (sufficient FST to introduce new alleles) against excessive divergence (very high FST that might cause outbreeding depression). In practice, source populations with FST of 0.05 to 0.20 relative to the target are generally considered appropriate candidates.

Negative FST Values

Although FST is theoretically bounded between 0 and 1, estimated values can be negative. This occurs with sample-size-corrected estimators when the estimated within-population heterozygosity exceeds the estimated total-population heterozygosity, which typically happens when sample sizes are small or when the true FST is very close to zero. Negative estimates are not biologically meaningful and should be interpreted as indicating no significant genetic differentiation at that locus or across those populations. In genome-wide scans, negative per-locus FST values are common and expected; they average out when computing genome-wide estimates using the ratio-of-averages approach.
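
To see how a negative estimate arises, the sketch below uses Hudson's estimator in the form given by Bhatia et al. (2013), chosen here only because its sample-size correction is easy to read; it is not necessarily the estimator used by the calculator on this page. Here n1 and n2 are taken to be the number of allele copies sampled in each population, and the input values are invented.

```python
def hudson_fst(p1, n1, p2, n2):
    """Hudson's FST estimator for one biallelic locus (Bhatia et al. 2013 form).

    p1, p2: sample allele frequencies
    n1, n2: number of allele copies sampled in each population (assumption)
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each population needs at least two sampled alleles")
    # Squared frequency difference minus the expected sampling noise in each sample.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("FST is undefined when both samples are fixed for the same allele")
    return numerator / denominator


# Two small samples with identical allele frequencies: the correction
# terms push the numerator below zero.
estimate = hudson_fst(0.5, 20, 0.5, 20)
print(estimate)
```

The estimate here is about -0.053. For genome-wide summaries, per-locus values like this are normally left as they are and combined through the ratio-of-averages rather than being set to zero first.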

Limitations and Modern Alternatives

FST has known limitations that have driven the development of alternative statistics. Hedrick's G'ST (2005) and Jost's D (2008) both address the problem that FST is constrained to low values when within-population heterozygosity is high, even if populations share no alleles. Because FST = 1 - HS/HT and HT cannot exceed 1, FST can never be larger than 1 - HS. For highly polymorphic markers like microsatellites, where HS can exceed 0.8, FST can therefore be substantially lower than the actual level of differentiation.
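
The constraint is easy to demonstrate with two hypothetical populations that share no alleles at a highly polymorphic locus. The sketch below computes GST, Hedrick's G'ST (GST divided by its maximum for the observed HS), and Jost's D from allele frequencies, with equal population weights and no sample-size correction. The function names and data are illustrative only.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for a multi-allelic locus."""
    return 1 - sum(f * f for f in freqs)


def differentiation_stats(pop_freqs):
    """Return GST, Hedrick's G'ST and Jost's D for k populations.

    pop_freqs: one allele-frequency list per population, all over the same alleles.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    h_s = sum(gene_diversity(f) for f in pop_freqs) / k
    pooled = [sum(f[a] for f in pop_freqs) / k for a in range(n_alleles)]
    h_t = gene_diversity(pooled)
    g_st = (h_t - h_s) / h_t
    g_st_max = (k - 1) * (1 - h_s) / (k - 1 + h_s)   # Hedrick (2005)
    jost_d = (h_t - h_s) / (1 - h_s) * k / (k - 1)   # Jost (2008)
    return {"GST": g_st, "G'ST": g_st / g_st_max, "D": jost_d}


# Each population has 10 equally frequent alleles; none are shared.
pop1 = [0.1] * 10 + [0.0] * 10
pop2 = [0.0] * 10 + [0.1] * 10
stats = differentiation_stats([pop1, pop2])
print(stats)
```

Here HS is 0.90 and HT is 0.95, so GST is only about 0.05 even though the populations are completely distinct, while G'ST and D are both approximately 1.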

PhiST extends the FST framework to incorporate molecular distance between alleles (e.g., number of mutational steps between microsatellite alleles), capturing information that FST ignores. For sequence data, FST can be computed from nucleotide diversity (pi) within and between populations, but this conflates ancestral polymorphism with post-divergence differentiation.

Despite these alternatives, FST remains the default reporting metric in most population genetics studies because of its long history, intuitive interpretation, and direct connection to Wright's theoretical framework. Modern best practice is to report FST alongside at least one complementary measure (D, G'ST, or a model-based clustering analysis like STRUCTURE or ADMIXTURE) to provide a more complete picture of population differentiation.