Advanced Breeding Calculations

The Mathematical Foundation of Breeding Success

While intuition and experience guide many breeding decisions, quantitative genetics provides powerful tools for predicting outcomes and optimizing selection strategies. Understanding key statistical concepts and calculations enables small-scale breeders to make more informed decisions, estimate genetic progress, and design more effective breeding programs.

This article demystifies essential breeding calculations, providing practical tools that can be applied with basic spreadsheet software or simple calculators. We’ll focus on concepts most relevant to small breeding operations while maintaining statistical rigour.

Fundamental Statistical Concepts

Population Parameters and Sample Statistics

Understanding the relationship between populations and samples is crucial for applying breeding calculations correctly. Small breeders typically work with limited sample sizes, making this distinction particularly important.

Population vs. Sample:

  • Population: All possible individuals of a particular variety or breeding group
  • Sample: The subset of individuals actually observed and measured
  • Sample size effects: Smaller samples have greater uncertainty
  • Confidence intervals: Range of likely true values given sample limitations

Key Statistical Measures:

  • Mean (average): Sum of values divided by number of observations
  • Variance: Average squared deviation from the mean (for a sample, divide the sum of squared deviations by n - 1)
  • Standard deviation: Square root of variance, same units as original measurements
  • Coefficient of variation: Standard deviation divided by mean, useful for comparing variability
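As a minimal sketch, the four measures above can be computed with Python's standard library. The CBD values are hypothetical and serve only as an illustration; `statistics.variance` and `statistics.stdev` use the sample (n - 1) form, which is the appropriate choice when the plants measured are a sample from a larger population.

```python
import statistics

# Hypothetical CBD measurements (% dry weight) from eight plants
cbd = [11.2, 12.8, 9.5, 14.1, 12.0, 13.3, 10.7, 12.4]

mean = statistics.mean(cbd)
variance = statistics.variance(cbd)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(cbd)       # sample standard deviation, same units as the data
cv = std_dev / mean                   # coefficient of variation (unitless)

print(f"mean={mean:.2f}  variance={variance:.2f}  sd={std_dev:.2f}  CV={cv:.1%}")
```

Because the coefficient of variation is unitless, it can be compared between traits measured on different scales, such as yield in grams and cannabinoid content in percent.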

Distributions and Normality

Most quantitative traits in cannabis follow approximately normal (bell-shaped) distributions, which enables the use of standard statistical methods.

Normal Distribution Properties:

  • Symmetrical: Equal areas above and below the mean
  • 68-95-99.7 rule: Approximately 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, respectively
  • Standardization: Converting to z-scores for comparison across traits
  • Outlier identification: Values beyond 2-3 standard deviations warrant investigation
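The standardization and outlier check described above could be sketched as follows. The yield figures are hypothetical, and the cut-off of 2 standard deviations is just one of the values in the 2-3 range mentioned above, not a fixed rule.

```python
import statistics

def z_scores(values):
    """Standardize values: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

# Hypothetical yields (g per plant); the last value is deliberately extreme
yields = [82, 88, 79, 91, 85, 84, 90, 87, 83, 86, 81, 140]

# Keep every plant whose yield lies more than 2 standard deviations from the mean
flagged = [(y, round(z, 2)) for y, z in zip(yields, z_scores(yields)) if abs(z) > 2]
print(flagged)
```

In small samples an extreme value also inflates the standard deviation it is judged against, so a flagged plant is a prompt to check the measurement and growing conditions rather than proof of an error.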

Practical Applications:

  • Selection intensity: Proportion of population selected affects genetic gain
  • Truncation selection: Selecting above/below specific thresholds
  • Multiple trait selection: Balancing improvement across several characteristics
  • Response prediction: Estimating genetic progress from selection decisions

Understanding Heritability

Concept and Components

Heritability quantifies the proportion of observed variation that results from genetic rather than environmental factors. This fundamental concept guides selection strategies and predicts response to breeding efforts.

Variance Components:

Total Phenotypic Variance (VP) = Genetic Variance (VG) + Environmental Variance (VE)

Broad-sense heritability (H²) = VG / VP = VG / (VG + VE)
Narrow-sense heritability (h²) = VA / VP, where VA is the additive portion of VG

Types of Heritability:

  • Broad-sense heritability (H²): Includes all genetic effects
  • Narrow-sense heritability (h²): Only additive genetic effects
  • Realized heritability: Calculated from actual selection responses
  • Breeding applications: Narrow-sense heritability most useful for prediction

Estimating Heritability

Small breeders can estimate heritability using several practical approaches, each with different data requirements and accuracy levels.

Parent-Offspring Regression:

Method: Plot parent values (x-axis) vs. offspring averages (y-axis)
Heritability = 2 × slope of regression line (for single parent)
Heritability = slope of regression line (for mid-parent values)

Example calculation:
- Mid-parent values: 15%, 18%, 12%, 20% CBD
- Offspring averages: 14%, 16%, 13%, 17% CBD
- Regression slope ≈ 0.52
- Heritability ≈ 0.52 or 52% (a rough estimate, since four families leave a wide margin of error)
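For readers who prefer to check the regression by script rather than by chart, here is a minimal sketch that fits an ordinary least-squares line to the four data pairs listed in the example. The function name `regression_slope` is illustrative, not part of any breeding package.

```python
def regression_slope(x, y):
    """Least-squares slope of y on x: sum of cross-products / sum of squares of x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

parents = [15, 18, 12, 20]      # parent CBD %, from the example
offspring = [14, 16, 13, 17]    # offspring family averages, CBD %

slope = regression_slope(parents, offspring)
h2_mid_parent = slope           # if the x values are mid-parent averages
h2_single_parent = 2 * slope    # if the x values come from one parent only

print(f"slope={slope:.3f}  h2(mid-parent)={h2_mid_parent:.2f}  h2(single parent)={h2_single_parent:.2f}")
```

With these four pairs the least-squares slope is about 0.52. Doubling it for the single-parent case gives a value slightly above 1, which is not a possible heritability and shows how noisy an estimate from only four families can be.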

Full-Sib Family Analysis:

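Data needed: Multiple families with known relationships
Heritability ≈ 2 × VarBetween / (VarBetween + VarWithin)

Where:
- VarBetween = between-family variance component (variance of family averages minus VarWithin divided by the number of plants per family)
- VarWithin = average variance within families
- The result is an upper-bound estimate, because full sibs also share dominance effects and a common environment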
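A minimal sketch of a full-sib calculation, assuming a balanced design (the same number of plants in every family) and the standard intraclass-correlation approach, in which heritability is estimated as twice the share of total variance that lies between families. The family data and the function name are hypothetical.

```python
import statistics

def full_sib_heritability(families):
    """Estimate h2 as 2 x intraclass correlation from balanced full-sib families."""
    n = len(families[0])  # plants per family (balanced design assumed)
    if any(len(f) != n for f in families):
        raise ValueError("all families must have the same number of plants")
    family_means = [statistics.mean(f) for f in families]
    var_within = statistics.mean([statistics.variance(f) for f in families])
    # The variance of family means includes var_within / n of sampling noise; remove it
    var_between = max(statistics.variance(family_means) - var_within / n, 0.0)
    t = var_between / (var_between + var_within)  # intraclass correlation
    return 2 * t

# Hypothetical CBD % for three full-sib families of four plants each
families = [
    [10.0, 12.0, 11.0, 13.0],
    [12.0, 14.0, 13.0, 15.0],
    [11.0, 13.0, 12.0, 14.0],
]
h2 = full_sib_heritability(families)
print(f"h2 (upper-bound estimate) = {h2:.2f}")
```

Full sibs share dominance effects and often a common environment, so this figure tends to overstate narrow-sense heritability. Three families are far too few for a reliable estimate; the data only show the mechanics.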

Realized Heritability from Selection:

h² = Response to Selection (R) / Selection Differential (S)

Where:
- Response (R) = offspring mean minus the mean of the original population
- Selection Differential (S) = mean of selected parents minus the mean of the original population
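As a sketch, realized heritability can be computed from three means, assuming the usual textbook convention that both the response and the selection differential are measured from the mean of the original, unselected population, and that parents and offspring were grown under comparable conditions. The numbers are hypothetical.

```python
def realized_heritability(pop_mean, selected_mean, offspring_mean):
    """h2 = R / S, with both measured relative to the original population mean."""
    selection_differential = selected_mean - pop_mean   # S
    response = offspring_mean - pop_mean                # R
    if selection_differential == 0:
        raise ValueError("selected parents do not differ from the population mean")
    return response / selection_differential

# Hypothetical: population averages 12% CBD, selected parents 18%, their offspring 15%
h2 = realized_heritability(pop_mean=12.0, selected_mean=18.0, offspring_mean=15.0)
print(f"realized h2 = {h2:.2f}")
```

Here the offspring recover half of the parents' 6-point advantage, so the realized heritability is 0.5.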

Factors Affecting Heritability

Heritability is not a fixed property of a trait but varies with environmental conditions, population structure, and measurement methods.

Environmental Factors:

  • Uniform conditions: Higher heritability in controlled environments
  • Stress conditions: May reveal genetic differences more clearly
  • Measurement precision: More accurate measurements increase apparent heritability
  • Age/stage effects: Heritability may change during plant development

Population Factors:

  • Genetic diversity: More diverse populations show higher heritability
  • Inbreeding level: Affects genetic variance components
  • Population size: Smaller populations may show sampling effects
  • Selection history: Previous selection reduces genetic variance

Calculating Selection Response

Basic Selection Theory

Selection response predicts the genetic improvement achievable through breeding programs. Understanding these calculations helps optimize selection strategies and set realistic expectations.

Fundamental Equation:

Response (R) = heritability (h²) × Selection Differential (S) = h² × Selection Intensity (i) × population standard deviation (σP)

Where Selection Intensity (i) = (mean of selected parents - population mean) / population standard deviation

Selection Intensity Values:

  • Proportion selected: Determines selection intensity
  • 5% selected: i = 2.06 (very intense selection)
  • 10% selected: i = 1.76 (intense selection)
  • 20% selected: i = 1.40 (moderate selection)
  • 50% selected: i = 0.80 (mild selection)
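The tabulated intensities can be reproduced for any selected proportion. This sketch assumes truncation selection on a normally distributed trait in a large population; in that case the intensity equals the height of the standard normal curve at the truncation point divided by the proportion selected. The function name is illustrative.

```python
from statistics import NormalDist

def selection_intensity(proportion_selected):
    """Mean of the selected tail of a standard normal, in standard deviation units."""
    if not 0 < proportion_selected < 1:
        raise ValueError("proportion_selected must be between 0 and 1")
    std_normal = NormalDist()                                 # mean 0, sd 1
    threshold = std_normal.inv_cdf(1 - proportion_selected)   # truncation point
    return std_normal.pdf(threshold) / proportion_selected

for p in (0.05, 0.10, 0.20, 0.50):
    print(f"{p:.0%} selected: i = {selection_intensity(p):.3f}")
```

When only a few dozen plants are grown, the achievable intensity is somewhat lower than these large-population values, so they are best read as upper limits.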

Practical Selection Calculations

Single-Trait Selection Example:

Scenario: Selecting for CBD content
- Population mean: 12% CBD
- Population standard deviation: 3% CBD
- Selected parent mean: 18% CBD
- Estimated heritability: 0.60

Calculation:
Selection Intensity = (18 - 12) / 3 = 2.0
Expected Response = 0.60 × 2.0 × 3 = 3.6 percentage-point CBD increase
Predicted offspring mean = 12 + 3.6 = 15.6% CBD
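The same arithmetic as a short script, using the figures from the scenario. It is a sketch of the simplest form of the breeder's equation and shows that multiplying heritability by the raw parent-population difference gives the same answer as multiplying heritability, standardized intensity, and standard deviation.

```python
def predicted_response(h2, pop_mean, selected_mean):
    """Breeder's equation in its simplest form: R = h2 x (selected mean - population mean)."""
    return h2 * (selected_mean - pop_mean)

pop_mean, pop_sd, selected_mean, h2 = 12.0, 3.0, 18.0, 0.60

intensity = (selected_mean - pop_mean) / pop_sd             # standardized selection intensity
response = predicted_response(h2, pop_mean, selected_mean)  # in trait units (CBD percentage points)
response_via_intensity = h2 * intensity * pop_sd            # same quantity, written with i

print(f"i={intensity:.2f}  R={response:.1f}  predicted offspring mean={pop_mean + response:.1f}")
```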

Multi-Trait Selection:

When selecting for multiple traits simultaneously:

Response in trait X = ix × hx × σx × (correlation with selection index)

Where:
- hx = square root of the heritability of trait X
- ix = selection intensity applied to the index
- σx = phenotypic standard deviation of trait X
- Correlation = correlation between the breeding value for trait X and the index score; it depends on trait relationships and relative emphasis

Correlated Response

When selecting for one trait, related traits may also change due to genetic correlations. Understanding these relationships helps predict and manage breeding outcomes.

Correlated Response Formula:

CRy = hy × hx × rG × ix × σy

Where:
- CRy = correlated response in trait Y
- hy, hx = square roots of the heritabilities of traits Y and X
- rG = genetic correlation between traits
- ix = selection intensity for trait X
- σy = standard deviation of trait Y

Practical Example:

Selecting for THC content affects CBD content:
- THC heritability: 0.70
- CBD heritability: 0.65
- Genetic correlation (THC-CBD): -0.80 (negative correlation)
- Selection intensity for THC: 1.50
- CBD standard deviation: 2.5%

Correlated response in CBD = √0.65 × √0.70 × (-0.80) × 1.50 × 2.5 ≈ -2.0%
Selecting for higher THC reduces CBD by approximately 2.0 percentage points
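A sketch of the correlated-response calculation with the inputs from the practical example. It uses the textbook form of the formula, in which h for each trait is the square root of its heritability, so the function takes heritabilities and applies the square roots itself. The function name is illustrative.

```python
import math

def correlated_response(h2_x, h2_y, r_g, intensity_x, sd_y):
    """CR_Y = i_X * h_X * h_Y * r_G * sd_Y, where h = sqrt(heritability)."""
    h_x = math.sqrt(h2_x)
    h_y = math.sqrt(h2_y)
    return intensity_x * h_x * h_y * r_g * sd_y

# THC is the selected trait (X); CBD is the correlated trait (Y)
cr_cbd = correlated_response(h2_x=0.70, h2_y=0.65, r_g=-0.80, intensity_x=1.50, sd_y=2.5)
print(f"correlated response in CBD = {cr_cbd:.2f} percentage points")
```

Taking the square roots matters: multiplying the heritabilities directly, without square roots, would understate the size of the correlated change.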

Advanced Selection Strategies

Selection Index Methods

Selection indices combine multiple traits into a single score, enabling systematic improvement of several characteristics simultaneously.

Simple Selection Index:

Index = w1 × T1 + w2 × T2 + w3 × T3 + ...

Where:
- w = weight assigned to each trait
- T = trait values brought to a common scale (z-scores, or division by a reference value as in the example below)

Example: Quality index for cannabis
Index = 0.4 × (THC/20) + 0.3 × (Yield/100) + 0.3 × (Disease resistance/10)
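To illustrate how such an index ranks candidates, this sketch applies the weights and scaling divisors from the quality index example to three hypothetical plants. The plant names and measurements are invented for the illustration.

```python
def quality_index(thc_pct, yield_g, resistance_score):
    """Weighted index from the example: each trait is scaled by a reference value, then weighted."""
    return 0.4 * (thc_pct / 20) + 0.3 * (yield_g / 100) + 0.3 * (resistance_score / 10)

# Hypothetical candidates: (THC %, yield in g, disease-resistance score out of 10)
plants = {
    "A": (22.0, 80.0, 6.0),
    "B": (18.0, 110.0, 8.0),
    "C": (20.0, 95.0, 9.0),
}

# Rank plant names from highest to lowest index score
ranked = sorted(plants, key=lambda name: quality_index(*plants[name]), reverse=True)
for name in ranked:
    print(name, round(quality_index(*plants[name]), 3))
```

Plant A has the highest THC but ranks last, because the index rewards balanced performance across all three traits. Changing the weights changes the ranking, so the weights should reflect the breeding goal before any plants are scored.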

Economic Selection Index:

Index = Σ (economic value × breeding value)

Requires:
- Economic values for each trait
- Breeding values (genetic merit estimates)
- Consideration of trait relationships

Threshold Trait Analysis

Some cannabis traits are “all-or-nothing” characteristics (like sex expression or disease resistance) that require special analytical approaches.

Liability Scale Concept:

  • Underlying liability: Continuous genetic scale
  • Threshold effect: Observable trait appears when liability exceeds threshold
  • Heritability calculation: Requires special formulas and assumptions
  • Selection methods: Focus on individuals near decision boundaries

Practical Applications:

  • Sex ratio: Selecting for stable sex expression
  • Disease resistance: Binary resistance/susceptibility traits
  • Autoflowering: Present/absent photoperiod sensitivity
  • Hermaphrodite tendency: Stress-induced sex reversal

Optimizing Small-Scale Breeding Programs

Population Size Considerations

Small breeding populations face unique challenges related to genetic drift, inbreeding, and selection accuracy.

Effective Population Size:

Ne = 4 × Nm × Nf / (Nm + Nf)

Where:
- Ne = effective population size
- Nm = number of males contributing to next generation
- Nf = number of females contributing to next generation

Example:
Using 3 males and 12 females:
Ne = 4 × 3 × 12 / (3 + 12) = 9.6 ≈ 10 individuals
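The formula is easy to explore for different parent combinations. This sketch compares three hypothetical ways of choosing 15 parents, including the 3 male and 12 female case from the example.

```python
def effective_population_size(n_males, n_females):
    """Ne = 4 * Nm * Nf / (Nm + Nf) for unequal numbers of breeding males and females."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("need at least one male and one female parent")
    return 4 * n_males * n_females / (n_males + n_females)

for males, females in [(3, 12), (1, 14), (7, 8)]:
    ne = effective_population_size(males, females)
    print(f"{males} males + {females} females -> Ne = {ne:.1f}")
```

All three options use 15 parents, yet relying on a single male gives an effective size below 4, while a nearly even sex ratio keeps it close to 15. The rarer sex sets the limit.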

Minimizing Inbreeding:

  • Breeding system design: Avoid repeated use of same parents
  • Generation overlap: Maintain multiple age classes when possible
  • Outcrossing: Periodic introduction of new genetics
  • Record keeping: Track relationships to avoid close inbreeding

Selection Accuracy

Improving selection accuracy maximizes genetic progress while working within small population constraints.

Factors Affecting Accuracy:

  • Heritability: Higher heritability improves accuracy
  • Number of observations: Multiple measurements per individual
  • Family information: Use of relatives’ performance data
  • Environmental control: Reduce environmental variation

Practical Improvements:

  • Replicated testing: Multiple plants per genotype
  • Standardized conditions: Consistent growing environments
  • Precise measurements: Accurate phenotyping methods
  • Statistical analysis: Proper data analysis techniques

Practical Implementation Tools

Spreadsheet Templates

Creating standardized spreadsheet templates facilitates consistent calculations and data management.

Basic Template Components:

Breeding Calculations Spreadsheet:
1. Data entry section:
   - Individual IDs
   - Parent information
   - Trait measurements
   - Environmental conditions

2. Summary statistics:
   - Means and standard deviations
   - Variance calculations
   - Heritability estimates

3. Selection tools:
   - Ranking functions
   - Selection intensity calculations
   - Response predictions

4. Visualization:
   - Charts and graphs
   - Trend analysis
   - Progress tracking

Example Formulas:

Excel/Google Sheets formulas:
- Mean: =AVERAGE(range)
- Standard deviation: =STDEV(range)
- Variance: =VAR(range)
- Correlation: =CORREL(range1, range2)
- Selection intensity: =(AVERAGE(selected_range) - AVERAGE(population_range))/STDEV(population_range)

Software Recommendations

While complex statistical software exists, small breeders can accomplish most calculations using accessible tools.

Entry-Level Software:

  • Excel/Google Sheets: Adequate for basic calculations
  • R (free): Powerful but requires learning
  • GraphPad Prism: User-friendly scientific analysis
  • JASP (free): Point-and-click statistical analysis

Specialized Breeding Software:

  • Breeding Assistant: Designed for plant breeding
  • GGT Biplot: Genotype-environment analysis
  • PlantBreeding R package: Comprehensive breeding tools
  • TASSEL: Genetics and genomics analysis

Case Studies and Examples

Example 1: Yield Improvement Program

Scenario: Improving flower yield in indoor cultivation

  • Initial population mean: 85g per plant
  • Selected parent mean: 110g per plant
  • Population standard deviation: 15g
  • Estimated heritability: 0.45

Calculations:

Selection intensity = (110 - 85) / 15 = 1.67
Expected response = 0.45 × 1.67 × 15 = 11.3g
Predicted F1 mean = 85 + 11.3 = 96.3g per plant

Expected improvement: 13.3% increase in first generation

Example 2: Cannabinoid Profile Optimization

Scenario: Developing 1:1 THC:CBD cultivar

  • Current THC mean: 18% (target: 12%)
  • Current CBD mean: 4% (target: 12%)
  • Genetic correlation: -0.75
  • THC heritability: 0.70
  • CBD heritability: 0.60

Strategy:

Simultaneous selection approach:
1. Calculate selection index weights
2. Account for negative correlation
3. Predict correlated responses
4. Adjust selection criteria accordingly

This requires iterative calculations and careful balance of conflicting objectives.

Example 3: Disease Resistance Breeding

Scenario: Improving powdery mildew resistance

  • Trait type: Threshold (resistant/susceptible)
  • Population frequency: 25% resistant
  • Selection goal: 80% resistant (number of generations to be estimated)
  • Estimated liability heritability: 0.40

Approach:

1. Convert frequencies to liability scale
2. Calculate selection thresholds
3. Predict response on liability scale
4. Convert back to frequency scale
5. Estimate generations required for target frequency
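The five steps above can be sketched under strong simplifying assumptions: liability is normally distributed with a standard deviation of 1, only resistant plants are used as parents, and heritability and variance stay constant from one generation to the next. The function name is illustrative, and real programs will deviate from this idealized model.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability assumed normal with mean 0 and sd 1

def next_generation_frequency(freq_resistant, h2_liability):
    """One generation of breeding only from resistant plants, on the liability scale."""
    # Steps 1-2: threshold (in sd units) that leaves freq_resistant of the population above it
    threshold = STD_NORMAL.inv_cdf(1 - freq_resistant)
    # Mean liability of the resistant group = selection differential in sd units
    selection_differential = STD_NORMAL.pdf(threshold) / freq_resistant
    # Step 3: response on the liability scale
    response = h2_liability * selection_differential
    # Step 4: the population mean shifts by `response` while the threshold stays put
    return 1 - STD_NORMAL.cdf(threshold - response)

# Step 5: repeat until the target frequency from the scenario is reached
freq, generations = 0.25, 0
while freq < 0.80:
    freq = next_generation_frequency(freq, h2_liability=0.40)
    generations += 1
    print(f"generation {generations}: {freq:.0%} resistant")
```

Under these assumptions the first generation moves from 25% to roughly 43% resistant, and the gain per generation shrinks as resistance becomes common, so the 80% target takes several generations of sustained selection rather than one.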

Common Pitfalls and Limitations

Statistical Assumptions

Understanding the limitations of breeding calculations helps avoid overconfidence and unrealistic expectations.

Key Assumptions:

  • Normal distributions: Many traits may not be perfectly normal
  • Linear relationships: Parent-offspring relationships assumed linear
  • Stable heritability: Values may change with environment or population
  • Random mating: Deviations affect variance component estimates

Practical Considerations:

  • Sample size effects: Small samples reduce reliability
  • Environmental interactions: Genotype × environment effects
  • Non-additive genetics: Dominance and epistasis complicate predictions
  • Long-term effects: Selection changes population parameters over time

Managing Expectations

Breeding calculations provide estimates, not guarantees. Environmental factors, random events, and genetic complexities can cause actual results to differ from predictions.

Realistic Expectations:

  • Prediction intervals: Actual values likely within reasonable range
  • Multiple generations: Major changes require sustained selection
  • Population maintenance: Balance improvement with genetic diversity
  • Validation: Compare predictions with actual outcomes

Integration with Modern Breeding

Genomic Information

As genetic testing becomes more accessible, integrating genomic data with traditional breeding calculations enhances selection accuracy.

Genomic Selection:

  • Marker-assisted selection: Use DNA markers for trait prediction
  • Genomic breeding values: Combine marker and phenotypic information
  • Selection accuracy: Often higher than phenotype-only selection
  • Early selection: Enable selection before trait expression

Data Management

Effective breeding programs require systematic data collection and management systems that support statistical analysis.

Database Design:

  • Standardized formats: Consistent data entry procedures
  • Relationship tracking: Maintain pedigree information
  • Quality control: Error checking and validation procedures
  • Analysis integration: Connect data collection to statistical software

Future Directions

Emerging Technologies

New technologies continue to improve the precision and efficiency of breeding calculations and selection methods.

Technological Advances:

  • High-throughput phenotyping: Automated trait measurement
  • Machine learning: Pattern recognition and prediction improvement
  • Cloud computing: Access to advanced statistical methods
  • Mobile applications: Field data collection and analysis

Accessibility Improvements:

  • User-friendly interfaces: Simplified software for non-statisticians
  • Educational resources: Training materials and tutorials
  • Collaborative platforms: Shared tools and databases
  • Cost reduction: Decreasing technology costs improve access

Conclusion

Advanced breeding calculations provide powerful tools for optimizing selection decisions and predicting genetic progress. While these methods require statistical understanding, small-scale breeders can successfully apply key concepts using accessible software and simplified approaches.

The most important principles are understanding heritability, calculating selection response, and managing expectations about prediction accuracy. These tools enable more systematic breeding approaches and help optimize limited resources for maximum genetic improvement.

Start with basic concepts like heritability estimation and selection response calculations. Build experience with simple examples before attempting complex multi-trait selection strategies. Remember that breeding calculations guide decisions but cannot replace sound breeding judgment and practical experience.

As cannabis genetics research advances and technology costs decrease, sophisticated breeding tools become increasingly accessible to small-scale operations. Investing time in understanding these concepts now positions breeding programs for future success and competitive advantage in an evolving industry.




[This post assumes legal hemp/cannabis breeding in compliance with all applicable local laws and regulations.]
