Objective 1: Increasing selection intensity and decreasing non-genetic sources of variability through improved progeny row testing
Within the existing pipelines of the co-investigators’ breeding programs, progeny row selection will take place as usual. In addition, breeders will participate in a selection experiment that compares the agronomic performance of lines selected by breeders using their usual methods with that of lines selected through prediction of yield performance using new sources of data and information. The selection experiment will be conducted in all states, with equal numbers of lines selected from each breeding program and testing coordinated across locations. Specifically, in years 1 and 2, approximately 5000 progeny rows in each breeding program will be considered for selection using at least three selection categories. Each breeding program is responsible for the general management and experimental design of a typical progeny row test. In years 2 and 3, each program will plant coordinated preliminary yield trials to test the performance of lines within the selection categories. Germplasm will advance each year, so forward breeding progress will continue during the experiment.
Purpose: Evaluate increases in the rate of genetic gains attained by selecting in early stages of the breeding pipeline using data integrated from various sources.
Objective 1 deliverables:
• Observed data, selection information, pedigree and plot layout (range-row information), and shipment of seed for planting for breeders at 11 locations.
• An overall data management plan and a preliminary analytical pipeline, implemented and coordinated by the Rainey lab.
• All breeders’ lines ranked simultaneously for yield breeding value, maturity prediction and a metric of diversity. Rankings that use the additional sources of information may show a higher rank-order correlation with performance in the preliminary yield trials.
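The rank-order comparison mentioned in the last deliverable could be computed along the following lines. This is only a minimal sketch, assuming Spearman's rank correlation is the measure of interest; the function names and the example values (predicted values for five lines and their later preliminary yield trial means) are hypothetical and are not taken from the project.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five lines (illustration only).
predicted_value = [52.1, 48.3, 55.0, 50.2, 47.9]   # progeny row prediction
pyt_mean_yield = [51.0, 49.5, 54.2, 48.8, 47.0]    # preliminary yield trial
rho = spearman(predicted_value, pyt_mean_yield)
```

Computing the same statistic for each selection category would give one simple basis for comparing the categories against the breeders' usual selections.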
Objective 2: Increasing selection coefficient and decreasing length of breeding cycle through genomic selection
This objective will continue and make use of currently and previously funded projects on soybean genomic selection that have produced data sets pertinent to testing of this methodology and its optimization. These projects will be rolled into this single objective, and additional activities, such as the development of ultra-cheap, low-density genotyping, will be incorporated to move this methodology forward for the soybean breeding community. The overall aim of this objective is to develop, test, and make available genomic prediction tools for public soybean breeders in the North Central Region.
Objective 2 deliverables:
• A community resource for genomic prediction consisting of a set of soybean lines that can be used to establish genomic prediction to help expedite genetic gain for yield.
• Novel inexpensive and rapid genotyping methods that can be used for genomic prediction and selection.
• Genotype imputation methods and tools to increase marker density and allow connection of data sets collected using diverse genotyping platforms.
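For readers unfamiliar with genomic prediction, the sketch below shows one common form of it: ridge regression of phenotypes on marker genotypes, followed by prediction of untested lines from their markers alone. It is an illustration under stated assumptions, not the models this objective will develop. The marker coding (-1, 0, 1), the fixed shrinkage parameter `lam`, and all function names are hypothetical; in practice the shrinkage is usually estimated from variance components and the work is done with dedicated statistical software.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam):
    """Estimate an intercept and shrunken marker effects.

    genotypes: one row per line, one column per marker (coded -1/0/1).
    lam: ridge penalty (assumed fixed here; must be > 0).
    """
    n, p = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    xc = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    yc = [y - mean_y for y in phenotypes]
    # Normal equations for ridge regression: (X'X + lam * I) beta = X'y
    xtx = [[sum(xc[i][a] * xc[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    xty = [sum(xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    effects = solve_linear_system(xtx, xty)
    intercept = mean_y - sum(m * e for m, e in zip(col_means, effects))
    return intercept, effects


def predict(genotypes, intercept, effects):
    """Predict the value of each line from its marker genotypes."""
    return [intercept + sum(g * e for g, e in zip(row, effects))
            for row in genotypes]
```

A training set of phenotyped and genotyped lines would be passed to `ridge_marker_effects`, and `predict` would then rank new lines that have marker data only.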
Objective 3: Increasing additive genetic variance
This research addresses the fundamental limitation of soybean breeding and one of the key issues in the slow rate of soybean yield increase: lack of genetic diversity in the commercial soybean gene pool. We will address this limitation using three complementary approaches for short-, medium-, and long-range goals to significantly increase the rate of genetic gain by increasing the genetic diversity for yield – the additive genetic variance.
Objective 3 deliverables:
• A database of information on the success of parent combinations.
• Method development for selection of parents using genomic prediction models developed in Objective 2.
• High-quality, multi-environment yield and other agronomic performance data for 500 PIs in the USDA Soybean Germplasm Collection.
• Identify yield-marker genotype relationships based on association mapping results from the extensive, high-quality yield dataset.
• Develop predictive model(s) that allow selection of superior high-yield genotypes from the USDA germplasm collection.
• Incorporate high-throughput phenotype data, plant developmental data, and environment data in the models.
• High-yielding lines derived from wild soybean and G. tomentella.
• A list of candidate genomic regions and/or haplotypes associated with yield-related traits.
Objective 4: Development of a metric to estimate genetic gains on an annual basis
The purpose of this objective is to develop unbiased metrics of realized genetic gain (RGG) for soybean yield that can be implemented both within and across soybean breeding programs. The power, precision, accuracy and reliability of analytic methods will be affected by the underlying genetic architecture, as well as by the numbers of varieties (lines), families and environments in which the varieties are evaluated. The “stretch-goal” for the first year of this project is to evaluate the power, precision, accuracy and reliability of RGG estimates produced by four analytic methods applied to variety yields generated under ideal genetic architectures consisting of simple additive genetics. Ideal conditions will have to be simulated. Simulation is the standard way to assess analytic methods under ideal conditions because it shows the best performance we can expect from them; the methods will perform no better on real data. The objective criteria will be determined for numbers of varieties (lines), families and environments that are typical of the various stages of commercial variety development programs, as well as for the limited resources that are typical of public soybean breeding programs. In future years we will validate these and additional methods against the same criteria, using data from actual field trials conducted by public soybean breeders (objectives 1, 2, & 3) and by commercial soybean breeders.
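The simulation approach described above can be sketched as follows. This is a deliberately simple illustration, assuming purely additive genetic values that rise by a known amount per year and an estimator that regresses line mean yield on year; that estimator is just one possible choice and is not claimed to be one of the four methods to be evaluated. All names, default values, and the yield scale are hypothetical.

```python
import random
import statistics


def simulate_line_means(n_years, lines_per_year, n_envs, true_gain,
                        genetic_sd=1.0, error_sd=2.0, rng=None):
    """Simulate (year, mean yield) for lines tested in the same environments.

    Each line's genetic value is a base yield plus true_gain per year plus
    random additive genetic deviation; plots add environment and error effects.
    """
    if rng is None:
        rng = random.Random(0)
    env_effects = [rng.gauss(0.0, 3.0) for _ in range(n_envs)]
    records = []
    for year in range(n_years):
        for _ in range(lines_per_year):
            genetic_value = 50.0 + true_gain * year + rng.gauss(0.0, genetic_sd)
            plot_yields = [genetic_value + env + rng.gauss(0.0, error_sd)
                           for env in env_effects]
            records.append((year, statistics.fmean(plot_yields)))
    return records


def estimate_gain(records):
    """Estimate annual gain as the least-squares slope of yield on year."""
    mean_year = statistics.fmean(year for year, _ in records)
    mean_yield = statistics.fmean(value for _, value in records)
    sxy = sum((year - mean_year) * (value - mean_yield)
              for year, value in records)
    sxx = sum((year - mean_year) ** 2 for year, _ in records)
    return sxy / sxx


def evaluate_estimator(true_gain, n_reps=200, seed=1, **design):
    """Repeat the simulation; return (bias, standard deviation) of estimates."""
    rng = random.Random(seed)
    estimates = [
        estimate_gain(simulate_line_means(true_gain=true_gain, rng=rng, **design))
        for _ in range(n_reps)
    ]
    bias = statistics.fmean(estimates) - true_gain
    spread = statistics.stdev(estimates)
    return bias, spread
```

Calling `evaluate_estimator(0.5, n_years=10, lines_per_year=20, n_envs=4)` and then varying the numbers of lines and environments would show how accuracy (bias) and precision (spread) respond to the resources available at a given stage of testing.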
Objective 4 deliverables:
• Short videos describing history and future developments of genetic gain to non-experts.
• Establish objective criteria for evaluating methods that estimate realized genetic gains.
• Deposit public, commercial, and simulated data resources, with agreed-upon nomenclature and format rules, on a single shared file server.
• Simulation software for generating yields of potential varieties in various stages of field trials.
• Establish the potential range of field and laboratory resources that may be needed to achieve a realized genetic gain (RGG) 50% greater than the current RGG.