1. Response surface analyses showed that the largest differences among genomic selection methods appear when all phenotypic variability is due to an epistatic genetic architecture. Because no differences were found among genomic selection methods applied to the SoyNAM population, the genetic architecture is most likely additive, and any genomic selection method can be used to build a predictive model. The most accessible method is RRBLUP implemented in R.
2. Comparison of predicted and observed yield values showed that it was preferable to combine all data across the SoyNAM families into a single prediction model. Compared to single-family models, this approach increased predictive ability for yield by 44%. The within-family prediction accuracies ranged from 0.19 to 0.62. When prediction between and within families was tested simultaneously, predictive ability for yield was about 0.75.
3. Information from the genomic selection evaluations was then applied to select lines from active breeding populations. These breeding populations came from the University of Illinois, University of Nebraska, and Purdue University programs. During the summer of 2014, the project collaborators tested 7,500 experimental lines from 26 populations in non-replicated trials. These tests were grown in Illinois, Iowa, Indiana, and Nebraska. The lines were rated for maturity, plant height, and plant lodging, and they were harvested to measure seed yield and to provide seed for 2015 tests.
4. A total of 5,630 experimental lines from the 26 populations were genotyped using GBS technology. The sequence data consisted of 4.8 billion good sequencing reads, yielding about 480 billion base pairs of data. Individual lines were scored with 4,693 to 9,007 segregating markers, with an average of 6,652 markers. The genetic marker and yield data were used to select lines in the following categories:
• Greatest yield based on marker predictions.
• Greatest yield combined with acceptable protein levels based on marker predictions.
• Greatest yield based only on yield in the 2014 tests.
• Greatest yield based on a combination of marker predictions and 2014 test results.
• Random lines in each population.
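The selection categories above can be illustrated with a minimal sketch. It is not the project's actual procedure: the function name `select_lines`, the protein threshold `min_protein`, the number of lines per category `k`, and the equal weighting of standardized marker predictions and 2014 yields in the combined category are all assumptions made for illustration.

```python
import numpy as np

def select_lines(pred_yield, pred_protein, obs_yield, k, min_protein, seed=0):
    """Return selected line indices for each category within one population.

    All argument names are hypothetical. pred_* are marker-based predictions,
    obs_yield is the yield observed in the field test.
    """
    pred_yield = np.asarray(pred_yield, dtype=float)
    pred_protein = np.asarray(pred_protein, dtype=float)
    obs_yield = np.asarray(obs_yield, dtype=float)
    n = pred_yield.size

    def top_k(scores, eligible=None):
        # Rank eligible lines by descending score and keep the best k.
        idx = np.arange(n) if eligible is None else np.flatnonzero(eligible)
        return idx[np.argsort(-scores[idx], kind="stable")][:k]

    def zscore(x):
        sd = x.std()
        return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

    # Assumed combination rule: equal weight on standardized values.
    combined = 0.5 * zscore(pred_yield) + 0.5 * zscore(obs_yield)
    rng = np.random.default_rng(seed)
    return {
        "marker_yield": top_k(pred_yield),
        "marker_yield_protein": top_k(pred_yield, pred_protein >= min_protein),
        "observed_yield": top_k(obs_yield),
        "combined": top_k(combined),
        "random": np.sort(rng.choice(n, size=min(k, n), replace=False)),
    }
```

Including a random category gives a baseline against which the gain from each selection strategy can be measured in the following season.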
5. A total of 1,998 lines were selected according to the categories above and organized into field tests. These lines, together with parents and check varieties, were planted in non-replicated yield trials consisting of a total of 2,090 plots at each of four locations. These plots were evaluated for plant maturity, height, lodging, and seed yield. The results will be compared with the predicted values during the winter of 2016.
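The comparison of marker-predicted and observed yield that runs through this report can be sketched as follows. The sketch uses plain ridge regression on marker scores, which gives the same marker-effect estimates as RRBLUP once the shrinkage parameter is fixed. In RRBLUP that parameter is estimated from the data as a ratio of variance components, whereas here `lam` is a user-supplied, hypothetical value. Predictive ability is taken to be the Pearson correlation between cross-validated predictions and observations. The function names, the fold count, and the random fold assignment are assumptions, not the project's pipeline.

```python
import numpy as np

def ridge_marker_effects(Z, y, lam):
    """Estimate marker effects u from (Z'Z + lam*I) u = Z'(y - mean(y)).

    Z is a lines-by-markers matrix of marker scores; lam is an assumed,
    fixed shrinkage parameter (RRBLUP would estimate it from the data).
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                      # center each marker column
    mu = y.mean()
    p = Z.shape[1]
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, col_means, u

def predict(Z_new, mu, col_means, u):
    """Predict phenotypes for new lines from estimated marker effects."""
    return mu + (np.asarray(Z_new, dtype=float) - col_means) @ u

def predictive_ability(Z, y, lam, n_folds=5, seed=0):
    """Correlation between cross-validated predictions and observed values."""
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    preds = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        mu, col_means, u = ridge_marker_effects(Z[train_idx], y[train_idx], lam)
        preds[test_idx] = predict(Z[test_idx], mu, col_means, u)
    return float(np.corrcoef(preds, y)[0, 1])
```

Replacing the random folds with whole families held out would give a between-family estimate, and restricting the training and test lines to one family would give a within-family estimate.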