Updated December 2, 2021:
In this project we developed strategies and a computational approach that uses the genomic differences among soybean ancestors, elite lines, landraces, cultivars, and wild soybeans to create a master list of significant differences for every gene. Using bioinformatics and computational methods, we identified genomic differences across four large collections of sequenced soybean lines and developed the Soy775 panel, which consists of 775 soybean lines and includes a comprehensive list of SNP and indel positions that serves as our master catalog. We also built a new web-based “Allele Catalog” tool in SoyKB, where any user can query the Soy775 panel by gene name or soybean line accession number and retrieve the variant positions, the effect each change has on the gene, and the distribution of each variant across the categories described above. We have applied this approach to several traits that affect soybean yield and value. We analyzed the recently cloned major protein gene on chromosome 15 (Glyma.15g049200) and determined that one soybean accession in the Soy775 panel (PI374207) putatively contains a novel defunct allele of the gene that could be used in breeding programs to modulate seed protein content. The developments and discoveries made in this project have allowed us to leverage and expand existing whole-genome sequencing resources, tools, and technologies to put soybean at the forefront of crop improvement through applied genomics. We achieved all of our proposed objectives, which included generating new knowledge, research publications, automated pipelines, online tools, and a team of multidisciplinary researchers dedicated to advancing soybean applied genomics.
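To make the kind of query described above concrete, the following is a minimal sketch in Python of looking up variant calls by gene or by accession and tallying how each allele is distributed across line categories. It is not the SoyKB implementation: the `VariantCall` record, the function names, and every gene, line, position, and effect value shown are hypothetical placeholders chosen for illustration, not data from the Soy775 panel.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """One allele observed in one line (hypothetical record layout)."""
    gene: str
    accession: str
    category: str   # e.g. wild, landrace, elite
    position: int
    allele: str
    effect: str


# Placeholder records only; none of these values come from the Soy775 panel.
CATALOG = [
    VariantCall("GENE_A", "LINE_1", "wild", 1001, "T", "missense"),
    VariantCall("GENE_A", "LINE_2", "landrace", 1001, "T", "missense"),
    VariantCall("GENE_A", "LINE_3", "elite", 1001, "C", "reference"),
    VariantCall("GENE_A", "LINE_1", "wild", 1050, "-", "frameshift"),
    VariantCall("GENE_B", "LINE_2", "landrace", 2200, "G", "synonymous"),
]


def query_by_gene(catalog, gene):
    """Return every call recorded for the given gene name."""
    return [call for call in catalog if call.gene == gene]


def query_by_accession(catalog, accession):
    """Return every call recorded for the given line accession."""
    return [call for call in catalog if call.accession == accession]


def allele_distribution(calls):
    """Count lines per category for each (position, allele, effect)."""
    distribution = {}
    for call in calls:
        key = (call.position, call.allele, call.effect)
        distribution.setdefault(key, Counter())[call.category] += 1
    return distribution


if __name__ == "__main__":
    gene_calls = query_by_gene(CATALOG, "GENE_A")
    for key, counts in sorted(allele_distribution(gene_calls).items()):
        print(key, dict(counts))
```

In a table of this shape, an allele that appears in only one accession, such as the putative defunct allele reported for PI374207, would show up as a key whose counts sum to one.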