Project Details: Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean (2024)

Contributor/Checkoff:

North Central Soybean Research Program

Category:

Sustainable Production

Keywords:

Abiotic stress, Agriculture, Genomics

Parent Project:

Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean

Lead Principal Investigator:

Krishna Jagadish, Texas Tech University

Co-Principal Investigators:

Doina Caragea, Kansas State University
William Schapaugh, Kansas State University
Gunvant Patil, Texas Tech University
Glen Ritchie, Texas Tech University
Hamed Sari-Sarraf, Texas Tech University
Impa Somayanda, Texas Tech University
Henry Nguyen, University of Missouri
Avat Shekoofa, University of Tennessee-Institute of Agriculture

Project Code:

60065

Contributing Organization (Checkoff):

North Central Soybean Research Program

$400,156

Leveraged Funding (Non-Checkoff):

None

Institution Funded:

Texas Tech University

$400,156

Final Report

Brief Project Summary:

A 30 to 80% flower drop in soybeans across different U.S. regions is an unresolved and persistent bottleneck that has limited soybean's ability to achieve its full genetic yield potential. The multi-regional team will improve an image-based field phenotyping system, integrated with deep-learning tools, to capture genetic variation in flower abortion and pod retention at different scales, i.e., greenhouse and field conditions, under varying soil and management scenarios. Utilizing contrasting genotypes identified during summer 2023, we will discover molecular mechanisms controlling flower abortion under harsh climatic conditions. This knowledge will help discover molecular switches to enhance flower and pod retention, and enhance yield potential under diverse environmental conditions.

Key Beneficiaries:
#farmers, #geneticists, #physiologists, #public and private soybean improvement groups

Information And Results

Project Summary

A 30 to 80% flower drop in soybeans grown across different regions in the US is an unresolved and persistent bottleneck that has limited soybean's ability to achieve its full genetic yield potential. The major challenge has been the lack of robust, field-based, high-throughput phenotyping and analysis tools to capture temporal variation in flower abortion and pod retention across large, genetically diverse germplasm. The multi-regional (KS, MO, TN and TX) and trans-disciplinary team will develop an image-based field phenotyping system, integrated with deep-learning tools, to capture large genetic variation in flower abortion and pod retention under different soil and climatic conditions. A genetically diverse set of 50 genotypes, including late group III and early group IV, will be tested under natural dryland conditions in MO and KS, and under irrigated and severe drought and heat stress conditions in TX and TN. Using contrasting lines identified from year 1 data, the molecular mechanisms that control flower abortion and pod retention will be determined. This fundamental knowledge will help discover molecular switches to enhance flower and pod retention, and thereby enhance yield potential under diverse environmental conditions. The proposed project addresses the priority area Tools and Technology for Soybean Improvement and uses these tools to build Extreme Weather Resiliency. In summary, the overall goal is to increase flower and pod retention by 20 to 30%, with the potential to enhance yields by 10 to 15%, ultimately translating to an additional 400 million dollars for the national soybean industry.

Project Objectives

• Continue to explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a diverse set of landraces and elite genotypes
• Improve the image-based field phenotyping system and deep-learning tools to document temporal dynamics in flower abortion and pod retention in diverse soybeans grown under field conditions
• Identify molecular mechanisms controlling flower abortion under heat and drought conditions using contrasting genotypes that differ in proportion of flower abortion

Project Deliverables

• Range in phenotypic variation associated with flower abortion and pod retention in different maturity groups of soybean grown under different soil, moisture and climatic conditions determined.
• Field image-based phenotyping protocols established to track flowers and pods across all four participating institutes, with different rates of flower abortion captured
• Deep learning tool developed that can analyze field-based images collected across all four locations and extract temporal changes in flower numbers with minimal human intervention
• Candidate genes identified for flower abortion under favorable and drought and heat stress conditions.

Progress Of Work

Updated April 25, 2024:
Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean.

Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee.

Goals & Objectives

Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential.

Objectives (Year 2)
• Continue to explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a diverse set of landraces and elite genotypes.
• Improve the image-based field phenotyping system and deep-learning tools to document temporal dynamics in flower abortion and pod retention in diverse soybeans grown under field conditions.
• Identify molecular mechanisms controlling flower abortion under diverse climatic conditions.

Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a diverse set of landraces and elite genotypes
A diverse selection of 50 soybean genotypes, classified within maturity groups III and IV, was chosen based on maturity and lodging scores from the Year 1 multi-location studies. Seeds were provided by the University of Missouri for trials across multiple locations, namely Kansas State University (KS; Manhattan), Texas Tech University (TX; Lubbock), and the University of Tennessee (TN; Jackson), marking the beginning of the second experimental year of our project. All locations are currently preparing for planting, which is scheduled for May.

This year, a significant advancement in our methodology is the addition of QR codes (Figure 1) to all plot labels across all locations. This streamlines image processing and storage, facilitating efficient data management.

Each genotype will be planted in four-row plots with 30” between rows. Within each row, seeds will be planted at a density of 8 seeds per foot. To ensure robust and reliable results, each experimental setup will be replicated three times within each location.

At Lubbock, TX, we will use two distinct irrigation regimes, 100% ET (evapotranspiration) and 50% ET, leveraging the region's naturally hot summers to induce heat stress. These irrigation treatments will be applied via a sub-surface drip irrigation system. The drought stress treatment (50% ET) will be imposed specifically during the flowering stage, from its start to its end.

Meanwhile, in Jackson - TN, a greenhouse experiment will be conducted under stress conditions, focusing on four lines exhibiting low flower abortion rates and another four lines with high flower abortion percentages, based on year 1 findings. This targeted investigation aims to understand soybean response mechanisms that result in differential flower abortion under control and stress conditions.

Objective 2 – Improve the image-based field phenotyping system and deep-learning tools to document temporal dynamics in flower abortion and pod retention in diverse soybeans grown under field conditions.
At each location, uniform platforms for imaging are being constructed, ensuring consistency across all sites (Figure 2). The imaging process will start using a single GoPro Hero 11 camera, with additional cameras added as the crop matures, if necessary. Parameters for camera setup will remain consistent with those utilized in the previous year, ensuring continuity and comparability of data.

To streamline the data acquisition process, a Python program for generating QR codes and label IDs (Figure 1) has been developed for common use across all locations. Additionally, a program for detecting QR codes within the recorded videos has been successfully developed and will be integrated into the model in subsequent stages.
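The label-ID half of that workflow can be sketched in a few lines of Python. This is a minimal illustration, not the project's program: the `LOC-G##-R#` label format, the field names, and the helper functions are all hypothetical, and the QR encoding step itself is left out because the report does not name the library used. The default sizes (50 genotypes, three replicates per location) come from the trial design described above.

```python
import re
from typing import List, NamedTuple


class PlotLabel(NamedTuple):
    location: str   # two-letter site code, e.g. "TX", "KS", "MO", "TN"
    genotype: int   # 1-based entry number within the genotype panel
    replicate: int  # 1-based replicate number


# Hypothetical label format (an assumption, not taken from the report): LOC-G##-R#
_LABEL_RE = re.compile(r"^([A-Z]{2})-G(\d{2})-R(\d)$")


def make_label(location: str, genotype: int, replicate: int) -> str:
    """Build the text that would be encoded in a plot's QR code."""
    return f"{location}-G{genotype:02d}-R{replicate}"


def parse_label(text: str) -> PlotLabel:
    """Recover the plot fields from a decoded QR string."""
    match = _LABEL_RE.match(text)
    if match is None:
        raise ValueError(f"not a plot label: {text!r}")
    location, genotype, replicate = match.groups()
    return PlotLabel(location, int(genotype), int(replicate))


def labels_for_location(location: str, n_genotypes: int = 50,
                        n_replicates: int = 3) -> List[str]:
    """All plot labels for one site: one label per genotype per replicate."""
    return [
        make_label(location, g, r)
        for g in range(1, n_genotypes + 1)
        for r in range(1, n_replicates + 1)
    ]


labels = labels_for_location("TX")
print(len(labels), labels[0], labels[-1])
```

Keeping the label a short, fixed-width string means a decoded QR code can be validated with a single pattern match before a video is filed under a plot.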

The forthcoming task involves the development of a program designed to extract videos from the cameras and organize them into location-specific folders, incorporating field specifications such as row numbers and the number of cameras per row. This program will greatly simplify the process of video collection and enhance data management efficiency for the machine learning-based flower and pod tracking algorithms.
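One way such an organizer could work is sketched below. It is a hedged illustration only: the folder layout (`<root>/<location>/row_NNN/cam_N/`), the function names, and the assumption that clips arrive in recording order with the camera index cycling fastest within each row are all hypothetical, not details from the report.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_moves(videos: Iterable[str], root: str, location: str,
               cameras_per_row: int) -> List[Tuple[Path, Path]]:
    """Pair each source clip with a destination path.

    Assumes (hypothetically) that clips are listed in recording order and
    that every row was filmed by `cameras_per_row` cameras, camera 1 first.
    """
    if cameras_per_row < 1:
        raise ValueError("cameras_per_row must be at least 1")
    moves = []
    for index, src in enumerate(videos):
        row = index // cameras_per_row + 1
        camera = index % cameras_per_row + 1
        dst = (Path(root) / location / f"row_{row:03d}"
               / f"cam_{camera}" / Path(src).name)
        moves.append((Path(src), dst))
    return moves


def apply_moves(moves: List[Tuple[Path, Path]]) -> None:
    """Copy the clips into place, creating folders as needed."""
    for src, dst in moves:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)


# Dry run: show where four clips from a two-camera setup would go.
for src, dst in plan_moves(["a.MP4", "b.MP4", "c.MP4", "d.MP4"],
                           "data", "TX", cameras_per_row=2):
    print(src, "->", dst)
```

Separating the planning step from the copy step lets the destination paths be reviewed before any file is touched.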

Furthermore, significant effort has been made in refining the accuracy of the pod and flower tracking models through rigorous training processes, aimed at maximizing precision and reliability in data analysis.

Objective 3 - Identify molecular mechanisms controlling flower abortion under diverse climatic conditions.
To identify the molecular mechanism involved in flower abortion, we have selected the most contrasting accessions from FY23 multi-location trials for bulk transcriptomics analysis under controlled greenhouse conditions. Eight accessions, four lines exhibiting low flower abortion rates and another four lines demonstrating high flower abortion rates, will be evaluated under controlled greenhouse conditions to eliminate the variation associated with environmental factors.

The following tissues will be carefully harvested from the 3rd internode to keep developmental stage and flower position consistent: pedicel (stalk connecting to stem), flower bud, partially opened flower, fully opened flower, and flower after anthesis (initiation of senescence). Tissue samples will be collected periodically. Following sample collection, RNA will be extracted from at least 3 replicates/stage/sample and subjected to RNA sequencing by a commercial vendor. Approximately 144 samples (8 genotypes × 6 stages × 3 replications) will be collected. RNA-seq data will be analyzed using the standard pipeline to identify differentially expressed genes, as detailed in our publication Chen et al. (2016; PMID: 27486466).
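The sample count follows directly from the factorial design, and a sample sheet can be enumerated the same way. The sketch below is illustrative only: the genotype and stage identifiers (`G1`..`G8`, `S1`..`S6`) and the naming pattern are placeholders, not the accession names or stage labels used in the project.

```python
from itertools import product
from typing import List, Sequence


def build_sample_sheet(genotypes: Sequence[str], stages: Sequence[str],
                       n_replicates: int) -> List[str]:
    """One sample ID per genotype x stage x replicate combination."""
    return [
        f"{genotype}_{stage}_rep{rep}"
        for genotype, stage, rep in product(
            genotypes, stages, range(1, n_replicates + 1))
    ]


# Placeholder identifiers (assumptions for illustration only).
genotypes = [f"G{i}" for i in range(1, 9)]   # 8 genotypes
stages = [f"S{i}" for i in range(1, 7)]      # 6 stages
sheet = build_sample_sheet(genotypes, stages, n_replicates=3)

print(len(sheet))  # 8 * 6 * 3 = 144
```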

Final Project Results

Updated January 25, 2025:
Final report (January 2024 to December 2024)

Note – Figure and table numbers are retained here while all the figures and tables can be accessed through the PDF version of the final report for 2024

Project funded by North Central Soybean Research Program

Project title - Field phenotyping using machine learning tools integrated with genetic mapping to address heat and drought induced flower abortion in soybean.

Participating institutions – Texas Tech University, Kansas State University, University of Missouri, and University of Tennessee.

Goals & Objectives
Long-term Goal – Develop soybean cultivars with 20 to 30% lower flower abortion under favorable to challenging environmental conditions, leading to about 10-15% increase in yield potential.

Objectives (Year 2)
• Continue to explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a diverse set of landraces and elite genotypes.
• Improve the image-based field phenotyping system and deep-learning tools to document temporal dynamics in flower abortion and pod retention in diverse soybeans grown under field conditions.
• Identify molecular mechanisms controlling flower abortion under diverse climatic conditions.

Objective 1 - Explore the genetic diversity in flower abortion under different soil moisture and climatic conditions using a diverse set of landraces and elite genotypes
Texas Tech University:
The 50 genotypes were planted on June 5th under two distinct irrigation regimes. One field was irrigated at 80% evapotranspiration (ET) throughout the experiment, while the 40% ET regime was imposed only during the flowering phase. Flowering began on July 11th (Figure 1). Both imaging and manual flower counting were conducted every three days until the end of the flowering period. Each plot was identified with a QR code label. Pod imaging began on August 27th and continued weekly until all lines had reached the R6/R7 stage. Harvesting has been completed across all locations, and sample processing for agronomic data is complete.
University of Missouri:
Planted 50 genotypes on May 22nd and harvested on September 11, 2024.
University of Tennessee:
Planted 50 genotypes on May 30th and harvested on September 9, 2024.
Kansas State University:
Planted 50 genotypes on May 29th and harvested on August 31, 2024.

All locations (Figures 2 and 3) followed the same protocol developed by Texas Tech University for manual flower counting and imaging to ensure uniform and high-quality data collection.

Results:
The results from 11 diverse lines at Texas Tech University and the University of Missouri (Figure 4) showed that lines IA3023, PI556511, HS6-3976, and CL0J173-6-8 had the lowest flower abortion rates in Texas, while PI552538, PI556511, LG05-4464, and CL0J173-6-8 had the lowest rates in Missouri, indicating different genetic responses across environments. Interestingly, PI556511, LG05-4464 and CL0J173-6-8 were common to both locations in having lower flower abortion, indicating potential genetic sources for developing cultivars with wider adaptation and a reduced rate of abortion.
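For readers unfamiliar with the metric, the sketch below shows one common way to express a flower abortion rate and to rank lines by it. The report does not state the exact formula used, so the definition here (flowers that did not become retained pods, as a percentage of all flowers counted) is an assumption, and the counts and line names in the example are invented purely for illustration.

```python
from typing import Dict, List


def flower_abortion_percent(total_flowers: int, retained_pods: int) -> float:
    """Assumed definition: share of flowers that did not set a retained pod."""
    if total_flowers <= 0:
        raise ValueError("total_flowers must be positive")
    if not 0 <= retained_pods <= total_flowers:
        raise ValueError("retained_pods must be between 0 and total_flowers")
    return 100.0 * (total_flowers - retained_pods) / total_flowers


def lowest_abortion(rates: Dict[str, float], n: int) -> List[str]:
    """Names of the n lines with the lowest abortion percentage."""
    return sorted(rates, key=rates.get)[:n]


# Invented counts, not project data.
example_rates = {
    "line_A": flower_abortion_percent(200, 60),
    "line_B": flower_abortion_percent(180, 120),
    "line_C": flower_abortion_percent(220, 110),
}
print(lowest_abortion(example_rates, 2))
```

Applying the same ranking to each location separately and intersecting the resulting sets is one simple way to look for lines that perform consistently across environments.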

At Texas Tech, the lines PI535648, HS6-3976, LG05-4832, CL0J173-6-8 and LD02-9050 exhibited a lower flower abortion percentage than in Missouri and Kansas, likely because irrigation applied in Texas mitigated the stress of higher temperatures, helping to maintain flower and pod development without excessive abortion. The fluctuating weather in Missouri, including waterlogging, fungal infections, or high humidity, could have contributed to higher stress levels.

At Texas Tech University (Figure 5), the same lines were grown under drought conditions (40% ET). As expected, higher flower abortion rates were observed for most lines under drought, but lines PI534648, K17-6388, LG05-4832, PI533654, and LD02-9050 recorded lower levels of abortion under stress conditions. This could be explained by adaptive drought tolerance mechanisms, including enhanced root growth in response to moderate water stress (40% ET), enabling these lines to retain flowers as a survival strategy.

Additional data on days to maturity were collected across all locations (Figure 6), with values ranging from 98 to 114 days. Lodging scores varied from 0 to 2.5 but mostly fell between 0 and 1.5 (Figure 7), indicating that these lines are well-suited for phenotyping using the imaging platform and for breeding purposes. Additionally, no significant differences in yield (Figure 8) were observed among the lines within each location, except in Tennessee, where lines CL0J173-6-8, K17-6388, IA3023, LD02-9050, PI533654, and PI534648 demonstrated higher yields. Across locations, Kansas recorded higher yields for most lines, followed by Texas.

Plant height (Figure 8) was consistently greater for all lines grown in Missouri, while the same lines exhibited shorter heights in Texas. This suggests that plant height is not a determinant of grain production for these lines. Moreover, the high temperatures experienced in Texas (~100°F) did not significantly reduce yields, likely due to the mitigating effects of good soil structure, proper nutrition, irrigation, and effective crop management practices. Lastly, the 100-seed weight results highlight that Kansas, Texas, and Tennessee recorded higher seed weights for most lines.

Two greenhouse experiments were performed in 2024, in Tennessee and Texas. At the University of Tennessee, plants were assigned to one of two treatments: severe stress (SS) or well-watered (WW). Within each line, five plants were subjected to stress, while three were kept under well-watered conditions, serving as a reference for calculating the normalized transpiration rate (NTR). Daily flower counts commenced at the onset of flowering and continued until flowering ceased.
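The role of the well-watered plants as a reference can be illustrated with a simplified calculation. The report does not give the NTR formula, so this sketch assumes the basic form: a stressed plant's daily transpiration divided by the mean daily transpiration of the well-watered plants of the same line. Dry-down studies often apply a further normalization step so that each plant starts near 1.0; that step is omitted here. The function name and the example values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def ntr_simple(stressed_transpiration: float,
               well_watered_transpiration: Sequence[float]) -> float:
    """Simplified NTR for one stressed plant on one day.

    Assumption: NTR = stressed plant transpiration / mean transpiration of
    the well-watered reference plants of the same line on the same day.
    """
    reference = mean(well_watered_transpiration)
    if reference <= 0:
        raise ValueError("reference transpiration must be positive")
    return stressed_transpiration / reference


# Invented values in grams of water per day, three reference plants per line.
value = ntr_simple(90.0, [200.0, 180.0, 220.0])
print(round(value, 2))
```

A value near 1.0 means the stressed plant is transpiring like its well-watered counterparts; values falling toward 0 indicate increasing stress.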

The University of Tennessee collected data in 2024 from eight contrasting soybean lines exhibiting high and low flower abortion rates (Figure 9), selected based on the field data collected in Summer 2023. They conducted a greenhouse experiment with four lines known for high flower abortion (PI 567638, PI 603583, PI 567398, and PI 423926) and four lines characterized by low flower abortion (PI 506862, PI 80837, PI 437690, and PI 548318) to investigate the impact of severe drought conditions on flower dynamics (Figure 10).

In Figure 10, three high flower abortion lines—PI 567398, PI 567638, and PI 603583—showed pronounced sensitivity to severe water stress. In contrast, the low flower abortion line PI 548318 stood out, producing the highest number of flowers under the severe stress treatment. Interestingly, despite the severe stress conditions, some lines managed to produce around 100 flowers during the flowering phase, with most of these being low flower abortion lines. To build on these findings, a second trial will be conducted under moderate stress conditions to better assess flower abortion, as the severe stress caused unrealistically high levels of flower loss.

At Texas Tech University, a small greenhouse trial was conducted to study flower dynamics per node in two soybean lines. Significant differences were observed between the nodes (Figure 11). For K17-6388, 27% and 21% of the total flowers were located on the first and second nodes, respectively. In contrast, Williams 82 showed a more even distribution of flowers across nodes 1 to 4, with 13%, 18%, 10%, and 12%, respectively. Next year, this trial will be expanded to field conditions and tested on the high and low flower abortion lines selected from the 2023/2024 seasons. The study will examine flower dynamics per node under contrasting irrigation regimes (80% and 40% ET) and contrasting genotypes, aiming to better understand the per-node dynamics of soybean flowers as well as pods.

Objective 2 – Improve the image-based field phenotyping system and deep-learning tools to document temporal dynamics in flower abortion and pod retention in diverse soybeans grown under field conditions.
All locations acquired an RC car crawler (Figure 12), which operates at a slower speed (40 s per 0.3 meters) than the model used last year to ensure higher quality image capture. As in the previous year, GoPro Hero 11 cameras were mounted, with two to four cameras deployed to capture images of the entire height of the plants. The platform navigated effectively even when the rows were covered by leaves, and the compact size of the RC car caused minimal disturbance to the plants.

Training of the Machine learning model – flower tracking
In the initial stages of the project, the model was trained to detect flowers as a single class without distinguishing between different stages of the flowers. However, as the project is evolving, it is becoming clear that distinguishing between new flowers and old flowers can be more helpful (Figure 13) to avoid or reduce double counts.

Transitioning from a single-class to a two-class model allows for more nuanced analysis. By classifying flowers into two distinct categories, the model can provide more detailed information on the stage-by-stage flower counts, potentially enabling more accurate predictions of total and aborted flowers.
Challenges in Two-Class Prediction: While two-class prediction offers significant advantages, it also presents several challenges. Labeling data for two distinct classes requires precise definitions of these stages and consistent labeling across the dataset. This process is more complex than single-class labeling, as each flower must be accurately categorized. Training the model to differentiate between new and old flowers demands a larger and more diverse dataset. The model must recognize subtle differences in flower appearance, which can be influenced by factors like lighting, angle, and the plant's overall condition. The accuracy of the model's classification is highly dependent on the quality of the training data and the model's robustness. Misclassifications are possible, particularly when the visual cues distinguishing between the two classes are ambiguous.

Relabeling Progress: As part of the transition to two-class prediction, the original videos are being relabeled to reflect the new classification scheme. This relabeling is in progress, and each flower instance in the original dataset is being reviewed to determine whether it should be classified as new or old. The process combines manual labeling with automated tools. A quality control protocol has been established to maintain consistency and accuracy: labeled data are cross-checked with Dr. Juliana Espíndola (specialist in soybean flower morphology) to ensure that the labels are correct and consistent across the dataset.

Implications and Future Work: The shift to two-class prediction enables more detailed analysis and potentially leads to better counting estimates. However, the success of this approach depends on the accuracy of the model. As we continue to refine the model, ongoing evaluation and adjustment will be necessary. Future work may involve exploring further classification, such as distinguishing flower clusters when identifying the number of flowers per node is not feasible.
Flower tracking has been tested using several algorithms, including DeepSORT, ByteTrack, OC-SORT, and SORT. Among these, OC-SORT and SORT demonstrated the most promising results based on tests conducted with videos from all locations. Notably, the SORT algorithm achieved a Multi-Object Tracking Accuracy (MOTA) of 0.964 (Table 1), making it the chosen method for flower tracking in the 2024 videos. However, testing with the 2024 videos has not yet commenced because of the large volume of data being uploaded to our Amazon cloud (AWS) storage. Currently, all participating partners are in the process of uploading their videos.
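To make the reported score easier to interpret, the standard MOTA definition can be written out: MOTA = 1 − (false negatives + false positives + identity switches) / ground-truth objects, with each term summed over all frames. The sketch below implements that formula; the error counts in the example are invented for illustration and are not the values behind Table 1.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, ground_truth_objects: int) -> float:
    """Multi-Object Tracking Accuracy from counts summed over all frames."""
    if ground_truth_objects <= 0:
        raise ValueError("ground_truth_objects must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / ground_truth_objects


# Invented counts: 36 total errors against 1,000 ground-truth flower boxes.
score = mota(false_negatives=20, false_positives=12, id_switches=4,
             ground_truth_objects=1000)
print(round(score, 3))
```

Read this way, a MOTA of about 0.964 corresponds to roughly 3.6 tracking errors of any kind per 100 ground-truth objects. Because false positives count against the score, MOTA can be negative for a very poor tracker.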

Training of the Machine learning model – Pods
Pod Detection/Segmentation
Recognizing the intricate shape of soybean pods, we have opted for an instance segmentation method (Figure 14), as opposed to bounding box object detection, for the task of identifying and counting the pods in a frame. The instance segmentation approach enables us to obtain precise segmentation masks of the pods, ensuring a more accurate representation of their complex structures.

We selected images of plants around the R6/R7 stage, which we visually found to be best for counting the pods. Using images from the R6/R7 stage, we did a first round of annotation for approximately 50 images/frames. We marked as “pod” all visible fragments of a pod, without specifically keeping track of fragments that belong to the same occluded pod, when applicable. We trained a Mask R-CNN model on the annotated images and visually assessed its performance. The model was able to detect pods, but it identified individual fragments of occluded pods as distinct “pods”, given how the training images were annotated. As our goal is to count pods in a video, this could lead to an artificially inflated number of pods. To mitigate this issue, we subsequently designed an annotation scheme in which fragments of an occluded pod are annotated together as just one pod. We annotated approximately 300 images/frames with the new annotation scheme and trained another model.

The images in Figure 14 are examples of predictions made by the model trained using annotations performed with the new scheme that takes occlusions into account. These images are used for testing the model and have not been used for training the model.
The revised model can accurately detect many of the actual pods, including some occluded pods. Given the complexity of the task at hand and the relatively small number of images used to train the model, it is expected that the detection can be significantly improved with a larger number of images, especially in the case of clusters of pods and occluded pods (which are both less represented in the dataset compared to the more visible, less crowded pods).

Pod Detection and Tracking
To avoid double-counting pods that appear in multiple frames in a video, we have also worked on pod tracking informed by pod detection. Towards this goal, we have trained a base Faster R-CNN architecture and subsequently used the Faster R-CNN detection model to track the pods using tracking methods such as SORT, OC-SORT and ByteTrack. As the tracking model is highly dependent on the detector, we are also training a YOLOv8 model for pod detection, and the better of the Faster R-CNN and YOLOv8 models will ultimately be used in our detection/tracking system. To facilitate evaluation of the tracking models, we are annotating 4 videos using a tool called CVAT. Each video has a resolution of 1080x1920 at 15 frames per second. It is expected that the annotation of these videos will result in an adequate amount of data for proper evaluation of the tracking models.
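The idea of counting tracks rather than per-frame detections can be shown with a deliberately simplified tracker. This is not SORT, OC-SORT or ByteTrack: those methods add motion prediction and more careful assignment, and this sketch drops a track as soon as it goes unmatched for one frame. It only links boxes in consecutive frames by greedy intersection-over-union (IoU) matching. The function names, the 0.3 threshold, and the example boxes are all assumptions for illustration.

```python
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def count_unique_objects(frames: List[List[Box]],
                         iou_threshold: float = 0.3) -> int:
    """Count tracks instead of detections to avoid double counting."""
    active: List[Box] = []  # last box of every track seen in the previous frame
    total_tracks = 0
    for detections in frames:
        # Score every track/detection pair, then match from best IoU down.
        pairs = sorted(
            ((iou(t, d), ti, di)
             for ti, t in enumerate(active)
             for di, d in enumerate(detections)),
            reverse=True,
        )
        used_tracks: Set[int] = set()
        used_dets: Set[int] = set()
        for score, ti, di in pairs:
            if score < iou_threshold:
                break
            if ti in used_tracks or di in used_dets:
                continue
            used_tracks.add(ti)
            used_dets.add(di)
        # Every unmatched detection starts a new track.
        total_tracks += len(detections) - len(used_dets)
        # Matched tracks continue and new ones begin; unmatched tracks end.
        active = list(detections)
    return total_tracks


# One pod drifting across three frames, plus a second pod in the last frame.
frames = [
    [(0, 0, 10, 10)],
    [(1, 0, 11, 10)],
    [(2, 0, 12, 10), (50, 50, 60, 60)],
]
print(count_unique_objects(frames))
```

Summing detections per frame would report four pods for this clip, while linking boxes across frames reports two.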
All participating locations are finishing pod imaging at the R6/R7 stage and will upload the videos to the Amazon cloud (AWS). Once all locations have uploaded their flower and pod videos, processing will begin to extract flower and pod counts.

Preliminary Results: Two contrasting lines (Figure 15), one with high flower abortion (K17-6388) and one with low flower abortion (IA3023), were tested using machine learning models for flower and pod counts. The results revealed a significant difference between IA3023 and K17-6388, consistent with observations from manual counts. Additionally, the model found no significant differences between the two irrigation regimes applied in the Texas Tech experiment for these two lines, which aligned with manual count observations. For pod counting, the model similarly identified no differences between the irrigation regimes, matching the manual counts. This initial testing of the models for field-based counts demonstrates their potential for predicting flower abortion in the future. Both models will undergo further improvement, as outlined earlier, to enhance accuracy and precision. The models are being re-trained based on this first round of counting to improve accuracy before the videos of all 50 lines are processed. Next year, imaging both sides of the row is expected to increase the detection and counting of flowers and pods, addressing current limitations.

Objective 3 - Identify molecular mechanisms controlling flower abortion under diverse climatic conditions.
At Texas Tech University, to investigate genetic control and variation in flower abortion in soybean, we selected two contrasting accessions, PI567638 and PI506862, based on 2023 and 2024 field data (Figure 16). PI567638 (high abortion; HA) exhibited a high flower abortion rate of up to 70%, while PI506862 (low abortion; LA) showed a significantly lower rate of around 26%. Flower tissues at different developmental stages (buds, partially open flowers, fully open flowers, and post-anthesis flowers) were collected from both accessions under field conditions for RNA sequencing. Principal component analysis (PCA) of four replicates revealed a high degree of concordance between samples. Notably, the analysis showed distinct clusters, with flower buds and post-anthesis flowers grouping together, while partially open and fully open flowers formed a separate cluster, highlighting stage-specific transcriptomic profiles. The analysis identified 1,223 differentially expressed genes (DEGs) in buds, 1,220 DEGs in closed petals, 1,140 DEGs in open flowers, and 4,292 DEGs in dry flowers between the two genotypes. Genes associated with floral development (Figure 17) were predominantly upregulated in the low-abortion genotype, while genes negatively regulating floral development were highly expressed in the high-abortion genotype. Key genes regulating flower development and abortion include FLOWERING LOCUS C (FLC), MADS AFFECTING FLOWERING (MAF), TERMINAL FLOWER1 (TFL1), CDK-regulating FLOWERING LOCUS M, DAD1, AIPP3 (associated with flowering inhibition), ASP1, GI, AHL20, FLAVIN-BINDING, COL2, CONSTANS (CO), PRR5, BSMT1, AGAMOUS-Like 15 (AGL15), AGL20, and GA20OX2.

Furthermore, cluster analysis of DEGs identified five major clusters (Figure 18). Comparing LA_Buds with HA_Buds, we identified significant upregulation in cluster C3, in which genes associated with pectin catabolism were significantly upregulated. Pectin is an essential biomolecule that acts as a structural component (glue) for cell adhesion. However, further exploration and validation of these RNA-Seq findings are required to understand the process of flower abortion in soybean.

Key Findings

1. Genetic Diversity in Flower Abortion
• Over 50 soybean genotypes were studied under different irrigation regimes across four locations: Texas, Kansas, Missouri, and Tennessee.
• Genotypes such as PI556511, LG05-4464, and CL0J173-6-8 consistently showed lower flower abortion rates across multiple environments.
• Drought tolerance mechanisms, possibly through enhanced root growth, were observed in some lines, such as PI534648 and PI533654, enabling better flower retention under water stress.

2. Flowering Dynamics and Stress Response
• Greenhouse experiments revealed significant variation in flowering patterns between high and low flower abortion genotypes.
• Low flower abortion lines, such as PI506862, maintained higher flower counts even under severe drought stress.
• Node-specific flowering studies indicated that flower distribution patterns differ significantly between genotypes, with implications for breeding strategies.

3. Machine Learning for Field Phenotyping
• A machine learning model was further improved to track flowers and pods, transitioning from single-class to two-class predictions to distinguish between new and old flowers.
• The SORT algorithm achieved the highest tracking accuracy (MOTA 0.964), ensuring reliable flower and pod counts.
• Initial results demonstrated alignment between manual and machine learning-based flower counts, validating the model’s potential for large-scale applications.

4. Pod Detection and Segmentation
• Advanced segmentation techniques, such as Mask R-CNN, were applied to accurately detect soybean pods, even under occlusions.
• Pod tracking models are being refined to reduce errors and improve counting accuracy, with promising preliminary results.

5. Molecular Insights into Flower Abortion
• RNA sequencing of contrasting genotypes (PI567638 and PI506862) identified over 4,000 differentially expressed genes (DEGs).
• Genes associated with floral development, such as FLC and AGL15, were upregulated in low-abortion genotypes, while stress-related genes dominated in high-abortion genotypes.
• Cluster analysis revealed significant upregulation of genes involved in pectin catabolism, suggesting potential molecular pathways influencing flower abortion.

6. Environmental and Agronomic Observations
• Irrigation mitigated heat stress impacts in Texas, maintaining yields despite temperatures reaching 100°F.
• Genotypes with lower lodging scores and higher seed weights were identified as suitable candidates for breeding.
• Variations in plant height across locations highlighted the influence of environmental factors on growth without significantly affecting yields.
• Based on 2023 findings, flower abortion and yield did not always exhibit a direct correlation. Some genotypes with high flower abortion still maintained yield levels comparable to low-abortion genotypes, likely due to allocating more energy to flowering. Comparing 2023 and 2024 results will enhance our understanding of these patterns, aiding in the selection of genotypes with consistent low flower abortion and high yield potential for the 2025 trials.

Future Direction
The integration of genetic, phenotypic, and machine learning tools has provided significant insights into selecting soybean genotypes with low flower abortion. Future work will focus on:
• Enhancing machine learning models for improved flower and pod counts to extract abortion rates, improving platform video imaging, and creating a breeder-friendly program for soybean flower and pod counts.
• Validating candidate genes through gene editing to understand the mechanism of flower abortion in contrasting (high and low abortion) genotypes.
• Creating new biparental mapping populations to map the dynamics of flower abortion in soybean.

Benefit To Soybean Farmers

Retaining even a proportion of the 30% to 80% of flowers that are aborted under well-watered and stressful conditions, respectively, would allow for a 10 to 20% increase in yield for soybean producers in the US. This advantage can be extended to different soil and water availability conditions to support a wide range of soybean producers, and it is the major rationale for testing this hypothesis across four different soybean growing states with a focus on maturity groups III to IV. The advantage proposed through this collaboration will allow soybean producers to gain additional economic return at the same level of investment, i.e., with the same seed cost, fertilizer level and management. With a changing climate leading to higher temperatures and reduced water availability, the proportion of flower drop would increase, further lowering yield and producer profits. Hence, germplasm, breeding populations, novel QTL/genes and CRISPR-edited lines developed with increased flower retention would help enhance yield potential under current climates and retain the advantage even under future warmer and drier environments.

Under the United Soybean Research Retention policy, final reports are displayed with the project once it is completed, but working files will be purged after three years and financial information after seven years. All pertinent information is in the final report; for more information, please contact the project lead at your state soybean organization or the principal investigator listed on the project.