The proposed project has five major components:
(1) Data collection via surveys. Data collection is a key component of the proposed project. Based on our previous experience working with farmer self-reported data, a large number of reported fields, well distributed across soybean production regions, is needed to account for the large variability in management, soil, and weather. Collaborators in each state, working with graduate and undergraduate students and technicians, will be responsible for collecting the field-level data. Requested information will include yield (captured by yield monitor), field location, and detailed information on crop, field, and input management, such as planting date, soybean variety, and tillage method. A primary focus, which is new relative to previous studies, will be on pest management decisions and the costs of major inputs (seed, fertilizer, foliar product plus application cost, etc.).
(2) Data collection via in-season scouting. Boots-on-the-ground field scouting will take place weekly in a subset of soybean fields (a minimum of 8-10 different farmers; a farm can contribute multiple fields if the farmer is willing) selected from diverse environmental regions (TEDs). Fields will be chosen specifically to capture wide variation in environment and expected pest pressure. Scouting will be repeated in each of the three years of the project in order to examine year-to-year variation in weather and pest presence. Individual field data and farmer contact information will be kept strictly confidential.
(3) Data assimilation. Collected data will be standardized into a single, consistent format, error-checked, and then entered into a digital database. We will also retrieve soil data for each individual field (using its GPS coordinates) from readily available sources (e.g., USDA-NRCS SSURGO database). Daily weather data and satellite imagery will also be collected. For each field-year data point supplied by an individual farmer, we will therefore have a detailed description of weather, soil, management, and crop health progress (via in-season reflectance data and scouting), which, taken together, will help us identify the management factors that affect yield. Ultimately, by accounting for input costs, analysis of the data will allow us to identify opportunities for improving profitability regionally and, more importantly, at and across the field level.
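The standardization and error-checking step described above could look roughly like the following minimal Python sketch. The field names, the date format, and the plausibility limits are hypothetical and chosen only for illustration; the actual survey format and checks may differ.

```python
from datetime import datetime

# Approximate conversion for soybean: 1 bu = 60 lb, so 1 bu/ac is about 67.25 kg/ha.
BU_AC_TO_KG_HA = 67.25
# Hypothetical upper plausibility limit for a field-average yield (kg/ha).
MAX_PLAUSIBLE_YIELD_KG_HA = 10000


def standardize_record(raw):
    """Convert one raw survey record (a dict of strings) to a common format."""
    yield_value = float(raw["yield"])
    # Assumption: yields are reported in bu/ac unless stated otherwise.
    if raw.get("yield_unit", "bu/ac") == "bu/ac":
        yield_value *= BU_AC_TO_KG_HA
    return {
        "field_id": raw["field_id"].strip(),
        "year": int(raw["year"]),
        "lat": float(raw["lat"]),
        "lon": float(raw["lon"]),
        "planting_date": datetime.strptime(raw["planting_date"], "%m/%d/%Y").date(),
        "yield_kg_ha": round(yield_value, 1),
    }


def check_record(rec):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    if not (-90 <= rec["lat"] <= 90 and -180 <= rec["lon"] <= 180):
        errors.append("coordinates out of range")
    if not (0 < rec["yield_kg_ha"] <= MAX_PLAUSIBLE_YIELD_KG_HA):
        errors.append("implausible yield")
    if rec["planting_date"].year != rec["year"]:
        errors.append("planting date not in reported year")
    return errors
```

Records that return a non-empty error list would be flagged for follow-up with the collaborator who collected them rather than silently dropped.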
(4) Data analysis. We will use well-established methods to analyze the data and new techniques for information extraction. Our approach will build on robust, already developed protocols that have resulted in a new online decision tool (see screenshot of the user-friendly interface below) developed by Agstat (https://www.agstat.com/cropping-system-optimizer/), a private company that we will collaborate with in the proposed work. However, according to the developer, the tool uses algorithms trained on “synthetic and simulated datasets,” which may not align with what farmers experience in their fields. At the moment, it does not consider in-season crop health information, nor does it provide pre- or post-plant in-season pest pressure risk alerts. Therefore, a similar tool will be developed using farmers’ data that will provide in-season recommendations, which we believe can increase accuracy and improve profit optimization compared with tools and products that currently exist. Because the collected data will not have the attributes of traditional replicated field trials, in which potential confounding factors can be controlled through replication and blocking, a combination of novel statistical approaches (e.g., multilevel models, repeated measures for spatial and temporal correlation) and machine learning algorithms (e.g., convolutional and recursive neural networks) will be used to analyze the data. We have worked with these methods to explore complex data sources, in particular to examine interactions among variables that cannot be easily explored with traditional statistical methods (Mourtzinis et al., 2021-in review; Shah et al., 2021-in preparation). The methods we propose to use are also being applied by industry as part of its digital agriculture programs, but our approach and analysis will differ because they will be based on the aggregated farmer database.
The product of this work will be a tool (a combination of sequential models and algorithms) with the potential to generate field-specific recommendations and early-season pest risk alerts using farmers’ data. The outcomes will provide information on optimal cropping systems and management practices that can increase farm profitability at the field level.
(5) Communication and dissemination of results. Because management practices and costs vary widely among farms, results will be communicated to farmers following a two-stage approach. In the first stage (at the end of the second year), we will perform a large-scale profit optimization across all fields and years in the database. The objective will be to identify the optimum management practices for increased profitability and to report the estimated profit difference across the NC US. In the second stage (during the last growing season), a subset of farms will be selected to demonstrate the potential of the developed tool to increase farm profitability compared with what farmers typically use. We will work with our collaborators in every state to identify the demonstration farms. All results from this project will be disseminated to producers and the public via peer-reviewed scientific and extension publications, and via presentations at scientific conferences and extension events sponsored by universities, natural resources districts, grower associations, and proprietary organizations that market their products to soybean producers.
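As a simplified illustration of the first-stage profit comparison, the Python sketch below computes a partial profit for each field-year (gross revenue minus reported input costs) and averages it by management practice. The record layout, the practice key, and the soybean price are hypothetical. Raw group means like these do not account for confounding among soil, weather, and management, so they are only a starting point and not a substitute for the model-based analysis described under component (4).

```python
from collections import defaultdict
from statistics import mean


def partial_profit(record, price_per_bu):
    """Gross revenue minus the sum of reported input costs, in $/ac."""
    return record["yield_bu_ac"] * price_per_bu - sum(record["costs"].values())


def mean_profit_by_practice(records, practice_key, price_per_bu):
    """Average partial profit ($/ac) for each level of one management practice."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[practice_key]].append(partial_profit(rec, price_per_bu))
    return {practice: mean(values) for practice, values in groups.items()}


# Hypothetical field-year records, for illustration only.
records = [
    {"tillage": "no-till", "yield_bu_ac": 60, "costs": {"seed": 60, "fertilizer": 40}},
    {"tillage": "no-till", "yield_bu_ac": 64,
     "costs": {"seed": 60, "fertilizer": 40, "foliar": 20}},
    {"tillage": "conventional", "yield_bu_ac": 62, "costs": {"seed": 70, "fertilizer": 50}},
]
summary = mean_profit_by_practice(records, "tillage", price_per_bu=12)
difference = summary["no-till"] - summary["conventional"]
```

The same grouping could be repeated for any recorded practice (for example, planting date class or foliar product use) to report an estimated profit difference per practice.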