The Linked Environments for Atmospheric Discovery (LEAD)1 2 project is pioneering new approaches for integrating, modeling, and mining complex weather data and cyberinfrastructure systems to enable faster-than-real-time forecasts of mesoscale weather systems, including those that can produce tornadoes and other severe weather. Funded by the National Science Foundation Large Information Technology Research program, LEAD is a multidisciplinary effort involving nine institutions and more than 100 scientists, students, and technical staff.
Foundational to LEAD is the idea that today’s static environments for observing, predicting, and understanding mesoscale weather are fundamentally inconsistent with the manner in which such weather actually occurs – namely, with often unpredictable rapid onset and evolution, heterogeneity, and spatial and temporal intermittency. To address this inconsistency, LEAD is creating an integrated, scalable framework in which meteorological analysis tools, forecast models, and data repositories can operate as dynamically adaptive, on-demand, Grid-enabled systems. Unlike static environments, these dynamic systems can change configuration rapidly and automatically in response to weather, react to decision-driven inputs from users, initiate other processes automatically, and steer remote observing technologies to optimize data collection for the problem at hand. Although mesoscale meteorology is the particular domain to which these innovative concepts are being applied, the methodologies and infrastructures are extensible to other domains, including medicine, ecology, hydrology, geology, oceanography, and biology.
The LEAD cyberinfrastructure is based on a service-oriented architecture (SOA) in which service components can be dynamically connected and reconfigured. A Grid portal in the top tier of this SOA acts as a client to the services exposed in the LEAD system. A number of stable community applications, such as the Weather Research and Forecasting model (WRF) 3, are preinstalled on both the LEAD infrastructure and TeraGrid 4 computing resources. Shell-executable applications are wrapped into Web services by using the Generic Service Toolkit (GFac) 5. When one of these wrapped application services is invoked with a set of input parameters, the computation is initiated on the TeraGrid computing resources, and its execution is monitored through Grid computing middleware provided by the Globus Toolkit 6. As shown in Figure 1, scientists compose preregistered, GFac-wrapped application services into workflows expressed as dataflow graphs, in which the nodes represent computations and the edges represent data dependencies. GPEL 7, a workflow enactment engine based on the industry-standard Business Process Execution Language 8, sequences the execution of each computational task according to its control and data dependencies.
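To make the dataflow-graph idea concrete, the following minimal Python sketch orders the tasks of a small workflow so that each task runs only after the tasks it depends on. It is an illustration only and does not reproduce GPEL or any LEAD interface: the function `execution_order`, the task names, and the dependency structure are all hypothetical, and a plain topological sort (Kahn's algorithm) stands in for the far richer sequencing that a BPEL-based engine performs.

```python
from collections import deque


def execution_order(dependencies):
    """Return one valid execution order for a dataflow graph.

    `dependencies` maps each task name to the set of tasks whose
    output it consumes (the incoming edges of that node).
    """
    # Collect every task, including ones that appear only as a dependency.
    tasks = set(dependencies)
    for upstream in dependencies.values():
        tasks.update(upstream)

    # Number of unmet dependencies per task, and the reverse edges.
    remaining = {t: len(dependencies.get(t, ())) for t in tasks}
    consumers = {t: [] for t in tasks}
    for task, upstream in dependencies.items():
        for dep in upstream:
            consumers[dep].append(task)

    # Tasks with no unmet dependencies can start immediately.
    # Sorting only makes the result deterministic.
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for consumer in sorted(consumers[task]):
            remaining[consumer] -= 1
            if remaining[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(tasks):
        raise ValueError("workflow contains a cyclic dependency")
    return order


# Hypothetical forecast workflow; the names are illustrative, not LEAD services.
workflow = {
    "terrain_preprocess": set(),
    "observation_ingest": set(),
    "interpolate_inputs": {"terrain_preprocess", "observation_ingest"},
    "wrf_forecast": {"interpolate_inputs"},
    "visualize": {"wrf_forecast"},
}

order = execution_order(workflow)
print(order)
```

In this sketch the two input-preparation tasks have no dependencies and could run concurrently, whereas the forecast step must wait for the interpolated inputs. A real enactment engine would additionally handle service invocation, monitoring, and failure recovery, none of which are modeled here.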