U.S. Department of Energy
Science Directorate

Advanced Computing, Mathematics and Data
Research Highlights

July 2009

New Workflow Approach Enables Development of Flexible Algorithms

Solution could greatly simplify the construction of workflow applications

Combining MeDICi with the Kepler workflow enables researchers to focus on scientific discovery.

Results: Scientists at Pacific Northwest National Laboratory have built a proof-of-concept workflow that lets researchers focus on algorithm development and scientific discovery. The prototype was successfully demonstrated for an atmospheric sciences application.

Why it matters: Scientists rely on workflow applications to conduct their research. These applications are complex, involving many steps and considerable heterogeneity in both the software used and the execution platforms. In addition, the data sets that must be processed are growing rapidly, so workflows must be modified regularly to scale to new data and processing challenges. The success of PNNL's workflow prototype means less time from concept to implementation and better opportunities for sharing research results.

Methods: The prototype combines PNNL's Middleware for Data Intensive Computing (MeDICi) with the Kepler workflow environment, developed by DOE's Scientific Discovery through Advanced Computing (SciDAC) Scientific Data Management Center.

The workflow was implemented for a complex application in DOE's Atmospheric Radiation Measurement (ARM) Program. It includes a framework that lets scientists visually create and modify an individual data-processing pipeline, and it supports combining scientific algorithms, tools, and libraries.

The application also shows how MeDICi's broad integration capabilities complement the Kepler workflow tools.  The result promotes a strong separation of concerns, simplifying the Kepler workflow description and promoting the creation of a reusable collection of components available for other workflow applications in this domain.
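
The separation of concerns described above can be illustrated with a minimal sketch. It is not the Kepler or MeDICi API: the component functions, the COMPONENTS registry, the PIPELINE description, and the run helper are all hypothetical names chosen for illustration. The idea is only that the workflow description lists which components to run and with what parameters, while the components themselves know nothing about the workflow and can be reused elsewhere.

```python
# Reusable components (hypothetical): each does one job and is unaware of the workflow.
def scale(values, factor=2.0):
    return [v * factor for v in values]

def clip(values, low=0.0, high=10.0):
    return [min(max(v, low), high) for v in values]

def average(values):
    return sum(values) / len(values)

# Registry that maps component names to implementations (an assumption of this sketch).
COMPONENTS = {"scale": scale, "clip": clip, "average": average}

# Workflow description: component names and parameters only, no processing logic.
PIPELINE = [
    ("scale", {"factor": 2.0}),
    ("clip", {"low": 0.0, "high": 10.0}),
    ("average", {}),
]

def run(pipeline, data):
    """Pass the data through each named component in order."""
    for name, params in pipeline:
        data = COMPONENTS[name](data, **params)
    return data

result = run(PIPELINE, [1.0, 4.0, 7.0])
```

Because the description is plain data, a different workflow can reorder or reuse the same components without changing their code.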

What's next: Researchers will further investigate workflow solutions built using Kepler and MeDICi. The ARM Program is rich with workflow applications, known as value-added products (VAPs), so a solution that significantly improves modifiability and reusability and provides a platform for scaling both processing and data size is attractive. Researchers also will benchmark the solutions and extend the designs to automate the execution of interrelated VAPs that run on distributed compute resources across North America.

Acknowledgment:  Kepler Open Source Community

Sponsors:  DOE SciDAC Scientific Data Management Center; PNNL Data Intensive Computing Initiative; DOE Atmospheric Radiation Measurement Program



What is a workflow?

Many scientific applications are structured as pipelines, or workflows, comprising a number of distinct computations. In general, workflow applications gather data sets from one or more data sources, transform the data into a format amenable to processing, analyze the data to produce useful results, and store the results in a repository where scientists can access them. Many of the processing steps, and the data sets they access, are distributed across different execution sites, so data must be moved across the network for processing by the next steps in the workflow.
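
The four stages named above (gather, transform, analyze, store) can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not the PNNL prototype: the SOURCES dictionary, its sample values, and all function names are hypothetical, and a plain in-memory list stands in for the repository and for data that would normally arrive over the network from different sites.

```python
# Hypothetical in-memory data sources; a real workflow would read from remote sites.
SOURCES = {
    "site_a": ["12.5", "13.1", "bad", "12.9"],
    "site_b": ["11.8", "12.2"],
}

def gather(sources):
    """Collect raw records from every data source."""
    return [record for records in sources.values() for record in records]

def transform(raw):
    """Convert raw strings to floats, dropping records that cannot be parsed."""
    values = []
    for item in raw:
        try:
            values.append(float(item))
        except ValueError:
            continue
    return values

def analyze(values):
    """Produce a small summary result from the cleaned data."""
    return {"count": len(values), "mean": sum(values) / len(values)}

def store(result, repository):
    """Append the result to a repository that can be queried later."""
    repository.append(result)
    return repository

def run_workflow(sources, repository):
    """Run the four stages in order, feeding each stage's output to the next."""
    return store(analyze(transform(gather(sources))), repository)

repository = []
run_workflow(SOURCES, repository)
```

Each stage takes only the previous stage's output, so any one of them could be replaced (for example, with a version that reads from a remote site) without touching the others.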
