Collaborative Problem Solving at Pacific Northwest National Laboratory

Information Services for Problem Solving Environments


An Information Service (IS) is a vital component of a distributed computing infrastructure, providing information on a wide range of community resources, including people, computational machines, queues, software, and services.  Information about community resources transcends application boundaries and therefore requires application-independent definitions and access.  An IS can also host user- and application-specific resource information, such as user preferences and application defaults.  As shown in Figure 1, by providing access to this resource information, an IS enables infrastructure-level services such as job script generation and metascheduling, and user-level services such as compute resource browsing and job launching.

Directory Services accessed via the Lightweight Directory Access Protocol (LDAP) have emerged as the predominant technology for information service implementations.  Several libraries (Netscape LDAP SDK, Java Naming and Directory Interface, perldap, etc.) are available for interfacing with LDAP directory servers from application software.  However, these libraries require applications to work at the level of the LDAP data model (entries and attributes).  The client must therefore understand how LDAP organizes the data, which makes it costly to support other technology solutions.  Supporting a variety of technologies is an important consideration for a Collaborative Problem Solving Environment (CPSE), since it must be easily deployed in a variety of environments, utilizing existing infrastructure whenever possible to reduce installation and maintenance costs.
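To make the coupling concrete, the following is a minimal Python sketch of what working at the level of entries and attributes involves.  It does not use a real LDAP library: an entry is modeled as a plain dictionary of multi-valued attributes, and the attribute names (`cn`, `cpuCount`, `queueName`) and the `Machine` class are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# An LDAP entry modeled as attribute name -> list of string values.
# LDAP attributes are multi-valued, so even a single value arrives in a list.
LdapAttributes = Dict[str, List[str]]


@dataclass
class Machine:
    """Hypothetical application-level object describing a compute resource."""
    name: str
    cpu_count: int
    queues: List[str]


def machine_from_ldap(attrs: LdapAttributes) -> Machine:
    """Translate a directory entry into an application object.

    Every client that talks to the directory directly needs code like this,
    and it has to know the attribute names and value encodings (assumed here).
    """
    return Machine(
        name=attrs["cn"][0],
        cpu_count=int(attrs["cpuCount"][0]),
        queues=list(attrs.get("queueName", [])),
    )


# Example entry as a client library might return it (values are made up).
example_entry: LdapAttributes = {
    "cn": ["example-cluster"],
    "cpuCount": ["64"],
    "queueName": ["batch", "debug"],
}
example_machine = machine_from_ldap(example_entry)
```

If the directory were replaced by a relational database or a flat file, translation code of this kind would have to be rewritten in every client, which is the cost the paragraph above describes.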

Our investigation into information services has focused on defining a lightweight service layer that hides the details of the underlying technology implementation and provides simplified object-level operations that fulfill the Collaborative Problem Solving Environment (CPSE) requirements.  This allows a CPSE installation to be tailored to a particular site.  For instance, a small site (a few users and compute resources) can be adequately served by a flat-file-based IS, while an organization-wide deployment can utilize existing services (e.g., a standalone LDAP server or the MetaComputing Directory Service Version 2 (MDS-2)).  As a result, the same CPSE code base can seamlessly support all of these deployment options.

Figure 2 illustrates our planned IS architecture.  An HTTP-based service is used as the generic front-end to the IS.  The front-end service can connect to a variety of underlying IS implementations (e.g., Directory Services, RDBMS, flat files) and communicates with them using their native protocols and interfaces (e.g., LDAP, JDBC, file I/O).  Clients communicate with the IS front-end using the Simple Object Access Protocol (SOAP), which is supported over various transport protocols, including HTTP.  The combination of HTTP and SOAP provides not only implementation independence from the underlying IS, but also platform and programming language independence.  These are important concerns when deploying a general service in scientific computing environments, where computing systems are heterogeneous and preferred programming languages vary by scientific community.
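The layering described above can be sketched in Python as follows.  This is an illustration under stated assumptions, not the CPSE implementation: the class names, the dictionary representation of objects, and the JSON flat-file format are all hypothetical, and the HTTP/SOAP transport is omitted so that only the front-end/back-end separation is shown.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict


class InformationServiceBackend(ABC):
    """Hypothetical interface that each underlying IS implementation provides."""

    @abstractmethod
    def save(self, object_id: str, obj: dict) -> None:
        """Create or overwrite the object stored under object_id."""

    @abstractmethod
    def retrieve(self, object_id: str) -> dict:
        """Return the stored object, raising KeyError if it is absent."""

    @abstractmethod
    def delete(self, object_id: str) -> None:
        """Remove the object, raising KeyError if it is absent."""


class InMemoryBackend(InformationServiceBackend):
    """Stand-in for a server-based back-end such as a directory service."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}

    def save(self, object_id: str, obj: dict) -> None:
        self._objects[object_id] = dict(obj)

    def retrieve(self, object_id: str) -> dict:
        return dict(self._objects[object_id])

    def delete(self, object_id: str) -> None:
        del self._objects[object_id]


class FlatFileBackend(InformationServiceBackend):
    """Keeps all objects in a single JSON file (format assumed for this sketch)."""

    def __init__(self, path: str) -> None:
        self._path = path

    def _read_all(self) -> Dict[str, dict]:
        if not os.path.exists(self._path):
            return {}
        with open(self._path, "r", encoding="utf-8") as handle:
            return json.load(handle)

    def _write_all(self, objects: Dict[str, dict]) -> None:
        with open(self._path, "w", encoding="utf-8") as handle:
            json.dump(objects, handle)

    def save(self, object_id: str, obj: dict) -> None:
        objects = self._read_all()
        objects[object_id] = obj
        self._write_all(objects)

    def retrieve(self, object_id: str) -> dict:
        return self._read_all()[object_id]

    def delete(self, object_id: str) -> None:
        objects = self._read_all()
        del objects[object_id]
        self._write_all(objects)


class InformationService:
    """Front-end exposing object-level operations, independent of the back-end."""

    def __init__(self, backend: InformationServiceBackend) -> None:
        self._backend = backend

    def save(self, object_id: str, obj: dict) -> None:
        self._backend.save(object_id, obj)

    def retrieve(self, object_id: str) -> dict:
        return self._backend.retrieve(object_id)

    def delete(self, object_id: str) -> None:
        self._backend.delete(object_id)

    def modify(self, object_id: str, changes: dict) -> None:
        # Modify is expressed as retrieve, update, save in this sketch.
        obj = self._backend.retrieve(object_id)
        obj.update(changes)
        self._backend.save(object_id, obj)


def demo(service: InformationService) -> dict:
    """Run the same client code against whichever back-end is configured."""
    service.save("machine/example", {"cpus": 64})
    service.modify("machine/example", {"queue": "batch"})
    result = service.retrieve("machine/example")
    service.delete("machine/example")
    return result
```

The client code in `demo` is identical whether it is handed `InformationService(InMemoryBackend())` or `InformationService(FlatFileBackend(path))`; only the configuration of the front-end changes from site to site.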

The SOAP interface that we have defined for the IS is object-based and supports operations to save, modify, delete, and retrieve objects.  Objects are transmitted to and returned from the IS as XML document fragments as defined by the Document Object Model.  Since XML is used for transmitting objects, the interface provides data format independence from the application programming language and also from the underlying service implementation.  Because objects are treated opaquely by the SOAP interface, new definitions and extensions to existing definitions can be created without requiring interface modifications.  Schema definitions for our objects are created in XML using a schema definition language derived from the Directory Services Markup Language (DSML) effort.  We tailored the DSML schema definition language to make it less directory-specific and to maintain compatibility, through simple translation, with the Grid Object Specification Version 1.0.  Defining our schema in XML provides access to a wide variety of programming language toolkits that simplify the translation of the schema into service-specific schemas (e.g., for Netscape Directory Server or OpenLDAP) and the automated generation of object-based software.
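As an illustration of how an object can travel opaquely inside a SOAP message, here is a minimal Python sketch.  The envelope namespace is the standard SOAP 1.1 one; the operation namespace `urn:example:information-service`, the operation name `save`, and the `machine` fragment are hypothetical and are not taken from the interface described above.  Python's `xml.etree.ElementTree` is used as a convenient stand-in for a DOM implementation.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
IS_NS = "urn:example:information-service"  # hypothetical operation namespace


def build_request(operation: str, fragment: ET.Element) -> bytes:
    """Wrap an XML object fragment in a SOAP envelope for the named operation.

    The fragment is appended unchanged, so the wrapper needs no knowledge
    of the object's schema.
    """
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{IS_NS}}}{operation}")
    op.append(fragment)
    return ET.tostring(envelope, encoding="utf-8")


def extract_fragment(message: bytes) -> ET.Element:
    """Return the object fragment carried by a message built by build_request."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    operation = body[0]   # first child of Body is the operation element
    return operation[0]   # first child of the operation is the object


# A made-up object; adding new child elements would not change the wrapper code.
machine = ET.Element("machine", {"name": "example-cluster"})
ET.SubElement(machine, "cpuCount").text = "64"

request = build_request("save", machine)
received = extract_fragment(request)
```

Because `build_request` and `extract_fragment` never inspect the fragment, a new object type or an extended definition passes through without any change to them, which mirrors the point made above about opaque objects.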

This work has made significant progress in providing a simple abstraction for object-based access to implementation-independent information services.  To further abstract the client application from the IS back-end, we are also investigating both schema and query translation services.

Currently, Ecce uses a flat-file-based IS, but an upcoming release will implement an interface to the SOAP/HTTP front-end service, so that installations can be easily configured for different IS implementations.
 

Pacific Northwest National Laboratory is operated by Battelle for the U.S. Department of Energy.

For information about Collaborative Problem Solving Environments at PNNL, please contact Deborah Gracio at (509) 375-6362 or debbie.gracio@pnl.gov.

Reviewed: August 18, 2000