Information Services for Problem Solving Environments
An Information Service (IS) is a vital component of a distributed computing infrastructure, providing information on a wide range of community resources, including people, computational machines, queues, software, and services. Information about community resources transcends application boundaries and therefore requires application-independent definitions and access. An IS can also host user- and application-specific resource information, such as user preferences and application defaults. As shown in Figure 1, by providing access to this resource information, an IS enables infrastructure-level services such as job script generation and metascheduling, and user-level services such as compute resource browsing and job launching.
Directory Services accessed via the Lightweight Directory Access Protocol (LDAP) have emerged as the predominant technology for information service implementations. Several libraries (Netscape LDAP SDK, Java Naming and Directory Interface, perldap, etc.) are available for interfacing with LDAP directory servers from application software. However, using these libraries requires applications to interface at the level of the LDAP data model (entries and attributes). The client must therefore understand how LDAP organizes the data, which makes it costly to support other technology solutions. Supporting a variety of technologies is an important consideration for a CPSE, since it must be easily deployed in a variety of environments, utilizing existing infrastructure whenever possible to reduce installation and maintenance costs.
Our investigation into information services has focused on defining
a lightweight service layer that hides the details of the underlying technology
implementation and provides simplified object-level operations that fulfill
the Collaborative Problem Solving Environment (CPSE) requirements.
This allows a CPSE installation to be tailored to a particular site.
For instance, it can be deployed for small sites (a few users and compute
resources) that can be adequately served by a flat-file-based IS, or it
can be deployed for an entire organization and utilize existing services
(e.g., a standalone LDAP server or the MetaComputing Directory Service
Version 2 (MDS-2)). As a result, the same CPSE code base can seamlessly
support various deployment options. Figure 2 illustrates our planned
IS architecture. An HTTP-based service is used as the generic front-end
to the IS. The front-end service can connect to a variety of underlying
IS implementations (e.g., Directory Services, RDBMS, flat files) and communicates
using the native protocols and interfaces of these implementations (e.g.,
LDAP, JDBC, File I/O). Clients communicate with the IS front-end using
the Simple Object Access Protocol (SOAP), which can be carried over various
transport protocols, including HTTP. The combination of HTTP and SOAP
not only provides implementation independence from the underlying IS, but
also platform and programming language independence. These are important
concerns for deploying a general service in scientific computing environments
where heterogeneous computing systems exist and preferred programming languages
vary by scientific community.
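The layering described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the interface of the actual service: the class names (InformationServiceBackend, FlatFileBackend, InMemoryBackend, FrontEnd), the method signatures, and the JSON file layout are all assumptions made for the example. The in-memory class merely stands in for a directory or RDBMS back-end, which would speak LDAP or JDBC in a real deployment.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, Optional


class InformationServiceBackend(ABC):
    """Hypothetical back-end contract; objects are opaque XML strings."""

    @abstractmethod
    def save(self, key: str, obj_xml: str) -> None: ...

    @abstractmethod
    def retrieve(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def delete(self, key: str) -> bool: ...


class FlatFileBackend(InformationServiceBackend):
    """Small-site option: all objects kept in one JSON file (assumed layout)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def _load(self) -> Dict[str, str]:
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as fh:
            return json.load(fh)

    def _store(self, data: Dict[str, str]) -> None:
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2)

    def save(self, key: str, obj_xml: str) -> None:
        data = self._load()
        data[key] = obj_xml
        self._store(data)

    def retrieve(self, key: str) -> Optional[str]:
        return self._load().get(key)

    def delete(self, key: str) -> bool:
        data = self._load()
        existed = data.pop(key, None) is not None
        self._store(data)
        return existed


class InMemoryBackend(InformationServiceBackend):
    """Stand-in for a directory or RDBMS back-end in this sketch."""

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}

    def save(self, key: str, obj_xml: str) -> None:
        self._objects[key] = obj_xml

    def retrieve(self, key: str) -> Optional[str]:
        return self._objects.get(key)

    def delete(self, key: str) -> bool:
        return self._objects.pop(key, None) is not None


class FrontEnd:
    """Generic front-end: callers never see which back-end is configured."""

    def __init__(self, backend: InformationServiceBackend) -> None:
        self._backend = backend

    def save(self, key: str, obj_xml: str) -> None:
        self._backend.save(key, obj_xml)

    def retrieve(self, key: str) -> Optional[str]:
        return self._backend.retrieve(key)

    def delete(self, key: str) -> bool:
        return self._backend.delete(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for backend in (FlatFileBackend(os.path.join(tmp, "is.json")), InMemoryBackend()):
            service = FrontEnd(backend)
            service.save("machine/hpc1", "<machine name='hpc1'/>")
            print(type(backend).__name__, service.retrieve("machine/hpc1"))
```

Because the client code in the final loop is identical for both back-ends, switching a site from flat files to a directory server would, under these assumptions, be a configuration change rather than a code change.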
The SOAP interface that we have defined for the IS is object-based and supports operations to save, modify, delete, and retrieve objects. Objects are transmitted to and returned from the IS as XML document fragments, as defined by the Document Object Model (DOM). Since XML is used for transmitting objects, the interface provides data format independence from the application programming language and also from the underlying service implementation. Because objects are treated opaquely by the SOAP interface, new definitions and extensions to existing definitions can be created without requiring interface modifications. Schema definitions for our objects are created in XML using a schema definition language derived from the Directory Services Markup Language (DSML) effort. We tailored the DSML schema definition language to make it less directory-specific and to maintain compatibility, through simple translation, with the Grid Object Specification Version 1.0. Defining our schema in XML provides access to a wide variety of programming language toolkits that simplify the translation of the schema into service-specific schemas (e.g., Netscape Directory Server, OpenLDAP) and the automated generation of object-based software.
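To make the idea of an opaque object payload concrete, the following minimal sketch wraps an XML fragment in a SOAP 1.1 envelope and reads it back using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; everything else is an assumption for illustration, including the operation namespace (urn:example:information-service), the operation name saveObject, the key element, and the shape of the machine object. None of these are taken from the actual interface definition.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
IS_NS = "urn:example:information-service"  # hypothetical operation namespace

ET.register_namespace("soap", SOAP_ENV)
ET.register_namespace("is", IS_NS)


def build_request(operation, key=None, obj=None):
    """Wrap an operation call, and optionally an opaque object, in an envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{IS_NS}}}{operation}")
    if key is not None:
        ET.SubElement(call, f"{{{IS_NS}}}key").text = key
    if obj is not None:
        # The object is appended as-is; the interface never inspects it.
        call.append(obj)
    return ET.tostring(envelope, encoding="unicode")


def extract_payload(envelope_xml, operation):
    """Return the opaque object elements carried by an operation call."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body.find(f"{{{IS_NS}}}{operation}")
    return [child for child in call if child.tag != f"{{{IS_NS}}}key"]


if __name__ == "__main__":
    # Hypothetical object definition; any well-formed fragment would do.
    machine = ET.fromstring('<machine name="hpc1"><queue>batch</queue></machine>')
    request = build_request("saveObject", key="machine/hpc1", obj=machine)
    print(request)
    for element in extract_payload(request, "saveObject"):
        print(element.tag, element.attrib)
```

Note that build_request and extract_payload never refer to the tags inside the object, which is why, in this sketch, adding a new object type or a new attribute would not require any change to the envelope-handling code.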
This work has made significant progress in providing a simple abstraction for object-based access to implementation-independent information services. To further abstract the client application from the IS back-end, we are also investigating both schema and query translation services.
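As a purely hypothetical illustration of what a query translation service might do, the sketch below turns one implementation-neutral query (an object type plus attribute equality criteria) into an LDAP search filter and into a parameterized SQL statement. The query shape, the function names, and the mapping of object types to objectClass values and table names are assumptions made for this example and are not part of the work described here. The filter escaping follows the standard LDAP string representation of search filters.

```python
def escape_ldap_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    escaped = []
    for ch in value:
        if ch in "\\*()\0":
            escaped.append("\\%02x" % ord(ch))
        else:
            escaped.append(ch)
    return "".join(escaped)


def to_ldap_filter(object_type, criteria):
    """Translate a neutral equality query into an LDAP filter string.

    Assumes the object type maps directly to an objectClass value.
    """
    parts = ["(objectClass=%s)" % escape_ldap_value(object_type)]
    for attr in sorted(criteria):
        parts.append("(%s=%s)" % (attr, escape_ldap_value(str(criteria[attr]))))
    if len(parts) == 1:
        return parts[0]
    return "(&" + "".join(parts) + ")"


def to_sql(object_type, criteria):
    """Translate the same query into SQL text plus bound parameters.

    Assumes the object type and attribute names are trusted identifiers
    that map directly to a table and its columns; values are bound as
    parameters rather than interpolated.
    """
    columns = sorted(criteria)
    sql = "SELECT * FROM %s" % object_type
    if columns:
        sql += " WHERE " + " AND ".join("%s = ?" % c for c in columns)
    return sql, [criteria[c] for c in columns]


if __name__ == "__main__":
    query = {"os": "linux", "name": "hpc(1)*"}
    print(to_ldap_filter("machine", query))
    print(to_sql("machine", query))
```

With a translation layer of this kind, a client would express a query once in the neutral form and leave the choice of LDAP or SQL syntax to the front-end.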
Currently, Ecce uses
a flat-file-based IS, but an upcoming release will implement an interface
to the SOAP/HTTP front-end service, providing installations that can be
easily configured for different IS implementations.

