Computational Sciences & Mathematics

Scientific Data Management Group

Advances in observational, experimental and computational technologies have led to an exponential increase in the volume, variety and complexity of scientific data. Although not every field faces a data deluge in absolute terms, most research groups face one relative to their capacity to handle it. This exceptional growth in data volume and complexity presents researchers with significant challenges, foremost among them how to manage, analyze and correlate the results of their research effectively.

The PNNL Scientific Data Management Group offers innovative solutions that enable effective research in these data-intensive environments by engaging in leading-edge research and development in data quality management, metadata, provenance, data analysis and integration, semantic technologies and knowledge management. The group focuses on integrated solution design that is tightly embedded in scientific research processes, and works closely with computational and experimental researchers in climate science, earth science, chemistry, systems biology, high energy physics and energy, as well as with the law enforcement and intelligence communities.

In addition to research and development, the group operates highly distributed data management system services, such as the Data Management Facility of the Atmospheric Radiation Measurement (ARM) Climate Research Facility, which has been in operation for the past 25 years. The PNNL Scientific Data Management Group currently has 30 staff members.

  • The Scientific Data Management Group is pleased to announce the upcoming publication of Data-Intensive Science, edited by Kerstin Kleese van Dam and Terence Critchlow. Bringing together leaders from multiple scientific disciplines, Data-Intensive Science shows how a comprehensive integration of various techniques and technological advances can effectively harness the vast amount of data being generated and significantly accelerate scientific progress to address some of the world's most challenging problems. The book will be published on May 31, 2013 by CRC Press. Save 20% on Data-Intensive Science by entering promo code 193CM at checkout at CRC Press.
  • Data-Intensive Science is the latest book in a series of publications from PNNL in this domain. Data-Intensive Computing: Architectures, Algorithms, and Applications, editors I. Gorton and D. Gracio, 2012: This reference for computing professionals and researchers describes the dimensions of the field, the key challenges, the state of the art, and the characteristics of the approaches that future data-intensive problems are likely to require. Chapters cover general principles and methods for designing such systems and for managing and analyzing today's big data sets that live in the cloud, and describe example applications in bioinformatics and cyber security that illustrate these principles in practice.
  • K. Kleese van Dam, D. Li, S. Miller, J. Cobb, M. Green, C. Ruby. Challenges in Data Intensive Analysis at Scientific Experimental User Facilities. In Handbook of Data Intensive Computing, editors Borko Furht and Armando Escalante, Springer, chapter 10, 2011.
