Data

Introduction

Earth system data are the foundation for learning how the world works. Because multiple systems interact, we need flexible ways to connect processes across them. At this stage of our scientific endeavors, we are acquiring more data than we can use. As new data sets are posted online in real time, we must manage the data stream so that it can feed our analysis tools and test our climate models. For each project, we need to understand the data itself so that we can develop and manage the data flow through the analysis process, whether the goal is testing individual models or providing observations for risk assessment. The term "big data" now refers to massive data sets that are reaching the exabyte scale of storage. Navigating and using these exabyte-scale data sets will require computing systems that can handle the data flow for individual projects. Working in the Earth system area, our goal is to work with those data sets and deliver the data that our collaborative projects require.

Data

Example data sets include:

  • Earth system simulations
  • Satellite data
  • In-situ data
  • Experimental platform data
  • Flux tower data sets
  • Metadata (a description of the data, its provenance, and its format; data should always be stored in self-describing formats)
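
As a rough illustration of the metadata item above, the sketch below checks that a record carries a description, provenance, and format before it travels with the data. The field names, the helper missing_fields, and the example values are hypothetical and are not part of any center standard; netCDF appears only as one well-known self-describing format.

```python
import json

# Hypothetical required fields, taken from the metadata item in the list above.
REQUIRED_FIELDS = ("description", "provenance", "format")


def missing_fields(record):
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative record; every value here is a placeholder, not real metadata.
example_record = {
    "description": "Placeholder description of a flux tower data set",
    "provenance": "Placeholder: who produced the data and how",
    "format": "netCDF",
}

# Serialize the record so it can be stored alongside the data files,
# then read it back and confirm nothing required is missing.
serialized = json.dumps(example_record, indent=2, sort_keys=True)
restored = json.loads(serialized)
print(missing_fields(restored))
```

A check like this could run before a data set is archived, so that incomplete records are caught early rather than after distribution.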

Our goals are to provide guidance on how to use these exabyte-scale data sets efficiently and to avoid time wasted in trying to access, download, and use the data for specific projects. In connecting with Earth system modeling and Earth system components, research teams will develop new data sets that require data archiving and data distribution tools. We adhere to open-source science, and providing efficient access to Earth system data is a major component of our work. As research teams develop new data sets, our center will help guide and develop these open-access data sets, which will be posted and managed on data websites (Google data, AWS data, DOE data, and NASA data are some examples).

Contact information

Chris Forest (ceforest@psu.edu)