
Enhancing Linkages Between Projects and Datasets: Examples from LBA-ECO for NACP

Lisa Wilcox, Science Systems & Applications, Inc. and the Carbon Cycle and Ecosystems Office, NASA Goddard Space Flight Center, lwilcox@pop900.gsfc.nasa.gov (Presenting)
Amy L. Morrell, Science Systems & Applications, Inc. and the Carbon Cycle and Ecosystems Office, NASA Goddard Space Flight Center, amorrell@pop900.gsfc.nasa.gov (Presenting)
Peter C. Griffith, Science Systems & Applications, Inc. and the Carbon Cycle and Ecosystems Office, NASA Goddard Space Flight Center, peter.griffith@gsfc.nasa.gov

The Carbon Cycle and Ecosystems Office is developing ideas to build a warehouse of metadata resulting from the North American Carbon Program (NACP). The resulting warehouse and applications would be very similar to those developed for the LBA-ECO Project (www.lbaeco.org). The harvested metadata would be used to create dynamically generated reports, available at www.nacarbon.org, which would facilitate access to NACP datasets. Our primary goal is to associate harvested metadata with its corresponding project group profile wherever possible. This also addresses high-priority goal #4 of the NACP Data System Task Force to "link the dataset metadata index with the project metadata index generated and maintained by the NACP Office"1. Associating each dataset with its corresponding NACP project group profile will maximize data discovery and provide a greater understanding of the scientific and social context of each dataset. This will be challenging because the datasets exist in many different formats, reside in many thematic data centers, and are also distributed among hundreds of investigators. Among other things, this situation creates a lack of consistency in how associated metadata is composed, which limits our ability to fully automate metadata harvesting and the dynamic generation of a wide variety of associated reports.

Our presentation will give a brief technical overview of how we plan to harvest the metadata. We currently harvest only metadata that is in an XML format. However, not all NACP datasets have corresponding metadata in XML. We will therefore need to expand our current capabilities by creating harvest and ingest scripts that can extract metadata in other formats.

We will also demonstrate what we can do for NACP by looking at what we have already done for LBA-ECO. For example, the LBA-ECO website (www.lbaeco.org) provides a profile (e.g., participants, abstract(s), study sites, and publications) for each LBA-ECO investigation. These profiles are very similar to the NACP project profiles. Linked from each profile is a list of associated registered dataset titles, each of which links to a dataset profile that describes the metadata in a user-friendly way. Moreover, each dataset profile contains hyperlinks to each associated data file at its home data repository and to publications that have used the dataset. We also use the harvested metadata from the LBA Project in administrative applications to assist quality assurance efforts. These include processes that check for broken hyperlinks to data files, automated emails that inform our administrators when critical metadata fields are updated, dynamically generated reports of metadata records that link to datasets with questionable file formats, and dynamically generated region/site coordinate quality assurance reports. These applications are as important as those that facilitate access to information because they help ensure a high standard of quality for the information. Where possible, we hope to create similar reports for NACP.
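As a rough illustration of the harvest-and-link workflow described above, the following Python sketch parses one XML metadata record, runs a simple site-coordinate check, and groups dataset titles under their project profiles. It is only a sketch under stated assumptions: the record layout (`dataset`, `title`, `project_id`, `data_url`, `site`), the sample values, and all function names are hypothetical and are not the schema or code used for LBA-ECO or NACP.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout, invented for illustration only;
# real LBA-ECO/NACP metadata records will look different.
SAMPLE_RECORD = """
<dataset>
  <title>Example flux tower dataset</title>
  <project_id>NACP-001</project_id>
  <data_url>http://example.org/data/flux.csv</data_url>
  <site lat="45.2" lon="-68.7"/>
</dataset>
"""


def _to_float(value):
    """Convert an attribute string to float, passing None through."""
    return float(value) if value is not None else None


def harvest_record(xml_text):
    """Extract the fields needed for a dataset profile from one XML record."""
    root = ET.fromstring(xml_text)
    site = root.find("site")
    return {
        "title": root.findtext("title", default="").strip(),
        "project_id": root.findtext("project_id", default="").strip(),
        "data_url": root.findtext("data_url", default="").strip(),
        "lat": _to_float(site.get("lat")) if site is not None else None,
        "lon": _to_float(site.get("lon")) if site is not None else None,
    }


def coordinate_problems(record):
    """Return a list of coordinate issues for a quality assurance report."""
    lat, lon = record["lat"], record["lon"]
    if lat is None or lon is None:
        return ["missing coordinates"]
    problems = []
    if not -90 <= lat <= 90:
        problems.append("latitude out of range")
    if not -180 <= lon <= 180:
        problems.append("longitude out of range")
    return problems


def link_to_profiles(records, profiles):
    """Group dataset titles by project id; collect titles with no known profile."""
    linked = {project_id: [] for project_id in profiles}
    unmatched = []
    for record in records:
        if record["project_id"] in linked:
            linked[record["project_id"]].append(record["title"])
        else:
            unmatched.append(record["title"])
    return linked, unmatched
```

In a sketch like this, the unmatched list would feed an administrative report, so that datasets without a project profile are surfaced for review rather than silently dropped.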


1 Prioritized list of recommendations to CCIWG regarding NACP Data Central, NACP Data System Task Force, July 24, 2006.

Presentation Type:  Poster

Abstract ID: 67
