
22.03.2002 M. Lautenschlager (M&D/MPIM) 1

The CERA Database

Michael Lautenschlager
Modelle und Daten

Max-Planck-Institut für Meteorologie

Workshop "Definition des Community-Klimadatenarchivs bei M&D und DKRZ"

26.+27.03.2002 in Hamburg


Local Access Problems

• Missing data catalogue
• Directory structure of the Unix file system is not sufficient to organise millions of files
• Data are not stored application-oriented
  – Raw data contain time series of 4D data blocks
  – Access is upon time series of 2D fields
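The mismatch between how the raw data are stored and how they are accessed can be illustrated with a small sketch. The block layout below (one block per time step, nested as variable, level, latitude, longitude) and all names are assumptions chosen for illustration; the slides do not describe the actual raw-data format.

```python
from typing import Dict, List

# Hypothetical layout: one raw block per time step, nested as
# block[variable][level][lat][lon]. Not the real raw-data format.
Field2D = List[List[float]]
Block = Dict[str, List[Field2D]]


def extract_field_series(blocks: List[Block], variable: str, level: int) -> List[Field2D]:
    """Collect the 2D field of one variable and level from every time step."""
    return [block[variable][level] for block in blocks]


def make_block(step: int) -> Block:
    """Build a tiny synthetic block: 2 variables, 2 levels, 2x3 grid."""
    return {
        name: [
            [[float(step * 100 + lev * 10 + offset)] * 3 for _ in range(2)]
            for lev in range(2)
        ]
        for offset, name in enumerate(("temperature", "pressure"))
    }


raw_blocks = [make_block(step) for step in range(4)]

# Every block has to be touched to obtain one variable's time series.
series = extract_field_series(raw_blocks, "temperature", level=1)
print(len(series), "time steps,", len(series[0]), "x", len(series[0][0]), "grid")
```

In this sketch, retrieving a single variable means reading every block in full, which is the access pattern the slide describes as not application-oriented.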

Year                2001     2002      2003      2004      2005
Moderate Increase   210 TB   650 TB    1620 TB   2670 TB   3720 TB
Linear Increase     210 TB   1270 TB   4260 TB   7580 TB   10910 TB
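The year-to-year additions implied by the two rows can be derived with a short calculation. The sketch below only subtracts consecutive values taken from the table; the variable and function names are illustrative.

```python
years = [2001, 2002, 2003, 2004, 2005]

# Archive sizes in TB, copied from the table above.
archive_tb = {
    "Moderate Increase": [210, 650, 1620, 2670, 3720],
    "Linear Increase": [210, 1270, 4260, 7580, 10910],
}


def yearly_increments(sizes):
    """Return the difference between each year and the one before it."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


increments = {name: yearly_increments(sizes) for name, sizes in archive_tb.items()}

for name, steps in increments.items():
    for year, step in zip(years[1:], steps):
        print(f"{name}: +{step} TB in {year}")
```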


CERA Concept: Semantic Data Management

(I) Data catalogue and pointer to Unix files
  • Enable search and identification of data
  • Allow for data access as they are

(II) Application-oriented data storage
  – Time series of individual variables are stored as BLOB entries in DB tables
    • Allow for fast and selective data access
  – Storage in standard file format (GRIB)
    • Allow for application of standard data processing routines (PINGOs)
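The two storage modes can be sketched as a small relational schema. This is a minimal illustration, not the CERA schema: all table and column names are hypothetical, the one-row-per-time-step layout is an assumption, and SQLite is used only because it ships with Python (the slides do not say which database system CERA runs on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- (I) Catalogue entry plus pointers to raw files kept as they are.
    CREATE TABLE experiment (
        experiment_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    );
    CREATE TABLE file_pointer (
        experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
        unix_path     TEXT NOT NULL
    );
    -- (II) Assumed layout: one row per time step of a single variable,
    -- with the field itself stored as a BLOB.
    CREATE TABLE blob_data (
        dataset_name TEXT NOT NULL,
        time_step    INTEGER NOT NULL,
        payload      BLOB NOT NULL,
        PRIMARY KEY (dataset_name, time_step)
    );
""")

conn.execute("INSERT INTO experiment VALUES (1, 'demo_experiment')")
conn.execute("INSERT INTO file_pointer VALUES (1, '/archive/demo/raw_0001.dat')")
for step in range(5):
    # Placeholder bytes stand in for one GRIB-encoded 2D field.
    conn.execute(
        "INSERT INTO blob_data VALUES (?, ?, ?)",
        ("demo_temperature", step, bytes([step]) * 8),
    )


def read_steps(connection, dataset_name, first, last):
    """Fetch only the requested time steps of one variable."""
    rows = connection.execute(
        "SELECT time_step, payload FROM blob_data "
        "WHERE dataset_name = ? AND time_step BETWEEN ? AND ? "
        "ORDER BY time_step",
        (dataset_name, first, last),
    )
    return rows.fetchall()


selected = read_steps(conn, "demo_temperature", 1, 3)
print([step for step, _ in selected])
```

Because each BLOB in this sketch holds one field of one variable, a query can return a selected part of a time series without reading unrelated data, and the payload can be passed unchanged to routines that expect the standard file format.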


[Figure: system overview. Labels: CERA Database System; DKRZ Mass Storage Archive; Web-Based User Interface (Catalogue Inspection, Climate Data Retrieval); Internet Access.]

CERA Database: 7.1 TB (12.2001)
• Data Catalogue
• Processed Climate Data
• Pointer to Raw Data files

Mass Storage Archive: 200 TB neglecting security copies (12.2001)

Current database size is 7.4365 Terabyte
Number of experiments: 161
Number of datasets: 15503
Number of blobs in CERA at 26-FEB-02: 357216204
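A rough feel for the granularity of the stored data follows from these counts. The sketch below divides the quoted figures. It assumes decimal terabytes and attributes the whole database size to BLOBs, which overstates the average because the catalogue also takes space. On these assumptions the average comes out at roughly 21 kB per BLOB.

```python
# Figures quoted on the slide.
database_size_tb = 7.4365
experiments = 161
datasets = 15503
blobs = 357_216_204

BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes

# Upper-bound estimate: treats the entire database as BLOB payload.
avg_blob_bytes = database_size_tb * BYTES_PER_TB / blobs
blobs_per_dataset = blobs / datasets
datasets_per_experiment = datasets / experiments

print(f"average BLOB size:       {avg_blob_bytes / 1000:.1f} kB")
print(f"BLOBs per dataset:       {blobs_per_dataset:.0f}")
print(f"datasets per experiment: {datasets_per_experiment:.1f}")
```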


CERA-2 Data Model

• Complete with respect to IEEE’s Reference Model for Metadata (Bretherton, 1994)
  – Browse, Search and Retrieval
  – Ingest, Quality Assurance, Reprocessing
  – Application to Application Transfer
  – Storage and Archive
• Includes international standards
  – Directory Interchange Format (NASA, 1998)
  – FGDC Metadata Content Standard (FGDC, 1996)
  – ISO Metadata Standard for Geographic Information (ISO 19115)
• Reference
  – “The CERA-2 Data Model” (DKRZ-Report No. 15, 1998)
  – URL: http://www.pik-potsdam.de/dept/dc/e/sdm/cera/


CERA-2 Data Model


Data Model Entries

• Requested information: W-W-W
  – What is stored?
  – Who is responsible?
  – Where are the data archived?
• Minimal information:
  – Name of entry (ENTRY.entry_name),
  – Keyword (GENERAL_KEY.general_key),
  – Time coverage (TEMPORAL_COVERAGE.start_year, TEMPORAL_COVERAGE.stop_year),
  – Space coverage (SPATIAL_COVERAGE.min_lat, SPATIAL_COVERAGE.max_lat, SPATIAL_COVERAGE.min_lon, SPATIAL_COVERAGE.max_lon).
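The minimal information can be pictured as one small record per entry. The sketch below reuses the field names from the slide; the class name and the validation rules (required strings, year ordering, latitude and longitude ranges) are assumptions added for illustration and are not taken from the CERA-2 data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalEntry:
    """Hypothetical container for the minimal metadata of one entry."""
    entry_name: str    # ENTRY.entry_name
    general_key: str   # GENERAL_KEY.general_key
    start_year: int    # TEMPORAL_COVERAGE.start_year
    stop_year: int     # TEMPORAL_COVERAGE.stop_year
    min_lat: float     # SPATIAL_COVERAGE.min_lat
    max_lat: float     # SPATIAL_COVERAGE.max_lat
    min_lon: float     # SPATIAL_COVERAGE.min_lon
    max_lon: float     # SPATIAL_COVERAGE.max_lon

    def __post_init__(self):
        # Assumed plausibility checks, not rules stated in the slides.
        if not self.entry_name or not self.general_key:
            raise ValueError("entry name and keyword are required")
        if self.start_year > self.stop_year:
            raise ValueError("start_year must not be later than stop_year")
        if not -90 <= self.min_lat <= self.max_lat <= 90:
            raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
        for lon in (self.min_lon, self.max_lon):
            # Longitudes are only range-checked, since a region may wrap around.
            if not -180 <= lon <= 360:
                raise ValueError("longitude out of range")


entry = MinimalEntry("demo_entry", "climate", 1990, 2000, -90.0, 90.0, 0.0, 360.0)
print(entry.entry_name, entry.start_year, entry.stop_year)
```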


Data Model Functions

• The CERA-2 data model …
  – allows for data search according to discipline, keyword, variable, project, author, geographical region and time interval.
  – allows for specification of data processing (aggregation and selection) without attaching the primary data.
  – is flexible with respect to local adaptations and storage of different types of geo-referenced data.
  – is open for cooperation and interchange with other database systems.
• But:
  – it is not the simplest data model for every single application.
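A search over several of these criteria can be sketched as a filter on catalogue records. The record type, the sample entries and the overlap tests below are hypothetical and cover only keyword, project, time interval and latitude range; they illustrate the kind of query the data model supports, not how CERA implements it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record with a subset of searchable fields."""
    entry_name: str
    keyword: str
    project: str
    start_year: int
    stop_year: int
    min_lat: float
    max_lat: float


def search(
    entries: List[CatalogueEntry],
    keyword: Optional[str] = None,
    project: Optional[str] = None,
    years: Optional[Tuple[int, int]] = None,
    lat_range: Optional[Tuple[float, float]] = None,
) -> List[CatalogueEntry]:
    """Return entries matching every criterion that was given."""
    results = []
    for e in entries:
        if keyword is not None and e.keyword != keyword:
            continue
        if project is not None and e.project != project:
            continue
        # Intervals match when they overlap, not only when one contains the other.
        if years is not None and (e.stop_year < years[0] or e.start_year > years[1]):
            continue
        if lat_range is not None and (e.max_lat < lat_range[0] or e.min_lat > lat_range[1]):
            continue
        results.append(e)
    return results


catalogue = [
    CatalogueEntry("run_a", "temperature", "project_x", 1860, 2100, -90, 90),
    CatalogueEntry("run_b", "precipitation", "project_x", 1990, 2000, -90, 90),
    CatalogueEntry("obs_c", "temperature", "project_y", 1979, 1993, 30, 60),
]

hits = search(catalogue, keyword="temperature", years=(1995, 2005))
print([e.entry_name for e in hits])
```

The search runs on metadata alone, which matches the point that selections can be specified without attaching the primary data.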


User Interface

Signed Java Applet:
• Catalogue Inspection
• Climate Data Retrieval

http://mad.dkrz.de/java/CeraStart.html


Search Paths in the CERA Interface (Version 1.2.2)

[Figure: search-path diagram. Node labels: Variable, Keyword, Project, Experiment, Dataset, Information, Contacts, Download. Annotations: Quantities; General, Project, Quality; detailed information; specific dataset; Quantities, temporal structure, spatial structure; Investigator, Distributor, ...]


Data Structure in CERA

[Figure: data-structure diagram. Labels: Level 1; Level 2; Experiment Description; Pointer to Unix files; Dataset 1 Description; Dataset n Description; BLOB Data Table (shown twice).]


CERA Data Content

• Climate Model Data
  – Local climate model production experiments for present and future but also for past climates
• IPCC DDC (Data Distribution Centre)
  – Archive and dissemination of selected data from international climate scenario calculations (IS92a and SRES)
• Project Support
  – Archive and dissemination of project data
    • HOAPS (Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data)
    • CARIBIC (Civil Aircraft for Regular Investigation of the Atmosphere Based on an Instrumentation Container), MPI Mainz


CERA Data Content

• Observational Data
  – Model related observations
    • ERA15 (ECMWF)
    • NCEP/NCAR 40 Year Reanalysis
    • ERA40 in preparation
  – Instrumental data
    • WOCE (World Ocean Circulation Experiment): field measurements and products are transferred from BSH (under development)
  – Earth observations
    • Access to SSTs from NOAA AVHRR in cooperation with DFD/DLR (distributed archive) (under development)


CERA Support Levels

• Level 1: Data Catalogue + Pointer
  – Action required: metadata entry, file pointer table and agreement about long-term storage and accessibility
  – Metadata templates are available for ECHAM, MOZART, REMO, HOPE
• Level 2: Level 1 + application-oriented data storage
  – Action required: Level 1 + primary data processing to fit the data model
• Information and Contact
  – URL: http://www.mpimet.mpg.de/Depts/MaD/
  – Mail: [email protected]
