Coriolis data center

Distributed system development, including standards and interoperability By Dick M.A. Schaap – technical coordinator Bologna, September 08

Transcript of Coriolis data center

Page 1: Coriolis data center

Distributed system development, including standards and interoperability

By Dick M.A. Schaap – technical coordinator

Bologna, September 08

Page 2: Coriolis data center

SeaDataNet Infrastructure objectives

SeaDataNet aims to set up and operate an efficient Pan-European distributed infrastructure for managing marine and ocean data by connecting:

40 National Oceanographic Data Centres (NODCs), national oceanographic focal points, and ocean satellite data centres in Europe. These Data Centres are mostly divisions of major national marine research institutes and are based in 35 countries surrounding the European seas.

SeaDataNet aims to ensure a comparable quality of data sets and to make data sets easily accessible on-line through a unique portal, while the data sets are stored and managed at the Data Centres.

Page 3: Coriolis data center

SeaDataNet Infrastructure overall concept

Page 4: Coriolis data center

SeaDataNet Infrastructure JRA activities

JRA1: to define and develop common standards and protocols for SeaDataNet that will be the basis for interoperability inside the SeaDataNet network of Data Centres and outside to other systems.

JRA2: to design and develop the architecture and software components for the SeaDataNet infrastructure and to coordinate its implementation at the Data Centres and central SeaDataNet portal.

Close interaction between JRA1 and JRA2. The outcome of JRA1 also sets requirements towards JRA3: adapting the stand-alone ODV data analysis and viewing software package for integrated use with SeaDataNet output.

Because of this close relation the activities in JRA1, JRA2 and JRA3 are coordinated by a Technical Task Team.

Page 5: Coriolis data center

SeaDataNet Technical Task Team

Oversees and coordinates the technical developments

Includes 11 core technical partners

8 meetings so far:
Paris, France, 18-19 May 2006
Hamburg, Germany, 4-5 September 2006
The Hague, The Netherlands, 7-8 December 2006
Trieste, Italy, 21-22 March 2007
The Hague, The Netherlands, 2-3 July 2007
Nice, France, 30-31 August 2007
Liverpool, UK, 18-19 December 2007
Paris, France, 13-14 May 2008

Page 6: Coriolis data center

SeaDataNet Technical Task Team

Methodology: Agenda with relevant topics, leading speakers, open discussion and brainstorming, possible subgroups for further explorations and analyses, resulting in technical specifications of system architecture and its modules, followed by implementation, evaluation and operation

All presentations and documents are included in the SeaDataNet Extranet in the TTT section. Note: All Project Partners are encouraged to visit this regularly.

All actions / decisions are minuted in a TTT Action List, which is updated regularly and also can be found in the Extranet TTT section

Page 7: Coriolis data center

SeaDataNet Infrastructure versions and planning

Version 0 = Continuation and maintenance of existing Sea-Search systems with minor modifications

Version 1 = Harmonised and upgraded metadatabases; Transparent data access involving all Data Centres. Prototype ready by March 2008; all TTT partners by end 2008; thereafter gradually migrating all partners – operational by mid 2009.

Version 2 = Adding OGC viewing services and further virtualisation of data access. Operational by 2010.

Page 8: Coriolis data center

SeaDataNet Version 0

Set-up of SeaDataNet portal (www.seadatanet.org) with CMS

Continuation of existing Sea-Search metadata systems:
EDMED - European Directory of Marine Environmental Data Sets
EDMERP - European Directory of Marine Research Projects
CSR - Cruise Summary Reports
EDIOS - European Directory of Ocean observing Systems
EDMO - European Directory of Marine Organisations
CDI - Common Data Index

Operation of the existing data access systems / procedures of all Data Centres, as referred to in the Common Data Index directory

Set-up of SeaDataNet extranet

Set-up of SeaDataNet mailing lists

Page 9: Coriolis data center

SeaDataNet Portal website

www.seadatanet.org

Page 10: Coriolis data center

SeaDataNet System Approach for V1 and V2

An approach has been adopted, which is in line with INSPIRE. The SeaDataNet infrastructure should consist of the following services:

Discovery services = Metadata directories
Security services = Authentication, Authorization & Accounting (AAA)
Delivery services = Data access & downloading of data sets
Viewing services = Visualisation of metadata, data and data products
Product services = Generic and standard products
Monitoring services = Statistics on usage and performance of the system
Maintenance services = Updating of metadata by Data Centres

The infrastructure is a network of interconnected Data Centres and a central Portal that gives users access to the various SeaDataNet services and to information on data management standards, tools and protocols.

Page 11: Coriolis data center

SeaDataNet User’s portal architecture V1

[Architecture diagram. Portal services: services for metadata, data downloading services, and visualization services (WMS) for V2. Labels shown: Metadata & Data catalogues (CSR, EDIOS, EDMED, EDMERP, EDMO, CDI) with the notes "at BODC", "at BSH" and "at MARIS", plus cross search; entry point for access hits; general request, metadata request, data request and status of request; shopping basket and requests status manager; AAA with user register, user registration and "My transact."; organisation + data source id; download managers in the data centres, connected to the Ifremer, BODC, BSH, ... databases; data download; project info, software, vocabularies and standards.]

Page 12: Coriolis data center

SeaDataNet Interoperability

Interoperability is the key to the success of a distributed data management system. This is achieved in SeaDataNet via:
Using common quality control protocols and flag scale
Using common and controlled vocabularies, including international content governance
Adopting the ISO 19115 metadata standard for all metadata directories
Providing XML Validation Services to quality control the metadata maintenance
Providing standard metadata entry tools
Using harmonised Data Transport Formats (NetCDF, ODV ASCII and MedAtlas ASCII) for data sets delivery
Adopting OGC standards for mapping and viewing services
Using SOAP Web Services in the SeaDataNet architecture

Page 13: Coriolis data center

SeaDataNet Quality Control Guideline

A guideline (V1) of recommended QC procedures, reviewing NODC schemes and other known schemes (e.g. WGMDM guidelines, World Ocean Database, GTSPP, Argo, WOCE, QARTOD, ESEAS, SIMORC, etc.).

QC methods for CTD (temperature and salinity), Current meter data (including ADCP), Wave data, Sea level data

A scheme of QC flags to be used in SeaDataNet. These flags are for assigning to individual data values. They are not for allocating to whole data series, or to accompanying information.
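The rule that QC flags belong to individual data values, not to whole series, can be sketched as follows. This is a minimal illustration only: the flag codes and their meanings are hypothetical placeholders rather than the actual SeaDataNet flag scale, and `FlaggedValue` and `good_values` are names made up for this example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical flag codes for illustration only; the real SeaDataNet
# flag scale is defined in the QC guideline, not here.
FLAG_MEANINGS = {
    "0": "no quality control",
    "1": "good value",
    "4": "bad value",
}


@dataclass(frozen=True)
class FlaggedValue:
    """One measured value carrying its own QC flag."""
    value: float
    flag: str

    def __post_init__(self):
        # Reject flags that are not part of the (assumed) scale.
        if self.flag not in FLAG_MEANINGS:
            raise ValueError(f"unknown QC flag: {self.flag!r}")


def good_values(series: List[FlaggedValue]) -> List[float]:
    """Keep only values flagged as good; the series itself carries no flag."""
    return [item.value for item in series if item.flag == "1"]


# A short temperature profile: each value is flagged on its own.
profile = [
    FlaggedValue(12.4, "1"),
    FlaggedValue(99.9, "4"),  # spike, flagged bad
    FlaggedValue(12.1, "1"),
]
print(good_values(profile))
```

Keeping the flag next to each value means a single spike can be rejected without discarding the rest of the series.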

Compiled in discussion with IOC, ICES and JCOMM to ensure international acceptance and tuning. Important feedback came from the joint IODE/JCOMM Forum on Oceanographic Data Management and Exchange Standards (January 2008), which SeaDataNet and international experts attended to consider ongoing work on standards and to seek harmonisation where possible.

Now extending the guideline with QC methods for surface underway data, nutrients, geophysical data, and biological data. V2: April 2009.

Page 14: Coriolis data center

SeaDataNet Common Vocabularies

Use of common vocabularies in all metadatabases and data formats is an important prerequisite towards consistency and interoperability.

Set-up and population of Common Vocabularies. The SeaDataNet Vocabulary service is based upon the NERC DataGrid (NDG) vocabulary Web service. For end-users there is a vocabulary Client Interface for searching and browsing and to export selected entries in csv format.
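As a rough sketch of how a CSV export from the vocabulary client might be used, the following checks a list of codes against the exported entries. The column names (`code`, `label`), the sample entries and the function names are assumptions made for this illustration; the actual export layout is not described here.

```python
import csv
import io

# Stand-in for a CSV export from the vocabulary client; the column names
# ("code", "label") and the entries are assumptions for this sketch.
EXPORTED_CSV = """code,label
TEMP,Temperature of the water column
PSAL,Salinity of the water column
"""


def load_vocabulary(csv_text):
    """Return a dict mapping each vocabulary code to its label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["code"]: row["label"] for row in reader}


def unknown_codes(codes, vocabulary):
    """List the codes that do not appear in the controlled vocabulary."""
    return [code for code in codes if code not in vocabulary]


vocabulary = load_vocabulary(EXPORTED_CSV)
print(unknown_codes(["TEMP", "SALINITY"], vocabulary))
```

A check of this kind catches free-text terms before they reach a metadata record.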

The Web service is compliant with the WS-I Basic Profile 1.1, which is adopted as the standard for all Web services in SeaDataNet.

Content governance of the vocabularies is very important and is done by a combined SeaDataNet and MarineXML Vocabulary Content Governance Group (SeaVoX), moderated by BODC, and including experts from SeaDataNet, MMI, MOTIIVE, JCOMMOPS and other international groups. SeaVoX operates via a mailing list server.

Page 15: Coriolis data center

SeaDataNet Common Vocabularies

Vocabularies User Interface http://seadatanet.maris2.nl/v_bodc_vocab/welcome.aspx

Page 16: Coriolis data center

SeaDataNet Common Data Transport Formats

V1: data sets are accessible through downloading services. Delivery of data sets to users requires common data transport formats, which interact with other SeaDataNet standards (Vocabularies, Quality Flag Scale) and analysis & presentation tools (ODV, DIVA)

The following formats have been defined:
SeaDataNet ODV ASCII for profiles, time series and trajectories
SeaDataNet MedAtlas as optional extra format
NetCDF with CF compliance for gridded data sets
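The pairing of data types with transport formats listed above can be written down as a small lookup. This is only a sketch: the feature-type keys and the function name are naming assumptions, and offering MedAtlas for the same feature types as ODV ASCII is an assumption too, since the slide only calls it an optional extra format.

```python
# Mapping taken from the format list above; the feature-type keys and the
# function name are naming assumptions made for this sketch.
DEFAULT_FORMAT = {
    "profile": "SeaDataNet ODV ASCII",
    "time_series": "SeaDataNet ODV ASCII",
    "trajectory": "SeaDataNet ODV ASCII",
    "gridded": "NetCDF (CF compliant)",
}


def transport_formats(feature_type, include_optional=False):
    """Return the transport formats to offer for a given feature type."""
    if feature_type not in DEFAULT_FORMAT:
        raise ValueError(f"unsupported feature type: {feature_type!r}")
    formats = [DEFAULT_FORMAT[feature_type]]
    # Assumption: the optional MedAtlas format is offered wherever
    # ODV ASCII is, i.e. for everything except gridded data sets.
    if include_optional and feature_type != "gridded":
        formats.append("SeaDataNet MedAtlas")
    return formats


print(transport_formats("profile", include_optional=True))
print(transport_formats("gridded"))
```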

ODV and MedAtlas have been outfitted with a SeaDataNet semantic header

International cooperation is underway from SeaDataNet with the CF community for a common NetCDF format for the oceanographic and meteorological domains, including a semantic header

Page 17: Coriolis data center

Authentication, Authorization and Administration

Single Sign On system required for access to distributed system

User’s authentication information based on personal login / password

User must register in order to get one login:
Web form to provide necessary information
Online user agreement on “SeaDataNet General Licence”
After processing, login/password sent by email (email check)

Choice in V1 for CAS system (= Centralized)

SeaDataNet Data Policy

Authorisation based on “Roles”
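Authorisation based on roles can be illustrated with a small lookup: access is granted when a user holds at least one role that a data set accepts. The role names, logins and data set identifiers below are hypothetical; the actual roles follow from the SeaDataNet Data Policy and are not listed on this slide.

```python
# Role names and access rules are hypothetical; the actual roles are
# defined by the SeaDataNet Data Policy, not by this sketch.
DATASET_ALLOWED_ROLES = {
    "cdi-001": {"public", "academic", "partner"},
    "cdi-002": {"partner"},
}

# Roles assigned to each registered login (made-up users).
USER_ROLES = {
    "alice": {"academic"},
    "bob": {"partner"},
}


def is_authorised(login, dataset_id):
    """A user may download a data set if one of their roles is allowed for it."""
    user_roles = USER_ROLES.get(login, set())
    allowed = DATASET_ALLOWED_ROLES.get(dataset_id, set())
    return bool(user_roles & allowed)


print(is_authorised("alice", "cdi-001"))
print(is_authorised("alice", "cdi-002"))
```

Unknown logins and unknown data sets fall back to an empty role set, so the sketch denies access by default.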

Page 18: Coriolis data center

User registration and registration validation process

[Diagram: user registration and validation flow.
1. Registration request: the user fills in the registration web form on the SeaDataNet web portal (SDN licence agreement + user information).
2. Validation: the NODC of the user's country, or the SDN User Desk by default, validates the registration through a validation web form and assigns a SeaDataNet role; the SeaDataNet user directory is updated.
3. Transmission by email: the user receives a personal identifier (login) + password.]
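The registration and validation steps on this slide can be sketched as a small workflow. The class, the function names, the status values and the country lookup are all hypothetical and only mirror the order of the steps; they do not describe the actual AAA implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registration:
    login: str
    country: str
    licence_accepted: bool
    status: str = "requested"
    role: Optional[str] = None
    validator: Optional[str] = None


# Hypothetical lookup: countries that have an NODC able to validate users.
NODC_BY_COUNTRY = {"FR": "NODC France", "UK": "NODC United Kingdom"}


def submit(login, country, licence_accepted):
    """Step 1: the registration web form requires the licence agreement."""
    if not licence_accepted:
        raise ValueError("the SeaDataNet licence must be accepted")
    return Registration(login, country, licence_accepted)


def validate(registration, role):
    """Step 2: the user's NODC, or the SDN User Desk by default, assigns a role."""
    registration.validator = NODC_BY_COUNTRY.get(
        registration.country, "SDN User Desk"
    )
    registration.role = role
    registration.status = "validated"
    return registration


def notify(registration):
    """Step 3: credentials go out by email only after validation."""
    if registration.status != "validated":
        raise RuntimeError("registration has not been validated yet")
    return f"Email to {registration.login}: your login and password"


request = submit("jdoe", "XX", licence_accepted=True)
validate(request, role="academic")
print(request.validator, "-", notify(request))
```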

Page 19: Coriolis data center
Page 20: Coriolis data center

SeaDataNet AAA service

Interface for central log-in

Page 21: Coriolis data center

SeaDataNet infrastructure V1: Discovery services

EDMED - Data Sets
EDMERP - Research Projects
CSR - Cruise Summary Reports
EDIOS - Monitoring systems
EDMO - Marine Organisations
CDI - Common Data Index

PLUS

Cross search on top of the directories

Page 22: Coriolis data center

SeaDataNet Discovery services

Activities undertaken for:

Reviewing and streamlining the logical formats of each of the Directories

Expanding the number of Common vocabularies, further population and upgrading of Vocabularies Web services

Defining XML schemas and formats, using the ISO 19115 metadata standard as basis

Defining and developing maintenance modalities for each of the Directories

Page 23: Coriolis data center

SeaDataNet Discovery services

Defining and developing new User Interfaces for each of the Directories

Defining and implementing XML Validation Web services, that will be used to validate XML output from data centres, before import into the public Directories

Developing Web services for the Directories

Page 24: Coriolis data center

SeaDataNet: Formats review

Review of the format and use of common vocabularies for each of the Directories to achieve harmonisation and integration, paving the path to data access via the CDI.

[Diagram labels: Vocabs, EDMO, CDI, EDMERP, CSR, EDMED, EDIOS, ..., Data Access]

Page 25: Coriolis data center

SeaDataNet XML Schemas

The ISO 19115 content model is the basis for the XML formats and exchange schemas (XSD).

A guidance document has been prepared on how to use XML for SeaDataNet, including how to declare references to Common Vocabularies, EDMO and EDMERP

For each Directory the following has been prepared:
Description of the format and XML tags
XML Schema
XML example file

Extended XML Schemas have been prepared, using Schematron and OCL to support the checking of mandatory fields, use of codes from the Common Vocabularies and use of organization codes from EDMO. These schemas are used in the XML Validation Web services
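The three kinds of checks named above (mandatory fields, vocabulary codes, EDMO organisation codes) can be illustrated in plain Python. This is a simplified stand-in rather than Schematron or OCL, and the element names, code lists and function name are hypothetical; the real rules live in the SeaDataNet schemas.

```python
import xml.etree.ElementTree as ET

# Element names, mandatory fields and code lists are hypothetical; the real
# rules live in the SeaDataNet XML schemas and Schematron files.
MANDATORY_FIELDS = ["title", "organisation", "parameter"]
KNOWN_PARAMETER_CODES = {"TEMP", "PSAL"}
KNOWN_EDMO_CODES = {"100", "200"}  # made-up organisation codes


def validate_record(xml_text):
    """Return a list of problems found in one metadata record."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # Check 1: every mandatory element is present and non-empty.
    for name in MANDATORY_FIELDS:
        element = root.find(name)
        if element is None or not (element.text or "").strip():
            problems.append(f"missing mandatory field: {name}")

    # Check 2: the parameter code comes from the controlled vocabulary.
    parameter = root.findtext("parameter", default="").strip()
    if parameter and parameter not in KNOWN_PARAMETER_CODES:
        problems.append(f"parameter code not in vocabulary: {parameter}")

    # Check 3: the organisation code is a known EDMO code.
    organisation = root.findtext("organisation", default="").strip()
    if organisation and organisation not in KNOWN_EDMO_CODES:
        problems.append(f"organisation code not in EDMO: {organisation}")
    return problems


record = (
    "<record><title>CTD cast</title>"
    "<organisation>999</organisation><parameter>TEMP</parameter></record>"
)
print(validate_record(record))
```

Returning a list of problems, instead of failing on the first one, lets a data centre fix a whole record in one pass before import.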

Page 26: Coriolis data center

SeaDataNet Maintenance modalities

Depending on the Directory, the following maintenance modalities are provided:

Online maintenance via online Content Management System
XML export from local system

Local XML export can be produced by partners via:
Own software
The MIKADO Java tool, which has been developed for entering and editing partners' shares of the V1 Directories. MIKADO interacts with the Web services of the vocabularies, EDMO and EDMERP and produces valid XML files that can be imported into the central V1 Directories. MIKADO also includes functionality for coupling to partners' local database(s) for generating CDI XML files in bulk
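The idea of generating XML records in bulk from a partner's local database can be sketched as below. This is not MIKADO, which is a Java tool; it is a Python illustration using an in-memory SQLite database, and the table, column and element names are hypothetical and do not follow the actual CDI XML format.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_cdi_records(connection):
    """Turn each database row into one small XML document (as a string)."""
    rows = connection.execute(
        "SELECT local_id, title, parameter FROM observations ORDER BY local_id"
    )
    documents = []
    for local_id, title, parameter in rows:
        record = ET.Element("record", attrib={"id": str(local_id)})
        ET.SubElement(record, "title").text = title
        ET.SubElement(record, "parameter").text = parameter
        documents.append(ET.tostring(record, encoding="unicode"))
    return documents


# In-memory stand-in for a partner's local database (hypothetical schema).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE observations (local_id INTEGER, title TEXT, parameter TEXT)"
)
connection.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [(1, "CTD cast A", "TEMP"), (2, "CTD cast B", "PSAL")],
)

for document in build_cdi_records(connection):
    print(document)
```

One query drives the whole export, so adding rows to the local database yields new XML records without manual entry.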

Page 27: Coriolis data center

SeaDataNet MIKADO Java tool

Available under multiple environments:
Microsoft: Windows 2000, XP, Vista
Apple
Unix - Solaris
Linux

[Diagram: MIKADO (Java code) reads from a database or an Excel file through JDBC (Java DataBase Connectivity), in manual or automatic mode, and produces XML files for the SeaDataNet catalogues: CSR, EDMED, EDMERP, CDI, [EDIOS].
Native drivers: MySQL, Oracle, Postgres, SQL Server.
Bridge drivers using Microsoft ODBC: Access, Excel, SQL Server.
Other drivers: downloaded from ad hoc websites and copied into the dist/lib MIKADO directory.]

Page 28: Coriolis data center

SeaDataNet Maintenance modalities

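Columns: Directory; Online CMS by partners; XML exchange via use of MIKADO at partners; XML exchange via local system at partners; Online CMS by moderator; XML Validation service

EDMED (BODC = authority)
Online CMS by partners: SDN DCs' initial content conversion from free text to vocabs, via ‘sandbox’
XML exchange via use of MIKADO at partners: stand-alone incl. local storage + QC loop for content synchronisation
XML exchange via local system at partners: generated from in-house system
XML Validation service: validation of XML syntax and use of Vocabs, EDMO and EDMERP

EDMERP (MARIS = authority)
Online CMS by partners: SDN DCs manage their national records + ‘sandbox’ for institutes
XML exchange via use of MIKADO at partners: stand-alone incl. local storage
XML exchange via local system at partners: generated from in-house system
XML Validation service: validation of XML syntax and use of Vocabs, EDMO and EDMERP

CSR (BSH = authority)
Online CMS by partners: online entries by chief scientists
XML exchange via use of MIKADO at partners: stand-alone incl. local storage
XML exchange via local system at partners: generated from in-house system
XML Validation service: validation of XML syntax and use of Vocabs, EDMO and EDMERP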

Page 29: Coriolis data center

SeaDataNet Maintenance modalities

Columns: Directory; Online CMS by partners; XML exchange via use of MIKADO at partners; XML exchange via local system at partners; Online CMS by moderator; XML Validation service

CDI (MARIS = authority)
XML exchange via use of MIKADO at partners: tool embedded in local system
XML exchange via local system at partners: generated from in-house system
XML Validation service: validation of XML syntax and use of Vocabs, EDMO and EDMERP

EDIOS (BODC = authority) - XML SCHEMA NOT YET READY
Online CMS by partners: SDN DCs' initial revision + new entries via ‘sandbox’
XML exchange via use of MIKADO at partners: stand-alone incl. local storage - NOT YET READY
XML exchange via local system at partners: generated from in-house system - NOT YET READY
XML Validation service: validation of XML syntax and use of Vocabs, EDMO and EDMERP

EDMO (MARIS = authority)
Online CMS by partners: SDN DCs manage their national records

Vocabularies (BODC = authority)
Online CMS by moderator: BODC with SeaVoX governance mailing list

User Register + AAA service (IFREMER = authority)
Online CMS by partners: SDN DCs manage their national records

Page 30: Coriolis data center

SeaDataNet Content Management Systems

[Screenshots of the Content Management Systems for EDMERP, CSR, EDMED and EDMO]

Page 31: Coriolis data center

SeaDataNet new User Interfaces

[Screenshots of the new User Interfaces for EDMERP, CSR, EDIOS and EDMO]

Page 32: Coriolis data center

SeaDataNet Discovery services - status

Reviewing and streamlining the logical formats of each of the Directories – READY

Expanding the number of Common vocabularies, further population and upgrading of Vocabularies Web services – READY

Defining XML schemas and formats, using the ISO 19115 metadata standard as basis – READY, EXCEPT FOR EDIOS

Defining and developing maintenance modalities for each of the Directories:

EDMO, EDMERP, EDMED and CSR – READY VIA ONLINE CMS
EDMED, EDMERP, CSR and CDI – READY VIA MIKADO UPGRADE
Common Vocabularies and AAA – READY VIA MASTER CMS
IMPORT OF XML FOR CDI – READY
IMPORT OF XML FOR EDMERP, CSR and EDMED – ALMOST READY
EDIOS – VIA MIKADO AND XML IMPORT NOT YET READY, WAITS FOR XML SCHEMA

Defining and implementing XML Validation Web services, that will be used to validate XML output from data centres, before import into the public Directories – READY FOR CDI, EDMED, EDMERP and CSR

Page 33: Coriolis data center

SeaDataNet Discovery services - status

Defining and implementing new User Interfaces:
READY FOR EDMO, CDI, EDMERP, CSR, EDIOS, Common Vocabularies and AAA services
UNDER DEVELOPMENT FOR EDMED
CROSS SEARCH DEVELOPMENT STARTS WHEN ALL ARE READY

Content upgrading from V0 to V1 – Activity by all NODCs:
EDMO – READY
EDMERP and CSR – WELL UNDERWAY, DEADLINE NOV 08
EDMED – STARTS VERY SOON, DEADLINE DEC 08
EDIOS – so far done for United Kingdom and underway for Black Sea countries; others waiting for MIKADO extension

Implementing Web services: EDMO, EDMERP, Common Vocabularies, Validator and AAA services – READY

Note: CDI upgrade is undertaken as part of the Data Access activities.

Page 34: Coriolis data center

SeaDataNet: URLs

URL for CSR V1 Retrieval:
http://seadata.bsh.de/csr/retrieve/V1_index.html

URL for CSR V1 CMS:
http://seadata.bsh.de/csr/online/V1_index.html

---------------------------

URL for EDIOS V1 Retrieval:
http://seadatanet.maris2.nl/v_edios/search.asp

---------------------------

URL for EDMED V1 CMS:
http://seadatanet.maris2.nl/vu_edmed/welcome.asp

Page 35: Coriolis data center

SeaDataNet: URLs

EDMERP Retrieval

http://seadatanet.maris2.nl/v_edmerp/search.asp

EDMERP CMS

http://seadatanet.maris2.nl/vu_edmerp/welcome.asp

------------------------

EDMO V1 retrieval

http://seadatanet.maris2.nl/edmo/

EDMO V1 CMS

http://seadatanet.maris2.nl/vu_organisations/welcome.asp

Page 36: Coriolis data center

SeaDataNet Delivery Services

See separate presentation