Projects

Open Research Data Projects

Projects funded in the framework of the ORD Program

The joint ORD program of ETH Zurich, EPFL and the four research institutes of the ETH Domain has financially supported more than 60 research projects in the period 2020–2023. Funding supports researchers engaging in, or developing, ORD practices with and for their community and assists these researchers in becoming Open Research Data leaders in their field.

This page provides an overview of these projects. It highlights how researchers in the ETH Domain are currently applying ORD in exemplary ways. Some of the projects have already been completed, others are still in progress. The projects have been divided into three categories.

“Establish” projects help link existing ORD practices to a research agenda to establish them on a broader basis. They contribute to a shared and comprehensive understanding of ORD practices that can then become de facto standards.

“Explore” projects are the most extensive ventures in the program and are designed to explore and test early-stage ORD practices. The goal is to map processes of what an ORD practice might look like and develop prototypes. Through these projects, new teams form across disciplines and institutions.

“Contribute” projects help scientists integrate their research data into existing, often international, infrastructures. By standardizing the processes and making them generally accessible, the data are validated, and their potential is considerably expanded.

MMS (Masonry MicroStructures database) - A 3D masonry microstructures database for advancing numerical research on irregular stone masonry structures

Category

Contribute

Institutions

EPFL

Data type

Microstructure database

Field

Materials Science

Researchers

Shah, Mati Ullah

Abstract

Stone masonry is an eco-friendly construction material, but its use has declined due to its vulnerability to earthquakes, mainly because of the poor arrangement of its microstructure. The microstructure includes the shape, size, and arrangement of stone units, which vary based on geographic, temporal, and material factors. Current building codes cannot fully account for this variability, and experimental studies are costly and impractical due to the diversity of masonry typologies. Numerical studies offer a solution, but creating realistic microstructures for modeling irregular stone masonry is complex and time-consuming. As a result, simplified microstructures are often used in simulations, which fail to capture the complexities of irregular masonry walls. To address this challenge, we have developed a 3D masonry microstructures database ready to use in numerical simulations. To enhance accessibility and usability, this project aims to create a web-based platform hosting this curated database of 3D microstructures and their geometric indices. The proposed web-based platform will also feature a tool for evaluating masonry quality using the Masonry Quality Index (MQI) from 2D images, promoting the preservation of historic structures and sustainable construction practices. Additionally, the platform will enable researchers to contribute and document new 3D microstructures, fostering collaboration and advancing numerical research on stone masonry.

Application Programming Interface for the River to Ocean Geodatabase for Education and Research

Category

Contribute

Institutions

ETH Zurich

Data type

Environment

Field

Earth sciences

Researchers

Paradis, Sarah

Abstract

To advance our understanding of the carbon cycle, it is essential to evaluate the spatiotemporal variations of carbon between river and marine environments and to gain insights into the pathways of carbon transfer from land to ocean. To do this, we need to work jointly with riverine and marine data, accounting for their temporal and spatial distribution. However, each of these systems has its own data and metadata reporting strategies, which complicates their joint application. Efforts have been made to compile data from each system into independent databases, but no attempt has yet been made to create a joint database covering both systems while accounting for their different metadata. Hence, this project aims to bring riverine and marine data together in one database, the River to Ocean Geodatabase for Education and Research (ROGER), so that data from both systems can be queried easily. The database will be presented in an interactive web interface that queries riverine and/or marine data, depending on the user’s requirements, through a REST API. Harnessing the advanced geographical functions of PostgreSQL, the REST API will include functions that allow users to geospatially integrate riverine and marine data. This new database will provide a crucial step forward in the understanding of the carbon cycle along the land-ocean continuum, while ensuring that the data complies with best Open Research Data practices.

Development of standardized Respiratory Open Access Research

Category

Contribute

Institutions

EPFL

Data type

Medical data

Field

Life sciences

Researchers

Dan, Jonathan

Abstract

Chronic cough is a common condition globally. While efforts are being made to develop wearables that detect and quantify cough events automatically, such monitoring devices have not yet been incorporated into routine clinical practice because their validation lacks consistency, resulting in slow progress and a lack of trust in reported results. We have identified three main reasons for this heterogeneity: 1) the clinical definition of different cough events, and especially the delimitation of their beginning and end, lacks standardization; 2) the data used is typically private and imbalanced, with inadequate labelling as a result of the previous point; and 3) methodologies for assessing the accuracy of event detection differ between research groups and are often inappropriate. This proposal builds on ORD datasets, community guidelines, and standards to propose a unified framework for validating cough event detection algorithms. The main objective is to develop standards that unify the workflow for validating respiratory event detection algorithms and ensure that data adheres to the FAIR principles (Findable, Accessible, Interoperable, and Reusable). The framework will be distributed through a website serving as a central hub and reference for standardizing clinical definitions and methodologies, leading to a future benchmarking platform for respiratory event detection algorithms.

Airborne Laser Scanner Data Repository and Processing Portal for the ARES Observatory

Category

Explore

Institutions

EPFL

Data type

Laser scanner data

Field

Earth sciences

Researchers

Skaloud, Jan

Abstract

This project aims to establish an open database and processing pipeline for ALS (Airborne Laser Scanner) data, complementing the imaging spectrometry platform, to promote transparency, accessibility, reproducibility and innovation in sensor co-registration and data analysis, ultimately benefiting scientific communities worldwide.

Integrated Instrument and Inventory Logbook for Experimental Science Labs

Category

Explore

Institutions

Empa

Data type

Electronic lab notebooks (ELNs)/lab information management systems (LIMSs)

Field

Experimental Science

Researchers

Schuler, Bruno, S. Bafelli (EMPA)

Abstract

Our aim is to develop practices and tools to trace and share the state of shared inventory items in a multi-user laboratory, linking this information to the personal Electronic Lab Notebook (ELN) of each user. With this practice, we seek to fill a big blank spot in the tracking of mutable laboratory information management system (LIMS) objects. A tool akin to a digital inventory logbook will be developed as an openBIS extension to support a wide range of experimental laboratories with variable types of measurement equipment and inventory items. This inventory logbook will complement the native openBIS ELN-LIMS, with focus on measurement equipment and inventory management. The implementation will prioritize generality and user-friendliness to minimize adaptation barriers and promote the dissemination of this ORD practice.

Supporting open science with a plugin-based research platform for energy models

Category

Explore

Institutions

ETH Zurich

Data type

Energy System Model

Field

Energy engineering

Researchers

Schaffner, Christian

Abstract

The Nexus-e platform is a powerful tool for assessing the impacts of potential pathways for the Swiss energy system. This project aims to open-source both the model input data and code, adhering to FAIR Open Research Data and Open Science principles. The objective is to develop a modular framework allowing quick integration of new models and enabling easy execution of existing ones. Currently featuring five models, Nexus-e facilitates interdisciplinary research and policy analysis in energy systems. The project seeks to streamline the process of adding new models through a plugin architecture and implementing an API for standardized interaction with scenarios. By fostering collaboration and providing access to diverse input data, the project aims to enhance the usability and impact of Nexus-e within and beyond the ETH community, thereby advancing energy system research and supporting policy development.

Exploring and strengthening reproducible research practices in urban drainage

Category

Explore

Institutions

Eawag

Data type

Workflow Management System

Field

Urban studies

Researchers

Rieckermann, Jörg, HP. Leitão (EAWAG)

Abstract

To enhance the reproducibility of research practices within the urban drainage community, and in particular the interpretability and reusability of both data and code, it is imperative to better document the origins of open datasets and the outcomes of the workflows and models that use them. Our goal is twofold: develop prototype Open Research Data (ORD) tools with the Swiss Data Science Center and assess their effectiveness with the international urban drainage community. We will explore whether RENKU can serve as a comprehensive platform for this, given its features such as collaborative workflow management, version control, and integration with data science tools, which promote reproducibility and efficient collaboration among researchers. Planned use cases include i) individual researchers sharing results, ii) benchmarking rainfall-runoff models in our department, and iii) distributed groups providing pre-processed datasets with full provenance information. We will start by enhancing the FAIRness of a 20-year-old dataset on sewer mixing. Additionally, we will evaluate different EPA-SWMM model implementations and engage the international urban drainage community in ORD practices. This initiative could establish a cornerstone for data sharing in urban drainage, extending beyond Eawag's research.

OpenPulse: Assessing Open Science community metrics for Open Source Software

Category

Explore

Institutions

EPFL

Data type

Metadata

Field

Computer science

Researchers

Riba Grognuz, Oksana, O. Verscheure (EPFL)

Abstract

“OpenPulse: Assessing Open Science community metrics for Open Source Software” aims to redefine the measurement of Open Science by focusing on Open Source Software (OSS) from EPFL. It addresses the limitations of current Open Science metrics, which primarily track Open Access publications, by proposing a new framework to evaluate the development of OSS and its community impact. This involves developing a tool, OpenPulse, to monitor OSS activities, establish reliable OSS datasets, and create visualizations for real-time impact assessment. The project emphasizes collaboration, community engagement, and the development of discipline-specific dashboards, aiming to foster a more inclusive and comprehensive understanding of Open Science's impact beyond traditional publications.

Open API and Interoperability in Medical Wearable Data for Healthcare Research

Category

Explore

Institutions

ETH Zurich

Data type

Medical device data

Field

Life sciences

Researchers

Paez, Diego, O. Stoller (ETHZ), C. Awai (ETHZ)

Abstract

The MED-WEAR project addresses the lack of data interoperability among wearable devices in clinical practice and research: each manufacturer, service provider and researcher implements its own solution for capturing, storing, and formatting data in each study. In response to this challenge, we propose the development of a Wearable API (MED-WEAR), which provides a standardised framework between medical wearables and robotic devices for collecting data across multiple research and clinical facilities, while reducing workload and costs by applying FAIR principles to clinical research with these devices. We aim to establish standardised data collection in the ETH Domain and beyond, thereby empowering the healthcare research community to streamline data collection with wearables, foster innovation, and define open standards. MED-WEAR impacts clinical and data science research by enabling lifelogging for individuals and promoting transparency in patient monitoring across rehabilitation laboratories. By engaging with the Swiss Neuro Rehab Initiative and collaborating across ETH RESC, RELAB, SMS lab, SCAI lab and DART lab at LLUI, the project establishes an interoperable platform with the potential to provide a new ORD service within the ETH Domain.

Building FAIR data chains for atmospheric observations in the ACTRIS-Switzerland network

Category

Explore

Institutions

PSI

Data type

Atmospheric data

Field

Earth sciences

Researchers

Modini, Robin, B. Brem (PSI), M. Gysel-Beer (PSI), D. Bell (PSI), T. Bartels-Rausch (PSI)

Abstract

The OPEN-ACTRIS project aims to explore and build up FAIR data chain standards and strategies for atmospheric observations collected in Switzerland as part of the Aerosol, Clouds and Trace Gases Research Infrastructure (ACTRIS). ACTRIS is a pan-European network that aims to deepen our understanding of climate change and air pollution by producing high-quality data on short-lived atmospheric constituents. ACTRIS-Switzerland is the multi-institutional Swiss node of the network. ACTRIS has a comprehensive vision for FAIR data that covers all stages of the research data life cycle through the definition of data levels covering raw measurements, processed data, and elaborated data products. The OPEN-ACTRIS project aims to implement these ORD concepts in the ETH Domain and to explore best ORD practice within ACTRIS-Switzerland by combining existing tools and infrastructure from the ETH Domain and the ACTRIS community. We will achieve this by building FAIR data chains for the aerosol observations continuously recorded at field measurement stations on the Jungfraujoch and in Payerne, and for the aerosol and gas measurements performed on a campaign basis in the PSI Atmospheric Chemistry Simulation Chambers.

High-Throughput Chemistry Based Open Research Database

Category

Explore

Institutions

EPFL

Data type

Chemical database

Field

Chemistry

Researchers

Mieville, Pascal, O. Riba Grognuz (EPFL)

Abstract

Experimental data from chemical synthesis are complex, rarely openly available in a computational format, and mostly biased toward positive results, which represent a minority of cases. This situation strongly impacts the development of efficient predictive models in chemistry, drug discovery, energy storage or generation, and new materials development. To improve data quality and availability for the chemistry community, the Swiss Cat+ West Hub and SDSC, with support from the SWITCH Foundation, propose to jointly develop HT-CHEMBORD. The project combines three elements: a robust, open, global ontology for chemical synthesis, based on high-quality, FAIR-compliant experimental data generated initially in the Swiss Cat+ hubs and later, through future collaborations, by other validated high-throughput laboratories; an open-access database with complete data integrity management; and a set of query tools that allow chemists and data scientists to explore the unique dataset. Exploratory work on the data validation strategy, with a view to extending it to external data providers, is already planned in the current project.

Starting a User Community for Cutting-Edge Sequence Search

Category

Explore

Institutions

ETH Zurich

Data type

Genomic data

Field

Life sciences

Researchers

Kahles, André

Abstract

The exponential growth of biomedical sequencing data has led to considerable challenges and open problems for genomic data management, leading to limitations in accessing and utilising this vast resource efficiently. The Sequence Read Archive (SRA) exemplifies the scale of available data, housing over 40 Petabases. However, the current indexing methods, which rely on metadata rather than full-text searches, significantly limit the potential for research and discovery. The Biomedical Informatics lab at ETH Zürich has developed a computational framework capable of indexing whole sequence repositories on a petabyte scale, compressing data significantly while maintaining search efficiency. This framework, embodied in the MetaGraph software platform, represents a major technological advancement, enabling precise, large-scale genomic data analysis. The lab has applied this framework to over 4 PB of raw sequencing data, freely sharing the generated indexes to promote open research. The proposal aims to establish MetaGraph as a leading open research data tool and to build a vibrant user community around it, enhancing accessibility and utility of genomic data. This initiative seeks to break down barriers to data access, fostering a more open, collaborative research environment, and expanding the scope of MetaGraph beyond DNA to include non-DNA repositories, addressing privacy and ethical considerations in data accessibility, and contributing to the democratisation of genomic data.

Advancing open geodata practices in research communities

Category

Explore

Institutions

ETH Zurich

Data type

Geospatial data

Field

Earth sciences

Researchers

Hurni, Lorenz

Abstract

The project aims to improve openness and interaction between research communities working with geospatial data. A significant gap currently exists: there is no application that enables research communities and Open Science stakeholders to publish, visualise, combine and extract research geospatial data in the formats desired by users, and to use them directly and openly in teaching and research. The project will focus on addressing key questions and working with research communities to better understand the needs and requirements of researchers for working with geospatial data in an open research data context. Key questions include the desired practices, data formats and standards for searching, combining, sharing and publishing open research geodata, and assessing the capabilities of existing geoportals such as GeoVITe to implement the developed ORD practices. Collaboration with the community, in particular with representatives of the geosciences, is essential to discuss and develop user-centred ORD practices. Participatory approaches aim to focus on user needs to make research geodata findable, accessible, interoperable and reusable in line with the FAIR principles. Based on the identified needs and processes, initial testing and technical implementation will be carried out on the portal. The long-term goal is to establish sustainable tools for the open research geodata community, based on existing open standards and an improved web-based geoportal.
