Open Research Data Projects
Projects funded in the framework of the ORD Program
The joint ORD program of ETH Zurich, EPFL and the four research institutes of the ETH Domain has financially supported more than 60 research projects in the period 2020–2023. Funding supports researchers engaging in, or developing, ORD practices with and for their community and assists these researchers in becoming Open Research Data leaders in their field.
This page provides an overview of these projects. It highlights how researchers in the ETH Domain are currently applying ORD in exemplary ways. Some of the projects have already been completed, others are still in progress. The projects have been divided into three categories.
“Establish” projects help link existing ORD practices to a research agenda to establish them on a broader basis. They contribute to a shared and comprehensive understanding of ORD practices that can then become de facto standards.
“Explore” projects are the most extensive ventures in the program and are designed to explore and test early-stage ORD practices. The goal is to map processes of what an ORD practice might look like and develop prototypes. Through these projects, new teams form across disciplines and institutions.
“Contribute” projects help scientists integrate their research data into existing, often international, infrastructures. By standardizing the processes and making them generally accessible, the data are validated, and their potential is considerably expanded.
Category
Institutions
Data type
Field
Researchers
Abstract
We established the openwashdata community for the Water, Sanitation, and Hygiene (WASH) sector. We built infrastructure and communication channels, taught 100 WASH professionals the basics of data science, developed a workflow to publish WASH data following FAIR data principles, and mobilized those in the sector who were interested in joining our vision and mission. Our next step is establishing a data stewardship network, actively working with strategic partners in Malawi and South Africa by placing a fully-funded data steward within a research institute and a non-governmental organization. A newly developed 12-module "data stewardship for openwashdata" training programme will develop data management strategies and help our partners to institutionalize ORD practices long-term within their organizations. We will also further invest in the openwashdata publishing arm of the community by increasing the FAIRness of our data, critically analyzing how to better address the details of all four components of FAIR: Findability, Accessibility, Interoperability, and Reusability. We will also set up a governance structure and sounding board to ensure the long-term sustainability of the community. Through our activities and active open communication channels, we expect to create a demand for data stewardship in the WASH sector, assess the role of data stewards, and define a profile for them more generally.
Category
Institutions
Data type
Field
Researchers
Abstract
This project aims to establish an open database and processing pipeline for ALS (Airborne Laser Scanner) data, complementing the imaging spectrometry platform, to promote transparency, accessibility, reproducibility and innovation in sensor co-registration and data analysis, ultimately benefiting scientific communities worldwide.
Category
Institutions
Data type
Field
Researchers
Abstract
Our aim is to develop practices and tools to trace and share the state of shared inventory items in a multi-user laboratory, linking this information to the personal Electronic Lab Notebook (ELN) of each user. With this practice, we seek to fill a big blank spot in the tracking of mutable laboratory information management system (LIMS) objects. A tool akin to a digital inventory logbook will be developed as an openBIS extension to support a wide range of experimental laboratories with variable types of measurement equipment and inventory items. This inventory logbook will complement the native openBIS ELN-LIMS, with a focus on measurement equipment and inventory management. The implementation will prioritize generality and user-friendliness to minimize adaptation barriers and promote the dissemination of this ORD practice.
Category
Institutions
Data type
Field
Researchers
Abstract
The Nexus-e platform is a powerful tool for assessing the impacts of potential pathways for the Swiss energy system. This project aims to open-source both the model input data and code, adhering to FAIR Open Research Data and Open Science principles. The objective is to develop a modular framework allowing quick integration of new models and enabling easy execution of existing ones. Currently featuring five models, Nexus-e facilitates interdisciplinary research and policy analysis in energy systems. The project seeks to streamline the process of adding new models through a plugin architecture and implementing an API for standardized interaction with scenarios. By fostering collaboration and providing access to diverse input data, the project aims to enhance the usability and impact of Nexus-e within and beyond the ETH community, thereby advancing energy system research and supporting policy development.
Category
Institutions
Data type
Field
Researchers
Abstract
To enhance the reproducibility of research practices within the urban drainage community, particularly the interpretability and reusability of both data and code, it is imperative to better document the origins of open datasets and the outcomes of workflows and models that use these datasets. Our goal is twofold: develop prototype Open Research Data (ORD) tools with the Swiss Data Science Center and assess their effectiveness with the international urban drainage community. We will explore whether RENKU can serve as a comprehensive platform for this, given features such as collaborative workflow management, version control, and integration with data science tools, which promote reproducibility and efficient collaboration among researchers. Planned use cases include i) individual researchers sharing results, ii) benchmarking rainfall-runoff models in our department, and iii) distributed groups providing pre-processed datasets with full provenance information. We will start by enhancing the FAIRness of a 20-year-old dataset on sewer mixing. Additionally, we will evaluate different EPA-SWMM model implementations and engage the international urban drainage community in ORD practices. This initiative could establish a cornerstone for data sharing in urban drainage, extending beyond Eawag's research.
Category
Institutions
Data type
Field
Researchers
Abstract
"OpenPulse: Assessing Open Science community metrics for Open Source Software" aims to redefine the measurement of Open Science by focusing on Open Source Software (OSS) from EPFL. It addresses the limitations of current Open Science metrics, which primarily track Open Access publications, by proposing a new framework to evaluate the development of OSS and its community impact. This involves developing a tool, OpenPulse, to monitor OSS activities, establish reliable OSS datasets, and create visualizations for real-time impact assessment. The project emphasizes collaboration, community engagement, and the development of discipline-specific dashboards, aiming to foster a more inclusive and comprehensive understanding of Open Science's impact beyond traditional publications.
Category
Institutions
Data type
Field
Researchers
Abstract
The MED-WEAR project addresses the absence of data interoperability in wearable devices in clinical practice and research: each manufacturer, service provider and researcher implements their own solutions for capturing, storing, and formatting data in each study. In response to this challenge, we propose the development of a Wearable API (MED-WEAR), which provides a standardised framework between medical wearables and robotic devices to collect data in multiple research and clinical facilities, whilst reducing workload and costs by applying FAIR principles in clinical research with these devices. We aim to establish standardised data collection in the ETH domain and beyond, thereby empowering the healthcare research community to streamline data collection with wearables, foster innovation, and define open standards. MED-WEAR impacts clinical and data science research by enabling lifelogging for individuals and promoting transparency in patient monitoring across rehabilitation laboratories. Engaging with the Swiss Neuro Rehab Initiative and collaborating across ETH RESC, RELAB, SMS lab, SCAI lab and DART lab at LLUI, the project establishes an interoperable platform, with the potential to provide a new ORD service within the ETH domain.
Category
Institutions
Data type
Field
Researchers
Abstract
The OPEN-ACTRIS project aims to explore and build up FAIR data chain standards and strategies for atmospheric observations collected in Switzerland as part of the Aerosol, Clouds and Trace Gases Research Infrastructure (ACTRIS). ACTRIS is a pan-European network that aims to deepen our understanding of climate change and air pollution by producing high-quality data on short-lived atmospheric constituents. ACTRIS-Switzerland is the multi-institutional Swiss node of the network. ACTRIS has a comprehensive vision for FAIR data that covers all stages of the research data life cycle through the definition of data levels covering raw measurements, processed data, and elaborated data products. The OPEN-ACTRIS project aims to implement these ORD concepts in the ETH domain and to explore best ORD practice within ACTRIS-Switzerland by combining existing tools and infrastructure from the ETH domain and the ACTRIS community. We will achieve this by building FAIR data chains for the aerosol observations continuously recorded at field measurement stations on the Jungfraujoch and in Payerne, and for the aerosol and gas measurements performed on a campaign basis in the PSI Atmospheric Chemistry Simulation Chambers.
Category
Institutions
Data type
Field
Researchers
Abstract
Experimental data from chemical synthesis are complex, rarely openly available in a computational format, and mostly biased toward positive results, which represent a minority of cases. This situation strongly impacts the development of efficient predictive models in chemistry, drug discovery, energy storage or generation, and new materials development. To improve data quality and availability for the chemistry community, the Swiss Cat+ West Hub and SDSC, with support from the SWITCH Foundation, propose to jointly develop HT-CHEMBORD. This project combines three elements: a robust, open, global ontology for chemical synthesis, based on high-quality FAIR-compliant experimental data generated initially in the Swiss Cat+ hubs and later, through future collaborations, by other validated high-throughput laboratories; an open-access database with complete data integrity management; and a set of query tools allowing the community of chemists and data scientists to explore the unique dataset offered. Exploratory work on the data validation strategy, with a view to extending it to external data providers, is already planned in the current project.
Category
Institutions
Data type
Field
Researchers
Abstract
The exponential growth of biomedical sequencing data has led to considerable challenges and open problems for genomic data management, leading to limitations in accessing and utilising this vast resource efficiently. The Sequence Read Archive (SRA) exemplifies the scale of available data, housing over 40 Petabases. However, the current indexing methods, which rely on metadata rather than full-text searches, significantly limit the potential for research and discovery. The Biomedical Informatics lab at ETH Zürich has developed a computational framework capable of indexing whole sequence repositories on a petabyte scale, compressing data significantly while maintaining search efficiency. This framework, embodied in the MetaGraph software platform, represents a major technological advancement, enabling precise, large-scale genomic data analysis. The lab has applied this framework to over 4 PB of raw sequencing data, freely sharing the generated indexes to promote open research. The proposal aims to establish MetaGraph as a leading open research data tool and to build a vibrant user community around it, enhancing accessibility and utility of genomic data. This initiative seeks to break down barriers to data access, fostering a more open, collaborative research environment, and expanding the scope of MetaGraph beyond DNA to include non-DNA repositories, addressing privacy and ethical considerations in data accessibility, and contributing to the democratisation of genomic data.