FAIRifying LéXPLORE: enhancing open research data pipelines for Advanced LaKe SciencE
Category
Institutions
Data type
Field
Researchers
Abstract
The LéXPLORE platform (LP) is an innovative open-water infrastructure on Lake Geneva, where multidisciplinary data are acquired at high frequency. Datalakes (DL) is a web-based, open-access data platform that provides seven LP datasets in real time. We have identified two challenges that could limit the FAIR approach to the data. This project aims to address them by: a) enhancing the robustness of the LP data transfer pipelines and b) enhancing DL data quality by prototyping a collaborative QA/QC solution. The first objective is to strengthen the LP data pipeline by decoupling the data gathering functions from the data processing functions. The resulting simpler distribution of responsibilities will facilitate maintenance and secure the system's long-term reliability. The second objective is to design a prototype of a collaborative QA/QC tool. To this end, we will collaborate closely with the LP community by organising two workshops and a QA/QC hackathon. The prototype will allow domain experts to efficiently assess and flag data quality issues, improving the accuracy and reliability of sensor data. The project will involve the operational LP team, the DL core developers, and research software engineers from ENAC-IT4R. The proposed improvements will help to strategically maintain and elevate LP's high scientific impact in the long term. DL will continue to guide the lake research community towards embracing Open Research Data practices.