Partners
EMBL, CRG, BSC, CSC
Challenge
Genetic data generated in a healthcare context is subject to more stringent information governance than research data and must adhere to national data regulations and laws. Trusted Research Environments (TREs) are an accepted model for enabling secondary reuse of this data for research and are suitable for very large volumes of data. Within the European Genomic Data Infrastructure (GDI) and Federated EGA (FEGA) projects, more than twenty countries, or nodes, across Europe will be launching TREs for genomic data.
Under GDPR, genomic data is considered personal data and hence can never be treated as anonymous. Real-world examples of cross-border TREs that cover the ethical and legal aspects, together with the technical aspects of pseudonymity and anonymity by aggregation, are lacking. TRE services for genetic data, which can be reused for other data types at volume, will require connection to high-performance computing (HPC) resources.
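As an illustration of what "anonymity by aggregation" can mean in practice, the following is a minimal sketch of small-count suppression over pseudonymised records. It is not a method prescribed by GDI, FEGA or this driver: the record fields, the function name `aggregate_with_suppression` and the threshold of 5 are all assumptions chosen for the example, and suppressing small counts does not by itself guarantee anonymity.

```python
from collections import Counter

# Hypothetical disclosure threshold: groups smaller than this are suppressed.
MIN_COUNT = 5


def aggregate_with_suppression(records, key, min_count=MIN_COUNT):
    """Count records per value of `key`, hiding groups below `min_count`.

    Only aggregate counts are returned; individual pseudonyms never leave
    the function. Suppressed groups are reported as None.
    """
    counts = Counter(record[key] for record in records)
    return {
        value: (n if n >= min_count else None)
        for value, n in counts.items()
    }


# Synthetic, non-sensitive example data (illustrative field names).
records = (
    [{"pseudonym": f"P{i:03d}", "genotype": "A/A"} for i in range(7)]
    + [{"pseudonym": f"P{i:03d}", "genotype": "A/G"} for i in range(7, 10)]
)

summary = aggregate_with_suppression(records, "genotype")
print(summary)
```

In this sketch the group with seven records is released as a count, while the group with three records falls below the assumed threshold and is suppressed.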
Driver approach
This driver will reuse analysis workflows deposited at WorkflowHub to process data served by FEGA members using Global Alliance for Genomics and Health (GA4GH) and other established standards. The development and testing phases will use synthetic, non-sensitive data, and the results will then be validated with sensitive data. The driver partners (BSC, CRG, CSC, EMBL-EBI) have extensive experience of sharing human genetic data and manage the federated model for discovery and access via FEGA, which forms a core supporting infrastructure for the GDI.
The partners are leaders in the development and implementation of global standards for discovery and access of genomic data via GA4GH. The driver partners (including many of the core project partners through ELIXIR) will take the baseline legal, operational and technical developments of the Federated EGA and use them as input to the EOSC-ENTRUST architecture. Developments from EOSC-ENTRUST will be fed back to the GDI to balance domain-specific needs with interdisciplinary, EOSC-based standards.
Expected impact
The GDI nodes will be provided with a working use case showing how to become federated TRE providers, covering all legal, ethical and technical considerations. Technical solutions for managing data at volume that leverage existing standards will also be provided. The federation of TREs can then be easily extended to further nodes as they develop their infrastructure capabilities.