Drivers

EOSC-ENTRUST has four drivers, or use cases, that are prototypical of the federated, multinational use of Trusted Research Environments (TREs) in research practice across scientific domains and user communities.

Each driver highlights and validates specific aspects of TRE functionality and represents a different sector (e.g. health, social sciences, public/private collaborations) that requires secure and reliable sharing and analysis of sensitive data.

Driver 1

Demonstrate scalability and interoperability of the blueprint in a high data volume (genomics) network of local TRE nodes distributed across multiple countries.

Driver 2

Demonstrate the applicability of the blueprint across very heterogeneous scientific domains, e.g. social and life sciences.

Driver 3

Demonstrate the potential of the blueprint to bridge the traditionally separate data domains of clinical trials and real-world health data in one solution architecture.

Driver 4

Demonstrate the applicability of the blueprint beyond the academic context in a public-private network with SME providers.

Driver 1

Partners

EMBL, CRG, BSC, CSC

Challenge

Genetic data generated in a healthcare context is subject to more stringent information governance than research data and must adhere to national data regulations and laws. Using TREs is an accepted model for enabling secondary reuse of this data for research and is suitable for very large volumes of data. Within the European Genomic Data Infrastructure (GDI) and the Federated EGA (FEGA) projects, over twenty countries or nodes across Europe will be launching TREs for genomics data.

Genomic data is considered personal data under the GDPR and hence is never anonymous. Real-world examples of cross-border TRE use that cover the ethical and legal aspects, together with the technical aspects of pseudonymity and anonymity by aggregation, are lacking. TRE services for genetic data, which can be reused for other data types at volume, will require the connection of HPC resources.

Driver approach

This driver will reuse analysis workflows deposited at WorkflowHub to process data served by FEGA members using Global Alliance for Genomics and Health (GA4GH) and other established standards. The development and testing phases will use synthetic, non-sensitive data, followed by validation with sensitive data. The driver partners (BSC, CRG, CSC, EMBL-EBI) have extensive experience of sharing human genetic data and manage the federated model for discovery and access via FEGA, which forms a core supporting infrastructure for the GDI.

The partners are leaders in the development and implementation of global standards for discovery and access of genomic data via the GA4GH. The driver partners (including many of the core project partners through ELIXIR) will take the baseline legal, operational and technical developments of the Federated EGA, and use them as input to the EOSC-ENTRUST architecture. Developments from EOSC-ENTRUST will be fed back to the GDI, to balance the domain-specific needs with interdisciplinary, EOSC-based standards.

Expected impact

The GDI nodes will be provided with a working use case showing how to become federated TRE providers, covering all legal, ethical and technical considerations. Technical solutions for managing data at volume, leveraging existing standards, will be provided. The federation of TREs can easily be extended to further nodes as they develop the infrastructure capabilities.

Driver 2

Partners

CESSDA, GESIS, UKDS, TARKI

Challenge

In many countries, data sharing in the social sciences is carried out in national data centres of excellence with mature TREs. Those TREs develop standards for interoperability to enable collaborative work using multiple TRE capabilities. However, these efforts face obstacles such as:

(1) Common terminology - many countries have developed their own administrative and social science database environments, which reflect the characteristics of data management in that country. To make these different approaches comparable, a common set of definitions should be established.

(2) Legislative and governance challenges - differences arising from the legal environment, the conditions of use allowed by the data owner, and the regulations set by the institutional systems of public and private organisations need to be harmonised.

Driver approach

CESSDA and affiliated national entities will provide their TRE capabilities and expertise to inform specifications (legal, operational and technical) and to identify and promote common standards to enable sensitive data sharing. A solid foundation has been laid by past and present trans-national projects, e.g. the Social Science and Humanities Open Cloud (WP5 of SSHOC), the International Data Access Network (IDAN) and the International Secure Data Facility Professionals Network (ISDFPN).

Collectively, these projects developed a framework for implementing trans-national data sharing agreements and enabled new remote access connections. The aim is now to expand and test the framework with a new secure data service and look at how it can be extended to cross-domain data sharing and the sharing of novel forms of data, such as digital behavioural data or qualitative data.

Expected impact

CESSDA as an organisation will provide greater interoperability in social science resources, allowing collaborative projects at international scale and across domains. The International Secure Data Facility Professionals Network (ISDFPN), set up under SSHOC and now jointly run by the UK Data Service and GESIS, will provide improved, sustainable expert support for data professionals working in TREs through additional resources.

Further, this network, which enables international dissemination of critical data sharing skills and knowledge, will be able to realise its aim of expanding further.

Driver 3

Partners

ECRIN, UiO, HDR UK, UNIVDUN

Challenge

Clinical research data sharing and reuse is increasingly regarded as a key requirement for accelerating scientific discoveries and improving healthcare provision. When the scientific community has access to the Individual Participant Data (IPD) that underlie research results, new analyses can be performed by other researchers with different ideas and expertise, and data can be pooled for meta-analysis to increase statistical power. Besides the IPD, other clinical research data sources should be made available for sharing (e.g., research protocols, clinical study reports, statistical analysis plans) to enable a full understanding of any dataset.

Despite a “call for action” from major stakeholders, the percentage of clinical research studies that make their IPD and associated files available for re-use is low, owing to ethical, legal, organisational, semantic and technical barriers, among others. In view of the European Health Data Space (EHDS), the secondary use of routinely collected health data for research purposes is gaining momentum in Europe, and several national health data hubs are being developed. Moreover, routinely collected health data (e.g., electronic health records, medical claims) is a vast data source that has the potential to inform the design and conduct of clinical research, especially with regard to patient recruitment, stratification, and outcome assessment.

Driver approach

To address this challenge, the European Clinical Research Infrastructure Network (ECRIN) has partnered with the University of Oslo to design and operate a TRE adapted to the specific needs of the clinical research community, building upon TSD, a sensitive data infrastructure in Norway, and ensuring compliance with European regulations and the GDPR. Driver 3 will provide the EOSC-ENTRUST TRE network with minimum requirements (ethical, legal, organisational, semantic and technical) for clinical research data. The current model developed by ECRIN and TSD will be validated against the TRE blueprint, and its policies and procedures will be aligned with the outputs of WPs 5-6 to enhance interoperability among different TREs for clinical research.

EOSC-ENTRUST will provide input on the generation of a data quality stamp and the extension of the analytical tools within the TRE used by ECRIN and the University of Oslo. HDR UK will contribute to developing recommendations on minimum requirements for further interoperability with their routinely collected health data. There will also be an opportunity to explore use cases involving research data linked to routinely collected data, and to explore data governance and access solutions that enable the use of linked data.

Expected impact

The adoption of the EOSC-ENTRUST architectural blueprint by TREs will provide the necessary requirements (ethical, legal, organisational, semantic and technical) to enable clinical research data sharing and re-use in a secure and legally and ethically compliant manner through the development of a common federated model sharing common workflows. In addition to improving the conditions and levels of clinical research data sharing, the involvement of HDR UK will enable demonstration of feasibility in:

● Using routine health data for supporting the planning and performance of data-enabled clinical trials.

● Developing predictive models and increasing knowledge about diagnosis, treatment and outcome by combining routine health data, observational studies and clinical trial data.

● Supporting systematic/scoping reviews and development of policy measures by combining different data types (routine health data, clinical trials, observational studies, cohorts, registers).

All material and resources produced through this work will be openly shared to inform policy and requirements to improve use of data for clinical research in a trustworthy manner.

Driver 4

Partners

Turku UAS, CSC and Sigma2

Challenge

Digital transformation has swept through industry, and huge amounts of data requiring protection are currently gathered by companies and SMEs (e.g. through wearables). For instance, the wellbeing industry can collect data on its customers’ health, sport activities and daily habits, and smart city studies can be used by construction companies and research institutions to design optimal housing solutions on the basis of individuals' data and habits.

These data are of interest to customers and companies, but also for advancing research and improving national wellbeing. However, practices, protocols, and digital solutions to enable use of data requiring protection (to comply with the GDPR, national privacy-protecting legislation and business confidentiality) in the private sector are often lacking. In addition, efficient collaboration between the private sector and research institutions requires secure digital workspaces and shared data management best practice.

Driver approach

Driver 4 will focus on research and innovation projects led by higher education and research institutions that process data from their partners in the private sector. Driver partners (Turku UAS, CSC and Sigma2) will elaborate a set of requirements for a technical platform that can support research institutions and organisations from the private sector to securely and jointly process data requiring protection (either for business confidentiality or GDPR and other privacy protection regulations). Their existing sensitive data services will be tested against the identified requirements to find the optimal blueprint allowing the management of data between public research and the private sector while still ensuring protection and security.

In this driver, Turku UAS will lead the requirements gathering from the community within its collaboration network and provide the test data sets and research questions for data analysis. If required, Driver 4 will modify the TRE at partner CSC to meet the requirements of the public-private research project. Sigma2 will evaluate the proof-of-concept solution and its applicability from a national needs perspective. Turku UAS will validate the applicability of the results for its research network.

Expected impact

The EOSC-ENTRUST approach will be shown to work with private sector data. Input for blueprint and technology development will be provided from the viewpoint of public-private research collaborations, facilitating the use of valuable privately collected data related to research and innovation projects. Applicability of the blueprint for use in public cloud environments will be considered. By contributing to the development of policy briefings and outreach (WP2), the driver will facilitate wider adoption of the blueprint at SMEs and smaller research institutions, enabling a growing ecosystem.