From disparate disciplines to unity in diversity: How the PARTHENOS project has brought European humanities Research Infrastructures together

Since the first ESFRI roadmap in 2006, multiple humanities Research Infrastructures (RIs) have been set up all over the European continent, supporting archaeologists (ARIADNE), linguists (CLARIN-ERIC), Holocaust researchers (EHRI), cultural heritage specialists (IPERION-CH) and others. These examples only scratch the surface of the breadth of research communities that have benefited from close cooperation in the European Research Area. While each field developed discipline-specific services over the years, common themes can also be distinguished. All humanities RIs address, in varying degrees, questions around research data management, the use of standards and the desired interoperability of data across disciplinary boundaries. This paper sheds light on how the cluster project PARTHENOS developed pooled services and shared solutions for its audience of humanities researchers, RI managers and policy makers. At a time when the convergence of existing infrastructure is becoming ever more important, with the construction of a European Open Science Cloud as an audacious, ultimate goal, we hope that our experiences inform future work and provide inspiration on how to exploit synergies in interdisciplinary, transnational, scientific cooperation.


Statement figures
You will find the figures along with their references in the file entitled: IJHAC_proposal_for_article-PARTHENOS-final_draft-figures.docx

Shared challenges in the humanities field
Innovative research is best served by a climate which supports interdisciplinary and transnational approaches. No matter whether someone studies Franconian languages, Anglo-Saxon archaeological remains or World War One photography, to successfully cover an interdisciplinary research question, one needs to be able to combine a wide array of source material. At the same time, a significant amount of research data and archival objects were conceived (and are still, to some extent, confined) within national and disciplinary boundaries, be it transcripts of parliamentary debates, civil records or newspapers. More recently, the phenomenon of proprietary data has started to impose additional technical restrictions.
To address this situation, a global trend towards integration of (cyber)infrastructure is taking place. While early examples of the application of computational methods in the humanities date back to the mid-twentieth century, the amount of support that large-scale digital infrastructures have provided since the start of the twenty-first century has been remarkable and unprecedented. 1 This support takes many different forms and shapes at the local, the national and the international level. Locally, humanities "labs" quickly became ubiquitous as spaces where research questions, technical expertise and physical infrastructure entwine. Such environments can be found both in academia and in the GLAM (Galleries, Libraries, Archives and Museums) field. 2 At the national level, opportunities to combine data, tools and services are also exploited.
In the United States, the seminal policy document Our cultural commonwealth (…) gave nationwide impetus to the digital advancement of the humanities from 2006 onwards. 3 This article is written from the perspective of a transnational Research Infrastructure (RI). In Europe, the task of coordinating the integration of scientific knowledge and expertise was assigned to the European Strategy Forum for Research Infrastructures (ESFRI). The establishment of this body was envisioned as a first step in the creation of a European Research Area. 4 In the first ESFRI roadmap, released in the same year as Our cultural commonwealth (…), the challenge laid out for the humanities was defined as follows: 'The present major task is (…) to create pan-European infrastructural systems that are needed by the social sciences and humanities to utilise the vast amount of data and information (…) in Europe'. 5 The concept of an 'RI' in this article stems from this policy definition.
Thirteen years later, it is fair to say that European RIs have been prominent in the humanities, and they have only grown in size and relevance. They pool data and expertise, exchange knowledge and, consequently, enable innovation in their fields. Between 2006 and now, a significant number of projects were initiated. The first ESFRI roadmap already mentioned DARIAH and CLARIN, 6 respectively supporting the humanities at large and language-related studies. CENDARI encouraged research in its two pilot historical periods (the Middle Ages and the First World War) and made archival descriptions from all over Europe available in one archival directory. 7 Dispersed Holocaust sources were brought together in one portal by EHRI while, at the same time, the RI built a 'human network', fostering cooperation among Holocaust researchers. 8 Archaeologists organised themselves in ARIADNE, and IPERION-CH opened up services and facilities to those focussing on the restoration and conservation of cultural heritage. 9

From shared challenges to joint solutions
While this success of humanities RIs resulted in a wealth of aggregated research assets, it was felt that the risk of creating silos deserved attention. This was demonstrated in the results of a survey among 110 research institutions across Europe, which highlighted that 'The major challenge of a collaborative and connective pan-European research programme will be to harmonise digital research practices by drawing together the numerous national and, increasingly, multilateral digital research initiatives'. 10 At the same time, it became apparent that different disciplines struggled with similar challenges. Researchers from all of the fields above experienced at least some difficulties in distinguishing which policies apply to them, choosing the right standards for structuring and storing data, and making sure these data are interoperable by design.
To develop solutions to both this quantitative 'digital data deluge' and to these shared challenges, the project PARTHENOS (which stands for: Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies) was conceived. 11 The project actively addressed these issues by developing products and services for the study of history; language-related studies; archaeology, heritage & applied disciplines; and, to a lesser degree, the social sciences. Examples of project output are a research scenario-guided tool to introduce humanities scholars to standards, a project-specific semantic mapping entitled the PARTHENOS Entities Model and a set of guidelines to make research data reusable.
As RIs are often built either at a disciplinary level (such as CLARIN, DARIAH etc.) or at a fundamental level (Géant, EGI, OpenAire), opportunities for efficiencies and knowledge sharing at the meso-level were at risk of being missed in this emergent landscape. This paper provides insight into the challenges that underlie a cluster project which brought humanities RIs together.
After briefly introducing the project and its requirements-based approach, this paper demonstrates in turn how PARTHENOS aimed to create synergies around policies, standards, interoperability, training opportunities and project communication. The authors will conclude this article by reflecting on both the successes and the challenges that PARTHENOS has faced in the light of its innovative potential, its transnational and interdisciplinary ambitions and the sustainability of its results in the wider digital humanities ecosystem.

The PARTHENOS project
From its inception in 2015, the PARTHENOS project brought together major European integrating initiatives. Besides the durable infrastructures CLARIN and DARIAH, it also included the projects CENDARI, ARIADNE, EHRI and IPERION-CH. The collaboration took place within a so-called cluster scheme, introduced by the EU's Horizon 2020 programme. Although the project ended in October 2019, stewardship of the assets that were created during the project was divided among partners, safeguarding their sustainability for the future.
It deserves mentioning that PARTHENOS' role as an integrating project for the humanities is not unique. The convergence of methodology and infrastructure happens continuously and at different levels, formally and informally. One example of integration is CLARIAH, a merger of CLARIN and DARIAH at the governmental level. This amalgamation can be found in different European countries, such as the Netherlands, Germany and Austria. As CLARIN and DARIAH were both partners in PARTHENOS, this goes to show that different modalities of integration do not have to be mutually exclusive, but can take place in harmonious conjunction.

An approach based on the needs of communities
In order to build bridges between disciplines, PARTHENOS' products and services were designed to cover the needs of all the fields involved. Given the multi-disciplinary constellation of the project, this exercise was not straightforward. A researcher in the field of linguistics will often require different information than a policy- or decision-maker, and a digital archaeology teacher's requirements will be different from those of a technical specialist in metadata interoperability.
Given the active involvement of existing RIs in the social sciences, humanities and cultural heritage disciplines, the project could draw up an aggregated inventory from existing reports relatively quickly around five themes: data policies; standardisation; interoperability of data, services and tools; education and training; and communication needs. To make sure that the developed solutions would be both practical and applicable, the project decided to rely on real-world scenarios. Consistency was ensured by applying a use case-based approach as described by American computer scientist Alistair Cockburn. 12 According to this methodology, a use case description should consist of specific elements, such as: a descriptive statement of the goal; preconditions, describing what is necessary for the realisation of this goal; and the definition of a successful outcome.
During the development process, the Deming cycle (Plan-Do-Check-Act) was adhered to. 13 This means that by establishing an ongoing dialogue between the group of people that gathered the requirements and the implementation teams, the project continually verified whether the needs of the field were correctly understood. This was tested by presenting users with beta versions and showcases of the tools under development.
An example of a use case in the field of standards is: 'Build a corpus of linguistic data for analysis'. 14 While this research scenario would be used primarily by linguists, the creation of a text corpus could be just as useful to a historian who works with a large body of text. This is where the added value of cross-fertilisation between disciplines becomes apparent. By facilitating the exchange of digital methods that are widely applicable, scholars are encouraged to look at those of neighbouring disciplines.
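The three elements of a use case description named above (goal, preconditions, successful outcome) can be pictured as a small record. The following is a minimal sketch in Python, not the template PARTHENOS actually used: the class name `UseCase`, the method `is_complete` and the example preconditions and outcome are hypothetical and serve only as illustration. Only the goal statement is taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical record holding the three elements named in the text."""

    goal: str                       # descriptive statement of the goal
    preconditions: tuple[str, ...]  # what is necessary to realise the goal
    success_outcome: str            # definition of a successful outcome

    def is_complete(self) -> bool:
        """Return True when all three elements have been filled in."""
        return (
            bool(self.goal.strip())
            and len(self.preconditions) > 0
            and bool(self.success_outcome.strip())
        )


# The goal is quoted from the article; the other values are invented examples.
corpus_case = UseCase(
    goal="Build a corpus of linguistic data for analysis",
    preconditions=(
        "Source texts are available in digital form",    # hypothetical
        "An encoding standard has been agreed upon",     # hypothetical
    ),
    success_outcome="A documented corpus that can be queried and reused",  # hypothetical
)

print(corpus_case.is_complete())
```

Checking completeness in this way mirrors the consistency the use case-based approach is meant to provide: a scenario without preconditions or without a defined outcome would be flagged before it reaches an implementation team.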

The need for common policies and implementation strategies
One of the obstacles PARTHENOS identified is the scattered nature of research data management policies. Best practices are often developed within disciplinary boundaries, as communities are best aware of their specific needs and the distinctive character of the research assets they work with. An added benefit is that adoption is more easily encouraged within rather than across fields. While these are indeed strengths of discipline-specific guidelines, they also increase the risk of tunnel vision, as researchers from the same field are more likely to interpret data by using the same prior knowledge.
The humanities disciplines are sometimes considered part of the 'long tail of science'. This means that data-centric research is perceived as less relevant for these fields than, for example, physics and biology. The authors of this article consider this position to be only part of the story. While data are increasingly important in history, archaeology and related fields, the tools designed for big data research are often not appropriate for humanities data sets, which tend to be smaller and more variable in format and structure. Moreover, humanities research data does not yet have a well-established tradition of digital publication and is therefore often neither discoverable nor accessible. A lack of awareness of best practices in data management lies at the heart of this risk of creating 'dark data' (a qualification that hints at their lack of findability). 15 On the contrary, shared best practices enable researchers from different disciplines to work from a point of mutual understanding. Also, researchers can more easily reuse each other's results, allowing them to build on each other's observations. It is for these reasons that PARTHENOS decided to make the current, distributed and previously unmapped landscape of research policies more comprehensible and accessible.
The following three products can offer guidance in making research data FAIR: 16
1.) The PARTHENOS Policy Wizard. An interface which allows users to find information about policies which are relevant to their discipline and tasks through intelligent categorisation.
2.) The PARTHENOS Data Management Plan (DMP) template. This template builds on the Horizon 2020 template while enriching and tailoring it with specifications from humanities disciplines. Its content was derived from a survey carried out among the consortium's experts and describes the life cycle of data creation, collection, archiving and preservation.
3.) The PARTHENOS Guidelines. The 'PARTHENOS Guidelines to FAIRify data management and make data reusable' are offered as common recommendations, aimed at building bridges between different, although tightly interrelated, fields and stakeholders within the humanities. 17

The need for information on standards
Like policies, standards also constitute an important form of consensus on how humanities research data can best be processed and stored. As a key element of interoperability and reusability, they play a central role in any field. Contrary to policies, standards in themselves are methodological or technical specifications that are not legally binding. They also share the following three characteristics. Standards are: 1.) the result of a consensus-building activity; 2.) publicly available; and 3.) maintained regularly.
Simultaneously, through the mindset and practices standards create, they also build a common cultural background among the communities that adopt them, increasing the probability and feasibility of future collaboration.
The humanities field is no stranger to standards. The Text Encoding Initiative (TEI) guidelines have garnered significant support since their inception in various scholarly domains, ranging from history to literary studies; ISO comprises a specific technical committee dedicated to language resources, which has provided various standards for the representation and annotation of linguistic content; 18 the natural interaction between scholars and cultural heritage institutions has made standards such as EAD (Encoded Archival Description) an integral part of their work; 19 and the semantic representation of cultural heritage data has been standardised in CIDOC CRM (ISO 21127:2006), a high-level compatibility framework. 20
Whereas there are scholarly groups that have already been active in defining and using standards, the authors of this article observed that, especially among newcomers to digital methods, there is a lack of precise knowledge about standardisation. 21 To overcome this, PARTHENOS devised the Standardization Survival Kit (SSK), a tool that illustrates both the importance of standards in the research process and the usefulness of designing an online environment that allows researchers to access relevant reference material. Digital specialists from the various PARTHENOS research communities designed scenarios that cover all types of scholarly domains and methodologies, such as the management of archaeological field surveys and the usage of laser techniques for conservation practices in heritage science. 22

The need for interoperability
As a cluster project, PARTHENOS was in a unique position to tackle the challenge of integrating knowledge through interoperable data. The results of a recent international survey among archivists show that interoperability was considered important almost unanimously, as respondents believe it enhances the findability of objects, that it can make the mutual relations between data more visible and that it allows archives to become part of a "wider context and (…) information flow". 23
Information management for RIs is both an epistemological and a technical challenge. The success of digital infrastructure depends on its fitness to facilitate researchers in their collaborative development of knowledge. 24 One could argue that creating an RI involves building a community that as yet only partially exists towards a goal that as yet has not been fully understood. Therefore, the information integration task is mishandled if it is reduced to the question of which common system to adopt or what standard to enforce. Such goals can only be reached by mutual agreement.
In this light, PARTHENOS created a conceptual model around RI management itself, which models datasets, software, services, projects and actors and, most importantly, the contextual relations between them. This conceptual model, the PARTHENOS Entities Model (PEM), provides the means to represent information on research assets in an accurate yet overarching way. It provides possibilities to classify services (such as hosting, curation and e-services) and proposes a distinction between volatile and persistent digital objects, which makes the evolution of software and datasets visible. In line with the need for the model to be non-prescriptive, as described above, the framework does not impose a form of documentation, but provides a semantic model which allows the translation of existing data about research assets into a common representation. The model is aligned to CIDOC CRM to support interoperability with a wide variety of contemporary and future datasets.
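To make the idea of typed entities and contextual relations more tangible, the following is a minimal sketch in Python under stated assumptions. It is not the PEM itself: the names `Kind`, `Entity`, `Relation` and `context_of`, the identifiers and the relation labels are all hypothetical, and none of them correspond to actual PEM classes or CIDOC CRM properties. The sketch only illustrates the distinction between volatile and persistent digital objects and the way relations place an asset in context.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Entity types mentioned in the text (labels are illustrative only)."""

    DATASET = "dataset"
    SOFTWARE = "software"
    SERVICE = "service"
    PROJECT = "project"
    ACTOR = "actor"


@dataclass(frozen=True)
class Entity:
    identifier: str
    kind: Kind
    persistent: bool = False  # False: volatile object; True: persistent object


@dataclass(frozen=True)
class Relation:
    subject: str    # identifier of the entity being described
    predicate: str  # hypothetical label, not a PEM or CIDOC CRM property
    target: str     # identifier of the related entity


def context_of(identifier: str, relations: list[Relation]) -> list[tuple[str, str]]:
    """Return (predicate, target) pairs that describe one entity's context."""
    return [(r.predicate, r.target) for r in relations if r.subject == identifier]


# Hypothetical records: a working dataset, its fixed snapshot, a service, an actor.
entities = [
    Entity("dataset:survey-draft", Kind.DATASET, persistent=False),
    Entity("dataset:survey-v1", Kind.DATASET, persistent=True),
    Entity("service:hosting", Kind.SERVICE),
    Entity("actor:archive", Kind.ACTOR),
]
relations = [
    Relation("dataset:survey-v1", "is_snapshot_of", "dataset:survey-draft"),
    Relation("dataset:survey-v1", "is_hosted_by", "service:hosting"),
    Relation("service:hosting", "is_provided_by", "actor:archive"),
]

print(context_of("dataset:survey-v1", relations))
```

Keeping relations as separate records, rather than as fields on each entity, reflects the non-prescriptive intent described above: existing descriptions of research assets can be translated into such statements without changing how the source documentation is structured.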

Skills, professional development and advancement
As described above, PARTHENOS did not only cover the technical aspects of RIs. The human network sustaining technical infrastructure and underlying data, as well as the act of making humanities researchers more aware of the potential of the computational methods that can be applied to analyse them, were considered just as important. 25 This is why a part of the project specifically focussed on offering the means to learn about both digital humanities research and the world of RIs.
Since the first ESFRI roadmap in 2006, there has not only been a rise in the coordinated development of RIs in Europe. Significant changes have also been taking place elsewhere in the wider research ecosystem. Researcher careers have become, if anything, more precarious and less able to follow clear, pre-determined pathways. In the United States, this phenomenon became known as 'Alternate Academy,' or 'Alt-Ac', 26 and while it is widespread as a phenomenon, the discourse around it has largely risen out of the digital humanities, where interdisciplinary, applied and collaborative approaches open up wider perspectives than the ones which might be found in established disciplinary approaches.
Accessing these opportunities, however, requires a different perspective, different networks and different skills than are normally provided by higher education institutions. Digital RIs are optimised for sustained work in teams, for creation rather than exploration, for the flexibility to harness technologies, policies and processes that are themselves still in development. This shift in requirement is evocative of how Rockwell and Sinclair describe the challenge of DH pedagogy, with RIs representing a rethinking of teaching needs from the most fundamental level: 'One can think through a digital humanities curriculum in three ways. One can ask what should be the intellectual content of a program and parse it up into courses; one can imagine the skills taught in a program and ensure that they are covered; or one can ensure that the acculturation and professionalization that takes place in the learning community is relevant to the students'. 27 As infrastructures are inherently committed to developing this third path, they are optimised for sustained work in teams, for creation rather than exploration, for the flexibility to harness technologies, policies and processes that are themselves still in development.
The realisation of this model for arts and humanities training has been a key component of the PARTHENOS project, which approached this challenge through three mechanisms. The first of these is the PARTHENOS On-Line Training Suite, a collection of Open Educational Resources (OERs) developed by and for the infrastructure community. 28 Unlike other peer platforms, such as DARIAH Teach, 29 or the CLARIN VideoLectures portal, 30 the PARTHENOS Suite contains materials presented outside of formal curricula which reflect the core requirements and values within infrastructural work. This focus on the collaborative and integrative makes the materials in the Training Suite unique, and applicable to both independent learners and educators looking to expand their knowledge of such issues as what RIs are, how they are managed, how to understand and manage humanities data, what collaboration means for the community, and other related issues. The organisation of PARTHENOS webinars allowed for both additional learning opportunities and the creation of new in-depth content for the Training Suite through their recordings.
On-line training alone is, of course, a blunt instrument, so PARTHENOS also engaged with two other modes by which researchers hone their skills and build their networks in the digital age. The first of these is an analysis of Transnational Access, long a feature of European RIs, but (in its original policy definition) not always a comfortable match with organisations that may focus on virtual access. Given the importance of problem-focused, contextualised development of skills for the digital humanities, however, it is critical that we understand how the virtual sits alongside the physical access, in particular for those researchers who may be more advanced in their careers, or at a turning point in their research project or approach. 31 Secondly, PARTHENOS has partly closed the gap between formal education programmes, such as those in universities, and the knowledge creation and transfer modes of RIs through a co-creation and exchange of curricular models with some of our university partners. Through these modes of engagement, the project aspired to create a more fluid transfer of skills and people between these two essential poles in arts and humanities research.

Communication, dissemination and outreach
Increasingly, researchers are encouraged, and often obliged, to include a communication and dissemination plan in their proposals. Both legal frameworks and moral appeals inspired a significant increase in the creation of open digital research data. The potential for research which lies in this wealth of data is unequivocal. Scholars, however, have indicated that the pluriformity in the way these data are disseminated increasingly creates 'disaggregated traditional scientific output' within an already fragmented communication system. 32
In terms of dissemination, the challenge PARTHENOS faced was twofold. The first one was embedded in the design of the project itself, as the project not only built, but also opened up an ecosystem in which the tools and services described throughout this paper provide the envisioned benefits of integration and interoperability. By doing so, the project aspired to provide the means to approach this diversity in humanities research data and to deal with its tremendous pluriformity, thus providing an answer to fragmented dissemination systems.
Secondly, PARTHENOS itself needed a deliberate dissemination strategy for its own output to be discovered and used by scientific and professional communities. These two challenges existed by no means in isolation. The success of PARTHENOS' outreach activities determined to a large extent whether the project would reduce 'complexity' (as Schroeder, Fry and de Beer describe the humanities field's diverse output) or add to the very problem.
A recurring concern around big projects is their 'one size fits all' approach. PARTHENOS, however, was aware that both the information needs and the ways in which different stakeholders are best addressed differ per target group. This is why, in the PARTHENOS Communication Plan, different stakeholder groups were defined, allowing the project to match communication channels accordingly. However, these roles were never envisioned as a straitjacket. Rather, 'they merely exist to highlight the heterogeneous nature of the PARTHENOS' stakeholders, and to emphasise the need for a tailored approach to communication and dissemination, rather than act as a prescriptive classification'. 33 34
For a humanities RI, the direct and the close audience formulated in Figure 1 above could be regarded as evident. While researchers are its most prominent users, GLAM institutions are often important providers of data and expertise. However, as an audience, society at large was considered just as important. Interested individuals outside academia or museums have made important contributions to (digital) humanities research via crowdsourcing events, hackathons or otherwise. Bearing this in mind, all PARTHENOS' outlets (the website, the newsletter, Twitter etc.) and services are open access in order to not raise any institutional barriers. With the open format of communication and dissemination, PARTHENOS pays heed to three main points of criticism of large-scale infrastructures, namely that they cannot successfully address a heterogeneous field, that they are a place of exclusiveness and that they, rather than promoting innovation, are a restrictive force in that RIs are project-centred and enforce standards. Through its open communication strategy, however, PARTHENOS always aspired to be a 'loosely coupled ecosystem of services and activities', rather than a prescriptive force in an ivory tower. 36

Conclusion: a process of constant evolution
One of the first observations made in this article is that the start of the Digital Humanities can be traced back to around the mid-twentieth century. Given the seventy-year timeframe between then and now, it logically follows that the development of the field has been one of steady evolution. This conclusion will focus on how PARTHENOS reflects on its role as a part of that evolution in terms of encouraging transnational and interdisciplinary research and how the sustainability of this impact is expected to last beyond the temporal and financial limitations of the project.
The sustainability of assets created by large-scale infrastructures is by no means a given. As poignantly stated in a CENDARI report on sustainability, "the Digital Humanities landscape is littered with projects that were not sustained by or for their intended user community". 37 The report proposes several recommendations to overcome this situation that were successfully applied during and after the project, such as making sustainability planning an integral part of the project and sharing knowledge across affiliated projects. The involvement of various existing humanities RIs in PARTHENOS was indeed crucial in making sure that the project did not try to reinvent the wheel. Earlier in this paper, the authors explained how the composition of PARTHENOS around existing projects led to a clear vision on the needs of an internationally and disciplinarily diverse group of communities. This not only contributed positively to the project's success in understanding and targeting these requirements, but also saved a lot of time at the start of the project, as PARTHENOS could build on earlier user surveys.
Vice versa, this head start for PARTHENOS went hand in hand with the sustainability of the other projects that preceded it and fed into it. This is not to say that PARTHENOS itself was not also subject to its own disciplinary and geographic limitations. As the consortium consisted of European RIs it was certainly, at least to some degree, subject to its own regional bias. At the same time, partners in PARTHENOS were aware that the challenges addressed by the project require global solutions. For example, the inclusion of best practices from countries outside Europe in the offering of the PARTHENOS Policy Wizard is testament to this global perspective.
Among the best practices the Wizard references, the Alexandria Archive Institute's Guidelines for web-based data publication in archaeology and the Library of Congress' planning document for the sustainability of digital formats, both developed within the United States, can also be found. 38 Admittedly, however, such examples are more the exception than the rule, and despite the project's transnational focus, the question of how to optimally include a truly global perspective will remain relevant for any future European project.
Another challenge the project encountered is the need to strike a delicate balance between being ambitious as a forerunner on one side and being mindful of the current status of the digital humanities field on the other. Earlier in this article, the authors explained how PARTHENOS developed tools to increase the awareness around standards. At the same time, it is also clear that there is more than one reason why certain standards are not always universally accepted and embraced. Some researchers, for instance, might find that existing standards do not (yet) accurately reflect their data. An existing taxonomy might not cover a specific research subject because the field is relatively new, relatively niche or simply never found its way to an existing vocabulary. This illustrates that the acceptance of standards does not necessarily follow a clearly defined path, nor does it happen overnight. This can be demonstrated by the development of the model CIDOC CRM, with which the PARTHENOS Entities Model complies. The number of extensions that have been created over the years for specific fields or practices demonstrates that the model did not originally meet the needs of every individual researcher. CRMba and CRMarchaeo for documenting archaeological excavations and CRMCR for restoration practices are three such extensions. 39 In the case of the first two, however, significant efforts to integrate the extensions have also been made. 40 This illustrates that, firstly, a model such as CIDOC CRM will always be a work in progress. Secondly, these ongoing integration efforts demonstrate how an open, active and devoted multidisciplinary community can continually improve on such a model, working towards true interdisciplinarity along a bottom-up approach. PARTHENOS found that organising public events, workshops and webinars, thereby involving both project-internal and external scholars, was crucial in facilitating this process of co-creation.
As PARTHENOS approached completion, the involvement of the existing RIs was once again essential for sustaining what has been created beyond the project's lifetime. Several examples of how PARTHENOS' project assets were adopted by these partners illustrate their value.

DARIAH migrated the material of the PARTHENOS Training Suite to the DARIAH-Campus, where it not only continues to find an audience, but where it will also remain part of an environment with an active community behind its further development. Moreover, the Campus now also has a wider range of training materials to showcase. The project ARIADNEplus adopted the PARTHENOS Entities Model and will continue to extend the reach of the 'PARTHENOS Guidelines to FAIRify data management and make data reusable' into more languages. At the time of writing, PARTHENOS materials around topics such as research data management and ontologies are still being used in academic curricula, such as King's College London's Master's course in Digital Humanities and the University of Missouri iSchool.
If there is one overarching conclusion that can be drawn from the PARTHENOS project, it is that sustainable impact can only be achieved when its planning is approached holistically. In the diverse landscape that is the digital humanities ecosystem, PARTHENOS never stood on its own. On the contrary, the project not only originated from the existing ecosystem, but also successfully fed back into it, with the PARTHENOS legacy being built upon by the current humanities RIs and their respective communities.