Four trends shaping the future of research evaluation

Earlier today I gave a keynote presentation at the 10th anniversary LIS-bibliometrics conference. This post is a summary of the argument that I made in my presentation; you can also view the slides.

The evaluation of research operates at a range of levels within the research system, from individuals to comparisons between nations. The methods and approaches used need to be sensitive to the purpose(s) of the evaluation and the level at which it is applied, and the recently proposed SCOPE framework is a useful tool for thinking carefully about these questions.

It is also important to consider how research evaluation could or should evolve in the future. Making predictions is always hard, but one tool is to consider the potential trends and drivers that might bring about change. These drivers can arise either within or outside the research system itself, and might have a range of implications for evaluation. Considering trends and drivers helps us think about possible futures, and the options that we have in shaping them.

In this post, I want to consider four trends that have the potential to influence research evaluation in the future. My thoughts draw on a range of evidence sources, including recent work commissioned by Research England, an international study (pdf) conducted by Elsevier and Ipsos MORI, and early outputs from a Demos project on 'Research 4.0'. I have also been influenced by my general exposure to the research system and literature.

While the four trends I discuss are not the only factors influencing the future of research evaluation, I do believe they are important ones that deserve considered attention.

1. The nature of research outputs is changing

Despite extensive digitisation, the fundamental nature of research outputs has remained much the same for four hundred years. Research is disseminated in writing, either in short or long form (journal articles and book chapters, or books, respectively). There are signs that this is beginning to change, with researchers themselves predicting an increase in the diversity of the output types that they produce. These changes are partly driven by the imperative to reach wider audiences, and the easier availability and distribution of non-text-based media. At the same time, the outputs of research are becoming more openly accessible, with evidence suggesting that more than three-quarters of journal article views will be through open routes by the end of the decade.

One of the key drivers changing research outputs is increased collaboration. The outputs of research increasingly have international co-authorship, with analysis indicating that many nations (including the UK) have overseas authors on more than 50% of their 'national' output. There are also trends towards an increasing number of authors on publications.

Researchers are also starting to take real advantage of the digital medium of publishing, with different components of the research process - text, data, and associated outputs - being published separately as citable and linked entities. The push towards more reproducibility of research is also leading to the idea that data and code used in analysis should be published, and experiments are under way to make the journal article fully executable.

These changes have big implications for research evaluation. Journal articles and books are self-contained entities that are 'finished' at a specific point in time. In the future, research outputs are likely to be much more dynamic, and will need to be considered not in isolation but in the context of other related and linked outputs.

2. Insight from the citation network is increasing in sophistication

Citation data is regularly used as part of research evaluations, notwithstanding its considerable limitations (which I have discussed previously). But many current approaches are essentially limited to counting the citations an article receives, with little attention paid to the network context within which articles sit. The only data from the rest of the network that is routinely considered is the citation counts of supposedly similar articles, in an effort to normalise or contextualise the counts.
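To make the idea of normalising counts concrete, here is a minimal sketch in Python. It assumes that 'similar' means 'same field and same publication year' and that the baseline is a simple mean; real indicators differ in how they define both. The function name, field labels and citation counts are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def normalised_citation_scores(articles):
    """Divide each article's citations by the mean for its (field, year) group.

    `articles` is a list of dicts with keys: id, field, year, citations.
    The grouping rule and the use of the mean are assumptions of this sketch.
    """
    groups = defaultdict(list)
    for article in articles:
        groups[(article["field"], article["year"])].append(article["citations"])

    scores = {}
    for article in articles:
        baseline = mean(groups[(article["field"], article["year"])])
        # A group with no citations at all has no meaningful baseline.
        scores[article["id"]] = article["citations"] / baseline if baseline > 0 else None
    return scores


# Hypothetical data: two fields with very different citation habits.
articles = [
    {"id": "A", "field": "history", "year": 2018, "citations": 4},
    {"id": "B", "field": "history", "year": 2018, "citations": 2},
    {"id": "C", "field": "biology", "year": 2018, "citations": 40},
    {"id": "D", "field": "biology", "year": 2018, "citations": 20},
]
scores = normalised_citation_scores(articles)
```

In this made-up example, articles A and C end up with the same normalised score even though C has ten times as many citations, because each is compared only with its own field. Note that the calculation still uses nothing but counts; the structure of the network plays no part.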

But there is richer information contained in the network relationships and the full text of articles, which is beginning to be exploited. Perhaps the longest-standing of these methods are co-citation and co-authorship analyses, which can be used to investigate dynamic disciplinary groupings and interdisciplinary research (this paper is an example; there is an OA version available).
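As a rough illustration of the simplest form of co-citation analysis, the sketch below counts how often two articles appear together in the same reference list. The paper identifiers and reference lists are hypothetical, and published analyses typically go further, normalising these counts and clustering the resulting network.

```python
from collections import Counter
from itertools import combinations


def co_citation_counts(reference_lists):
    """Count how many citing papers cite each pair of references together.

    `reference_lists` maps a citing paper to the list of papers it cites.
    """
    counts = Counter()
    for refs in reference_lists.values():
        # Sort and de-duplicate so each pair is counted once per citing paper
        # and always appears in the same order.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists for three citing papers.
reference_lists = {
    "P1": ["X", "Y", "Z"],
    "P2": ["X", "Y"],
    "P3": ["Y", "Z"],
}
counts = co_citation_counts(reference_lists)
```

Pairs that are frequently cited together can then be treated as related, which is the basis for mapping groupings of work without relying on predefined subject categories.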

Methods are emerging to combine information about the citation network with analysis of the full text of articles, such as the approach that has been termed 'semantometrics', which I have written about before. More recent work has sought to use information from the citation network to measure how disruptive an article has been, a property that seems to be not necessarily related to its citation count. The article concerned has also been the subject of a previous post.
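For readers who want a feel for how disruption can be read off the citation network, here is a minimal sketch of one published formulation of a disruption index. It may not match the exact method used in the work mentioned above. The idea is to look at later papers and ask whether they cite the focal article alone, or cite it alongside the work it built on. The function name and the toy network are invented, and the sketch assumes that the input contains only the focal article and papers published after it.

```python
def disruption_index(focal, citations):
    """Return a disruption score between -1 and 1 for `focal`.

    `citations` maps each paper to the set of papers it cites. Assumption:
    apart from `focal`, every key is a paper published after `focal`.
    """
    focal_refs = citations.get(focal, set())
    n_i = n_j = n_k = 0
    for paper, refs in citations.items():
        if paper == focal:
            continue
        cites_focal = focal in refs
        cites_focal_refs = bool(refs & focal_refs)
        if cites_focal and not cites_focal_refs:
            n_i += 1  # cites the focal article but none of its references
        elif cites_focal and cites_focal_refs:
            n_j += 1  # cites the focal article and its references
        elif cites_focal_refs:
            n_k += 1  # cites only the focal article's references
    total = n_i + n_j + n_k
    return (n_i - n_j) / total if total else None


# Hypothetical network: F cites R1 and R2; A to D are later papers.
citations = {
    "F": {"R1", "R2"},
    "A": {"F"},
    "B": {"F"},
    "C": {"F", "R1"},
    "D": {"R2"},
}
score = disruption_index("F", citations)
```

Here two later papers cite F alone, one cites F together with its references, and one bypasses F entirely, giving a score of (2 - 1) / 4 = 0.25. Because the score depends on who cites what rather than on how many citations there are, a highly cited article can still score low, which fits the observation that disruptiveness is not necessarily related to citation count.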

Of course, these new approaches come with many of the challenges associated with current citation counting methods, not least the still-poor coverage of some output types, and therefore of certain disciplines, in bibliometric databases. They will no doubt raise their own issues too, but the approaches are likely to become more accessible and mainstream over the coming years. We need to think carefully about how they will fit, if at all, into our future responsible evaluations.

3. There is an increasing focus on the culture of research

Recent years have seen an explosion of interest in issues of research culture, culminating in the recent report from Wellcome. There is general agreement that all is not right in our research organisations, and considerable debate about the source of the problems and the potential solutions. Despite differing views, there is some agreement that the reward, recognition and evaluation approaches used within the research system do not pay enough attention to issues of research culture.

For example, recent analysis of national research evaluations in 20 nations reveals that there is precious little attention paid to the process of research, or issues of research culture. If we are serious about tackling the challenge of improving research culture, expanding the horizons of research evaluation will need to be part of the mix.

I think we need to accept that doing so will involve qualitative evaluation approaches. There are also emerging prospects of automated methods (to examine, for example, statistical robustness), or more quantitative approaches. An example of the latter is interesting work looking at gender representation in research articles from UK universities.

Responding to this challenge will need care. It would be easy to design evaluations of research culture that do more harm than good. Keeping the principles of responsible research evaluation front of mind will help to mitigate this risk.

4. AI has the potential to revolutionise research assessment

Finally, we need to consider if and how the rapidly expanding and increasingly effective tools of Artificial Intelligence (AI) should be applied to research evaluation. The pace at which AI tools are increasing in power is dramatic, whether the AI is winning complex games of strategy, predicting materials with specific properties, or designing new antibiotics.

There are active experiments in the use of AI for tasks in research evaluation. Microsoft uses an algorithm to determine the 'saliency' of sources in its Microsoft Academic product. In the publishing sector there are experiments like Unsilo, which appears to be able to extract key findings out of articles, as well as identify 'missing' references from the bibliography. The Cochrane Collaboration are also examining the potential for machine learning to assess whether articles should be included in systematic reviews, alongside, but not replacing, human reviewers.

Whether you think AI has a place in research evaluation or not, it will inevitably be raised as a possibility in the near future. The key for the evaluation community is to begin researching this question, so that we have a sound evidence base on the challenges and opportunities. Just as we needed frameworks to consider the responsible use of citation metrics, we need guidelines for the responsible use of AI in research evaluation.

Those guidelines will need to go beyond a simple technocratic assessment of the abilities of AI, and include broader considerations of the impact on the research system. For example, a recent report has highlighted that AI-augmented systems are being developed for both the writing of grant proposals and their evaluation, raising the prospect of an 'arms race' of competing AIs that seems unlikely to serve the system well.

The four trends considered in this article are not based on speculation, but on evidence of what is happening now. This doesn't mean that how the trends play out in the future of research evaluation is fixed. There are many future trajectories, and highlighting these trends is aimed at encouraging those in the research system to begin thinking about the implications now. Early thought and action will also inform our response, helping us to use the changes and opportunities to build a more effective research system in the future.