“The path to academic personal identifiers…

Posted on June 15, 2011


… is littered with the wrecks and remains of many failed projects.”

The varied attempts to start, maintain or promote a single identifier scheme for academics, whether applied locally or internationally, can arguably be described as a mix of successes and abject failures. I do not wish to dwell on what a failure is in this respect, but wish to simply define it as being a scheme that is not seen to be a necessary part of an academic’s ‘life’ – that of research, producing outputs, self-promotion, discovery of work and communicating with their colleagues and the wider world.

It is far more instructive to look at the schemes that have worked in academia, and also at schemes which a broader section of the population has adopted. The first and most instructive case is that of RePEc.

RePEc (Research Papers in Economics) is a collaborative effort of hundreds of volunteers in 74 countries to enhance the dissemination of research in economics. The heart of the project is a decentralized database of working papers, journal articles and software components. All RePEc material is freely available. Participation in RePEc as a provider only involves the cost of your time in preparing and maintaining metadata describing your publications.

Why is RePEc an important service to consider?

  • The service(s) were created by volunteers and later shaped by the needs of their community – started by Thomas Krichel in 1993 (in every practical sense), the team has grown over the years.
  • It is a great example of a ‘bottom-up’ (from the authors/peer group) service, not a ‘top-down’ service (institutional/publishing org driven.)
  • Being a product of the community, there is a great deal of trust in the services. The barrier to interaction with the site is low.
  • It has provided community-policed freedom for Economists to create and garden their own profiles for a number of years – http://authors.repec.org – the service now has over 25,000 author profiles, added primarily by the authors themselves and it is loosely labelled an Author ‘CV’ on the front page of the RePEc website.
    • “The RePEc Author Service aims to link economists with their research output in the RePEc bibliographic database.” (more info)
  • It allows the community of economists to communicate their ideas and discover each other’s related research in a better manner than would be possible without it.
  • It allows the community to self-promote and compare one author’s output to another’s, based on their profiles and citations.

These are the key points that I think have contributed to making RePEc and its collection of services a success – it addresses the real needs of a community who trust it, who can freely add to and correct information about themselves and it allows them to communicate better, to self-promote and compare themselves to their peers and to discover further related research with greater ease.

Why are the publishers not fulfilling this role? One of the key pieces of research currency used by Economists is the Working paper – a paper that by its very nature is subject to revision, alteration and hopefully, amelioration. Unlike some other subjects, it is seen to be important for this ‘imperfect’ work to be scrutinised in a more public manner than other academic cultures might tolerate. A Work (used in the FRBR sense) whose contents will shift from the time of first publication, such that many versions of it may exist, is anathema to conventional fire-and-forget publishing, where what is published may be retracted or given an erratum, but its structure and findings are not expected to change.

Some may refer to this sort of output as ‘grey literature’ – I will give the most current (Prague Definition 2010) below:

“Grey literature stands for manifold document types produced on all levels of government, academics, business and industry in print and electronic formats that are protected by intellectual property rights, of sufficient quality to be collected and preserved by library holdings or institutional repositories, but not controlled by commercial publishers i.e., where publishing is not the primary activity of the producing body.”

Under this definition, working papers do fall into this category, although RePEc does not limit authors to listing only these forms of outputs. I consider the lack of a limit here to provide a greater sense of ownership to the community and another aspect that a successful service would likely emulate.

The fact that RePEc can be considered to be ‘owned’ by a community lends trust to its brand, but that is by no means the only way to garner the trust of a community. Its actions and developments have won the trust of its community over the years. It is perfectly feasible for a publisher or private entity to produce a service which has similar success in this area. For example, consider the SSRN (Social Science Research Network).


  • Like RePEc, the SSRN was created and shaped due to the needs of its community but in this case, the organisation behind it is privately owned.
  • It is focussed on the needs of authors and many of its services, if not all of them, are designed to be used by those within the research peer-group.
    • A somewhat subjective example is from their site’s navigation banner – http://www.ssrn.com/hen/index.html – there are options for ‘Top Papers’ and ‘Top Authors’ but nothing for ‘Top Institution’.
  • There is a great deal of trust in the services, as the SSRN spend a lot of effort validating, amending and checking the outputs of its services. While the ethereal nature of ‘download statistics’ may be familiar to many of those who run and administer websites, it is treated with great reverence by the users of the SSRN as the organisation expends great time and effort filtering and heavily examining downloads to render this insubstantial statistic less so. It is the appearance of solidity and formality with which the services are delivered that contributes towards the trust of the community.
  • It allows the community of economists to communicate their ideas and discover each other’s related research in a better manner than would be possible without it.
  • It allows the community to self-promote and compare one author’s output to another’s, based on their profiles and citations.
  • It also responds to criticism and errors within its service rapidly, as it is something that it takes pride in.

Many of the points here mesh with those from RePEc, including the key (IMHO) ones of trust, comparison amongst peers, self-promotion and discovery. It has more limits than RePEc (contributions to the service are less free in the sense of ‘libre’) but these same limits provide extra trust in the information provided by the service; an air that the information within is policed well and hard to falsify.

Let’s now consider a fundamentally similar service, similar in all of the above, but its focus is not academic, and it is free only in the sense of being ‘gratis’ and absolutely not ‘libre’: Facebook.


  • Like RePEc, Facebook was created and shaped due to the needs of a community but it can be stated quite clearly that the community that it primarily serves is not that of its users. A quick glance at their current terms and conditions (or Rights and Responsibilities) can justify that assertion. Look at point 2, subsection 1 for a legal bombshell of a statement that should worry any user uploading personal videos or pictures to the site.
    • As people tend to blank out when they are presented with T&C legalese, I’ll copy the pertinent section here: …. photos and videos (“IP content”), …: you grant us [opt-out, not opt-in] a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook (“IP License”)
  • It is focussed on the needs of users as it gained venture capital based primarily on numbers of users – a ‘potential’ source of profit, rather than profitability in and of itself. The organisation had a drive to gain and retain users, and had to do so by offering useful services.
  • There is ironically still a great deal of trust in the services they provide – in my opinion, this is because it is a mainstream service whose privacy transgressions and related reports which would erode that trust are never truly treated to mainstream media coverage.
    • For example, tabloid newspapers make the UK population aware when the UK government or related bodies lose ‘CDs of valuable data’ by running front-page story campaigns over the course of days, without any information on whether that data is being misused. I have yet to find stories reported with similar intensity or visibility about the times Facebook openly attempted to sell personal data; I found only individual articles from the Independent, Guardian and the Telegraph, and I doubt any of them made the front pages.
    • (As an aside, the irony of this Facebook group “We sue facebook if they sell our personal data!”(sic.) provides ample material to mull the issue of trust over: http://www.facebook.com/group.php?gid=52991492388 )
  • Even though users should assess whether they should trust it, Facebook does allow the community of users to communicate and discover each other in a better manner than would be possible without it.
  • It allows the community to self-promote and socially compare based on their profiles.

Let’s now move swiftly on to consider an arguable failure in this realm – Thomson-Reuters’ ResearcherID:


  • It was created primarily due to the needs of a publisher, who needed to keep track of researchers, who published which paper, co-authors and so on. This is actually a direct need of authors as well, but it is a hard notion to convey as each subject area seems to have its own coping mechanisms and acceptable losses when it comes to citations, metrics and the like.
  • It tried to focus on the needs of the users as well, but without the drive from a community that already existed, it was unclear what its focus was. This is perhaps a compromise between the publisher’s desire to make it as widely applicable as possible, and the individual desire to make it relevant to their own, personal community.
  • As this was a product by a publishing business, targeted at no specific community, the service inherited much of the same level of trust that the publishing business has within the wider academic community. That is to say, no-one trusted it to remain open and freely reusable without the threat of a hefty subscription introduced at some point later on. Thomson-Reuters’ own surveys confirmed that the primary reason ResearcherID was under-used was a lack of trust, both long-term and short-term, in the service.
  • Without many profiles in the system, it never reached a tipping-point – it wasn’t a useful service to use in order to communicate and discover other researchers in your field of work.
  • Likewise, the ability to self-promote and compare was never tested, as there simply was not a great enough use of this service to do so.

The key failings of the ResearcherID service were that it did not target any community successfully, instead targeting all academic communities, and that it lacked its users’ trust.

What does this mean for ORCID?

Although ORCID is technically a ‘child’ of the ResearcherID project, it proceeds with the knowledge of its parent’s failure and hopefully, as a project, will strive to correct this. While Thomson-Reuters play a part in the ORCID project, they are actively trying to relegate themselves to be no more influential in the proceedings and discussions than any of the other members of the boards administering and plotting out the direction for ORCID.

As I am a member of the Technical Advisory Group for ORCID, you can be sure that the aspects of trust and of community involvement will be at the forefront of any discussions I have with that group.

Posted in: ORCID