Crystallography Open Database


Advice to potential CIF donors: fair practices

How should CIF data be built, and who can deposit them? What is fair and what would be unfair? This is quite a complex matter. The opinion of a US professor of law on crystallographic data copyright is a useful starting point:

Re: Crystallographic data copyrights
From: Eben Moglen
Date: Sun, 04 May 2003 09:02:21 -0400

I will have to be brief. If you need to follow up, let me know.

I assume US law governs throughout; an inaccurate but necessary assumption here.

If you extract only the actual coordinate data you have no copyright liability. One cannot copyright facts, only the expression incident to factual reporting. This principle was recognized by the US Supreme Court in 1915 with respect to news reports sent by telegraph. The idea/expression distinction has been held by the Supreme Court to prevent assertion of copyright over telephone white pages, where there is no originality in the concept of alphabetic organization of data. More complex forms of association or organization of data might give rise to claims.

You should move quickly. Proposals for database protection in the US and Europe will close up vast areas of human knowledge within the next decade. Make this data free soon, or you risk losing the chance. How to license your data so that everyone is compelled to make free their improvements or accessions to it is another subject.

Best regards.
Eben Moglen
Professor of Law
Columbia Law School, 435 West 116th Street, NYC 10027
General Counsel, Free Software Foundation

A. Good practices

We encourage crystallographers to submit CIFs containing unit cell parameters, coordinates and (when available) atomic displacement parameters, etc., for materials they are interested in seeing in the COD. Researchers may submit CIFs from their own published work, from soon-to-be-submitted papers or even for structures that will not be published. Researchers are also encouraged to submit CIFs generated from any paper published in the open literature.
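As an illustration of the content listed above, here is a minimal Python sketch that writes such a CIF by hand. It is not a COD tool: the function name build_minimal_cif is hypothetical, the data names are taken from the IUCr core CIF dictionary, and the NaCl values are approximate and given for illustration only. Unknown displacement parameters are written as "?", the CIF placeholder for a missing value.

```python
def build_minimal_cif(block_name, cell, space_group, atoms):
    """Return the text of a minimal CIF data block (hypothetical helper).

    cell  : dict with keys a, b, c (angstroms) and alpha, beta, gamma (degrees)
    atoms : list of (label, element, x, y, z, u_iso) tuples with fractional
            coordinates; u_iso may be None when no displacement parameter
            is available
    """
    lines = [f"data_{block_name}"]

    # Unit cell parameters
    for key in ("a", "b", "c"):
        lines.append(f"_cell_length_{key} {cell[key]:.4f}")
    for key in ("alpha", "beta", "gamma"):
        lines.append(f"_cell_angle_{key} {cell[key]:.3f}")

    # Space group symbol, quoted because it contains spaces
    lines.append(f"_symmetry_space_group_name_H-M '{space_group}'")

    # Atomic coordinates and isotropic displacement parameters
    lines += [
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
        "_atom_site_U_iso_or_equiv",
    ]
    for label, element, x, y, z, u_iso in atoms:
        u_text = "?" if u_iso is None else f"{u_iso:.4f}"
        lines.append(f"{label} {element} {x:.5f} {y:.5f} {z:.5f} {u_text}")

    return "\n".join(lines) + "\n"


# Illustrative input only: rock-salt NaCl with an approximate cell edge
cell = {"a": 5.64, "b": 5.64, "c": 5.64,
        "alpha": 90.0, "beta": 90.0, "gamma": 90.0}
atoms = [
    ("Na1", "Na", 0.0, 0.0, 0.0, None),
    ("Cl1", "Cl", 0.5, 0.5, 0.5, None),
]
cif_text = build_minimal_cif("NaCl_example", cell, "F m -3 m", atoms)
print(cif_text)
```

A real deposition would normally also carry bibliographic and chemical formula items, and the file should be checked with a CIF validator before submission.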

Be aware that if you submit data prior to publication, a journal may consider your work already published and reject your manuscript, so check the journal's preprint policy first.

About the creating, copying and extracting operations involved in CIF generation:

B. Questionable practices

Is copying and pasting from a PDF file a questionable practice? We are not absolutely sure: a PDF is simply a new, digitized form of the "original publication". Some other special cases can be considered here:

C. Practices to be discouraged

Copying data directly from commercial databases is prohibited by the owners of those databases (as a user, you may have signed a license, though some old CD-ROMs may have no license text inside). For instance, see the Terms & Conditions of ICSD. And maybe then read the email from Prof. Eben Moglen a second time.

D. What is an ideal CIF? The Cartesian view and a disclaimer

An ideal CIF is a file free of typos that presents a high-quality crystal structure. Taken to its logical conclusion, this means there would be no way to distinguish an ideal CIF in the COD from an ideal CIF in the commercial CSD, ICSD or CRYSTMET databases. A crystallographer who copied a CIF from a commercial database and removed the typos would have illegally built an ideal CIF that is indistinguishable from one typed from the printed literature, obtained by copy-paste from a PDF file, and so on. Without any possibility of detecting such fraud, the COD cannot be considered in any way responsible for the possible use of bad practices by crystallographers.

What are the funding agencies saying?

National Institutes of Health (NIH)

In 1999, the National Institutes of Health (NIH) released a statement detailing their policy on the deposition of atomic coordinates into structural databases. The policy clearly states that:

"the new NIH policy requires that atomic coordinates from X-ray crystallographic and nuclear magnetic resonance experiments that were supported by NIH grants to be deposited into the appropriate structural database at the time of submission of a research article drawing conclusions from these data. This information should be released immediately at the time of publication."

The Royal Society

In 2003, the Royal Society released a report entitled "Keeping science open: the effects of intellectual property policy on the conduct of science" that encouraged free access to scientific data. Some of the more relevant portions of the report are quoted below:

"Science relies on the free and rapid exchange of ideas and information. Intellectual Property Rights (IPRs) can protect creative work and investment in all areas, but may also restrict this exchange. This report considers whether the progress of science has been affected by the interpretation and use of IP policies, and makes recommendations for improvement." (...)

"We recommend that scientists ensure that any publicly funded data that are made available to private databases are done so non-exclusively, and that at least one repository of the information is liberal regarding access to and use and manipulation of the data." (...)

"The House of Lords inquiry concluded that there were IP issues to be resolved: What role should private databases play in the information chain? Should private databases be allowed to charge for information that is in the public domain or publicly funded? Should publicly funded databases charge for access?"

"We recommend significant Government support for the organisation, publication and maintenance of data that it has funded, which might otherwise be or become inaccessible. Since the cost of scientific information is high, and the value added by proper access is great, it makes no sense to allow the value of publicly funded data to be constrained by limitations to access in private databases. (...) We recommend that databases with public funding be readily accessible, and be either free or the charge merely be the cost of permitting access or of supplying the information. It may not be appropriate to recover even the cost of supply, since for non-material transfers the administrative cost of collection normally outweighs the value of at-cost revenue. It is particularly important for science in developing countries that access to databases by their scientists is free."

Opinions about Open Access

Scientists for Global Responsibility (SGR)

In 2001, Alan Cottey released a statement detailing how scientific projects could be done in a radically open manner, supported by an Open Science Protocol.

The Nature journal

The editorial entitled "Free access?", published in the journal "Nature Structural Biology" in 2000, supports the release of atomic coordinates upon publication. An excerpt from "Free access to reagents and structural coordinates?" states that:

"we strongly encourage immediate release of coordinates, and in fact we find that most authors do choose to release their coordinates upon publication. It is now time, especially given the current discussions regarding increased access to all scientific information, to re-examine our policy, to see if it makes sense to dispense with the hold altogether — and we will be looking at this issue over the next few months."

International Union of Crystallography (IUCr)

In the report of the Second Workshop on the Open Archives Initiative meeting, held in 2002, the IUCr CODATA Representative Brian McMahon stated that:

"I feel that the IUCr should certainly consider implementing an OAI-PMH based data server, and perhaps also run harvester software. Among the possible applications are:

Reciprocal Net

The Reciprocal Net project is constructing and deploying an extensive distributed and open digital collection of molecular structures. And if you send your crystal data there, why not to the COD as well?

Public Library of Science (PLOS)

The Public Library of Science is a nonprofit publisher and advocacy organization founded to accelerate progress in science and medicine by leading a transformation in research communication.

Research Information

An article entitled "A new model for the knowledge economy", featured in Research Information in the spring of 2003, offered some very interesting insights into open-access publishing:

"(...) the shift to open-access publishing is interesting. For a start, it is real, and it entails a fundamental shift of IP ownership, from the publisher back to the author. Not only that, the shift is accompanied by a concomitant upending of the reward system. Unlike the publishers, academic authors do not wish to profit from owning intellectual 'property', but rather to benefit the wider research community by allowing free access to the content ad infinitum. The rewards they do reap are the same as always: exposure, recognition within their circles, and credit for their achievements - not direct financial pay-back - since even the resultant increased access to resources for themselves only feeds back into the system. For this reason academics were perhaps always among the most likely to catalyse a new approach, but it is no less significant for this fact. The shift is revolutionary on a social as well as a business dimension: it proves not only that endeavours can be motivated by non-monetary rewards, but that indeed there is a strong drive to create systems to support this philosophy. (...)"

"Meanwhile the International Council for Science (ICSU) has collected input from the international science and technology community, ahead of the first stage of the UN World Summit on the Information Society (WSIS), coming to Geneva later this year. ICSU's focus is on four key themes, which include: 'Ensuring universal access to scientific knowledge internationally' and 'Scientific data and information as a global public good'. UNESCO is also involved in this process, and states that it upholds 'universal access to information, equal access to education, cultural diversity and freedom of expression' as 'essential principles for developing equitable knowledge societies'. (...)"

"Similar themes were addressed at a recent seminar entitled 'Knowledge: common heritage not private property,' organised by the UK's Scientists for Global Responsibility, in which a number of scientists discussed elements of a new collaborative paper entitled Towards a Convention on Knowledge and proposed alternative approaches to the processes of scientific enquiry, information-dissemination and assigning IP rights. (...)"
(IP = Intellectual Property)

Petition for Open Data in Crystallography

On the 5th of May, 2005, the COD Advisory Board placed a Petition for Open Data in Crystallography on the COD website, inviting all interested parties to sign either in support of or in opposition to the ideas presented in the petition. On the 29th of December, 2008, the collection of signatures ended with overwhelming support for the petition. The original petition text and the results can be viewed at the now closed petition page.

Final word

A form for building CIFs should soon (?) be available on the Web.

Visit the IUCr Web page to learn more about CIFs.

Ask Google about CIFs: it does not mean exclusively "Canadian Institute of Forestry."
