Add _dictionary.licence attribute #16

Merged
jamesrhester merged 7 commits into COMCIFS:main from jamesrhester:add_licence
Feb 25, 2026

@jamesrhester
Contributor

This addresses #3, using @nautolycus's suggested definition. I know this update is not on the critical path for the next release, but it looked easy to do.

I have chosen to make the licence "recommended" rather than "mandatory" as absence does not make it impossible to use or identify the dictionary. Absence only makes redistribution problematic.

Not sure if keeping the SPDX identifier in the header is still necessary with this change.

@vaitkus
Collaborator

vaitkus commented Feb 19, 2026

Thank you for the PR! Explicit licensing is always useful.

Before merging this PR, may I suggest we first merge PR #7 that enables the automated dictionary checks?

As for this PR I think it would be more useful to have two separate attributes instead of one:

  1. _dictionary.licence_SPDX_id. This would contain only the SPDX licence identifier without the SPDX-License-Identifier: prefix (but with all the extended AND/OR syntax). In cases when a licence cannot be located in the SPDX list, the LicenseRef-<custom identifier> (e.g. LicenseRef-foo) construct can be used and the licence text specified elsewhere. This way the attribute would always hold only the machine-parsable SPDX identifier.
  2. _dictionary.licence_text. This attribute would contain the full text of the licence or a reference to the location where the full licence text is specified (licences can get quite long). In most cases this field would remain unused.
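
As a rough illustration of what "machine-parsable" could mean for the first attribute, here is a minimal Python sketch. The function names are hypothetical and not part of any existing tool. It assumes the operators AND, OR and WITH are written in upper case, and it only splits an expression into tokens; it does not validate the identifiers against the SPDX licence list or handle DocumentRef- references.

```python
import re

SPDX_PREFIX = "SPDX-License-Identifier:"
OPERATORS = {"AND", "OR", "WITH"}
PARENS = {"(", ")"}


def strip_spdx_prefix(value: str) -> str:
    """Return the value without a leading SPDX-License-Identifier: prefix."""
    value = value.strip()
    if value.startswith(SPDX_PREFIX):
        value = value[len(SPDX_PREFIX):]
    return value.strip()


def licence_ids(expression: str) -> list[str]:
    """List the licence identifiers named in an SPDX-style expression.

    Parentheses and operators are skipped, and so is the token that
    follows WITH, because it names an exception rather than a licence.
    """
    tokens = re.findall(r"[()]|[^\s()]+", expression)
    ids = []
    previous = None
    for token in tokens:
        is_syntax = token in PARENS or token in OPERATORS
        if not is_syntax and previous != "WITH":
            ids.append(token)
        previous = token
    return ids


def custom_ids(expression: str) -> list[str]:
    """Identifiers using LicenseRef-, whose text must be given elsewhere."""
    return [i for i in licence_ids(expression) if i.startswith("LicenseRef-")]
```

A dictionary checker could use something like custom_ids() to decide when a licence text (the second attribute) ought to be present.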

The suggested attribute names can definitely be refined (e.g. maybe _dictionary.licence_id_SPDX is preferred).

How does this sound?

I am not opposed to having this attribute as recommended; however, keep in mind that not specifying a licence does impede the use of a dictionary since, in general, when no explicit licence is given the strictest possible restrictions should be assumed (including use, reuse, etc.). Of course, this does differ from jurisdiction to jurisdiction, but we might want to nudge people to specify the licence from the start.

I think that keeping the SPDX identifier in the header comment is still useful. There is a slight risk of these two values getting out of sync, but the comment with the SPDX-License-Identifier: prefix seems to be the standard way of specifying the SPDX licence and as such is recognised by some automated licence checkers. The explicit dictionary attribute is much more robust for some applications (e.g. our parser strips away the comments), but the licence might not get automatically picked up even if we leave the SPDX prefix.
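
The out-of-sync risk could be caught by an automated check. The following Python sketch is only an assumption about how such a check might look: it uses regular expressions rather than a real CIF parser, handles only single-line unquoted or quoted values, and takes the data name as a parameter because the attribute name was still under discussion (the default shown is one of the names suggested above).

```python
import re


def header_spdx(text):
    """Return the identifier from the first '# ... SPDX-License-Identifier:' comment."""
    match = re.search(
        r"^\s*#.*?SPDX-License-Identifier:\s*(.+?)\s*$", text, re.MULTILINE
    )
    return match.group(1) if match else None


def attribute_value(text, data_name):
    """Return the single-line value given for data_name, or None if absent."""
    pattern = (
        r"^\s*" + re.escape(data_name)
        + r"\s+(?:'([^']*)'|\"([^\"]*)\"|(\S+))\s*$"
    )
    # CIF data names are case-insensitive, hence re.IGNORECASE.
    match = re.search(pattern, text, re.MULTILINE | re.IGNORECASE)
    if match is None:
        return None
    return next(group for group in match.groups() if group is not None)


def licences_in_sync(text, data_name="_dictionary.licence_SPDX_id"):
    """True or False if both values exist; None if either one is missing."""
    header = header_spdx(text)
    attribute = attribute_value(text, data_name)
    if header is None or attribute is None:
        return None
    return header == attribute


# Hypothetical dictionary fragment used only for illustration.
SAMPLE = """\
# SPDX-License-Identifier: CC0-1.0
data_EXAMPLE_DIC
    _dictionary.title            EXAMPLE_DIC
    _dictionary.licence_SPDX_id  CC0-1.0
"""
```

Calling licences_in_sync(SAMPLE) compares the comment value with the attribute value; a None result signals that one of the two is absent rather than mismatched.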

Finally, it might be worthwhile to see if IUCr/COMCIFS has a preference on the spelling of "License/Licence". I do like the British English spelling more, but it would be nice to keep things consistent throughout.

@nautolycus
Collaborator

As for this PR I think it would be more useful to have two separate attributes instead of one

CON: Two data items instead of one, and the question of how to address inconsistencies/conflicts between the two.

PRO: Tying a specific scheme to the single allowed attribute is not future-proof if/when the SPDX registry ceases operation. Also, the suggested definition has two purposes: to contain a rigorous machine-parsable string OR a free-text field that requires human interpretation.

Perhaps _dictionary.licensing_SPDX and _dictionary.licensing_details ? [Note how using the verb form neatly sidesteps the spelling issue :) - or do the Americans perversely use 'licence' as a verb?] I am agnostic as to whether the *_SPDX should contain the SPDX-License-Identifier: prefix, but agree that its use would then be redundant. *_details, in the spirit of other such definitions, is open-ended and could contain text of specific licences, pointers to online texts, qualifiers to the set as stated in the *_SPDX field or otherwise as needed -- with no expectation of machine parsability.

In any case, I agree that the use of an SPDX comment (also) be recommended. A simple use case is to suggest to generic data harvesters (like AI bots) the licensing status of a file before they attempt to do anything with it. Always assuming that would make any difference, of course...

@jamesrhester
Contributor Author

I think the CON is manageable, in that we already have many data names that can contradict one another, and these are for use in dictionaries, to which we can apply quality control in a way that we can't for ordinary CIF files. I think we are agreed to have two data names, and I like @nautolycus's suggestion for them. I will adjust the PR accordingly.

Use separate spdx identifier and licensing_details attributes to
allow both machine readability and for non-registered licences to
be handled.
@vaitkus vaitkus linked an issue Feb 24, 2026 that may be closed by this pull request
Collaborator

@vaitkus vaitkus left a comment

Other than the marked changes, looks good to me.

jamesrhester and others added 3 commits February 24, 2026 19:26
Co-authored-by: Antanas Vaitkus <antanas.vaitkus90@gmail.com>
Co-authored-by: Antanas Vaitkus <antanas.vaitkus90@gmail.com>
@jamesrhester jamesrhester merged commit 13cd29c into COMCIFS:main Feb 25, 2026
3 checks passed
@jamesrhester jamesrhester deleted the add_licence branch March 10, 2026 03:04
Development

Successfully merging this pull request may close these issues.

ddl.dic: introduce attributes to specify the dictionary licence

3 participants