Four of five SSSOM data publications don't follow the specification #428
-
Very important discussion. My experience is that people often publish their mappings in SSSOM to fulfill some externally imposed requirement (say, to make a grant authority happy), or end up cherry-picking some of the SSSOM columns/attributes to enhance their existing model without intending to publish fully compatible SSSOM. I think, as is usually the case with FAIR, true reuse is an afterthought. The best way, imo, to encourage the publication of standard-compliant mappings is useful tools: if the tools can't read a file because it's not formatted correctly, users are more inclined to complain (and the mapping owner is also more inclined to fix the formatting in order to use the tool). My inclination is to invest in a large-scale mapping browser like OxO, which requires mappings to be formatted correctly in order to show (and therefore promote) them, similar to what OLS does for ontologies.
-
In fact even that file is not compliant either – the |
-
The list of known publications has been extended, but only two valid examples have been found so far. In addition, the list of data sets at mapping-commons does not look much better:
The errors
tl;dr: only four of 46 SSSOM data sets in the wild strictly follow the SSSOM 1.0 specification. The rest are usable only after human inspection and manual cleanup, if at all. This makes current SSSOM 1.0 impractical for any serious data exchange. Some common errors might be recoverable with more lax rules in SSSOM 1.1 (e.g. make |
-
I dealt with the MH mappings and the CPATH ones (not yet fixed, but I am sure they will be). If you can be bothered to share issues with me about other broken SSSOM files, I will get them all fixed.
-
Searching for real-world examples of SSSOM data publications on Zenodo.org has turned up 5 publications so far. I tried to parse each of them with sssom-py; only one could be processed:
Looks like SSSOM/TSV is mainly used as a write-only format in existing data publications, without actual reuse.
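For anyone who wants a quick first pass over their own file before publishing, here is a minimal stdlib-only sketch of the kind of structural check involved. It is not sssom-py and not a substitute for it: the function name `check_sssom_tsv` is made up for illustration, and it assumes that `subject_id`, `predicate_id`, `object_id` and `mapping_justification` are the mandatory columns. It does not look at CURIE prefixes, the metadata block contents, or value types.

```python
import csv
import io

# Assumption: these four slots are treated as the mandatory mapping columns.
REQUIRED_COLUMNS = ("subject_id", "predicate_id", "object_id", "mapping_justification")


def check_sssom_tsv(text):
    """Return a list of structural problems in an SSSOM/TSV document (empty if none).

    Hypothetical helper for illustration only; not part of sssom-py.
    """
    lines = text.splitlines()

    # The embedded metadata block is the leading run of lines starting with '#'.
    body_start = 0
    while body_start < len(lines) and lines[body_start].startswith("#"):
        body_start += 1

    # Everything after the metadata block is the table; skip blank lines.
    table = [line for line in lines[body_start:] if line.strip()]
    if not table:
        return ["no table header found"]

    reader = csv.reader(io.StringIO("\n".join(table)), delimiter="\t")
    header = next(reader)

    problems = [
        f"missing required column: {column}"
        for column in REQUIRED_COLUMNS
        if column not in header
    ]

    for row_number, row in enumerate(reader, start=1):
        # A row with the wrong number of cells cannot be matched to the header.
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: expected {len(header)} cells, got {len(row)}"
            )
            continue
        record = dict(zip(header, row))
        for column in REQUIRED_COLUMNS:
            if column in record and not record[column].strip():
                problems.append(f"row {row_number}: empty value for {column}")

    return problems
```

Calling `check_sssom_tsv(open(path, encoding="utf-8").read())` on a file gives an empty list when none of these basic problems are present. A file that passes can still be rejected by a real parser, so this only catches the most obvious breakage.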