Craven Group meeting
April 12, 2021
Yuriy Sverchkov
We often encounter databases whose items are described by unstructured or semi-structured metadata.
Example: Sequence Read Archive
"age": "24",
"cell_type": "dermal fibroblast",
"isolate": "not applicable",
"organism": "Homo sapiens",
"sampling site": "shoulder",
"sex": "female",
"stimulus": "UV exposed"

"gender": "female",
"individual": "patient2",
"source_name": "kidney",
"tissue": "tumor"
⇓
Variable-length text input
⇒
Variable-length ontology+relationship output
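To make the shape of the task concrete, here is a minimal Python sketch of one input/output pair. The two records are the ones from the Sequence Read Archive example; everything else is an assumption for illustration: the Annotation type, the relation names, and the hand-written target annotations are hypothetical and do not come from a real ontology or from the method presented here.

```python
from dataclasses import dataclass
from typing import Dict, Set

# Input: a free-form mapping. Keys and values vary from submitter to submitter.
Record = Dict[str, str]


@dataclass(frozen=True)
class Annotation:
    """One output item: a term plus its relationship to the sample.

    Both fields hold made-up placeholder strings, not real ontology IDs.
    """
    term: str
    relation: str


def describe(record: Record, annotations: Set[Annotation]) -> str:
    """Summarize the sizes of one input/output pair."""
    return f"{len(record)} fields -> {len(annotations)} annotations"


record_a: Record = {
    "age": "24",
    "cell_type": "dermal fibroblast",
    "isolate": "not applicable",
    "organism": "Homo sapiens",
    "sampling site": "shoulder",
    "sex": "female",
    "stimulus": "UV exposed",
}
record_b: Record = {
    "gender": "female",
    "individual": "patient2",
    "source_name": "kidney",
    "tissue": "tumor",
}

# Hypothetical targets, written by hand purely to show the output shape.
annotations_a = {
    Annotation("dermal fibroblast", "is_cell_type"),
    Annotation("Homo sapiens", "from_organism"),
    Annotation("female", "has_sex"),
}
annotations_b = {
    Annotation("kidney", "sampled_from"),
    Annotation("tumor", "has_property"),
}

print(describe(record_a, annotations_a))
print(describe(record_b, annotations_b))
```

Neither side has a fixed size: the number of fields and the number of annotations both change from record to record.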
Cast as classification
We identify candidate terms using a text reasoning graph with a domain-agnostic set of rules.
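The rules of the text reasoning graph are not given here, so the sketch below swaps in a much simpler stand-in: n-gram lookup against a small term list. It is only meant to show how candidate generation turns a variable-length problem into per-candidate binary classification. LEXICON, candidate_terms, to_instances, and the gold labels are all hypothetical names and data made up for this example.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Toy term list standing in for ontology term labels (hypothetical).
LEXICON = {"kidney", "tumor", "fibroblast", "dermal fibroblast", "female"}


def ngrams(tokens: List[str], max_n: int = 3) -> Iterator[str]:
    """Yield every run of 1..max_n consecutive tokens, joined by spaces."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def candidate_terms(record: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return sorted (key, term) pairs where an n-gram of the value is in LEXICON."""
    found = set()
    for key, value in record.items():
        for gram in ngrams(value.lower().split()):
            if gram in LEXICON:
                found.add((key, gram))
    return sorted(found)


def to_instances(record: Dict[str, str],
                 gold_terms: Set[str]) -> List[Tuple[str, str, bool]]:
    """Cast as classification: one (key, term, label) instance per candidate."""
    return [(key, term, term in gold_terms)
            for key, term in candidate_terms(record)]


record = {
    "gender": "female",
    "individual": "patient2",
    "source_name": "kidney",
    "tissue": "tumor",
}

# Hand-picked labels for illustration only.
for instance in to_instances(record, gold_terms={"kidney", "tumor"}):
    print(instance)
```

Each candidate becomes one yes/no decision, so a fixed-output classifier can be applied however many terms a record mentions.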