Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

University of Kentucky

Data Mining

Full-Text Articles in Physical Sciences and Mathematics

Auditing SNOMED CT Hierarchical Relations Based On Lexical Features Of Concepts In Non-Lattice Subgraphs, Licong Cui, Olivier Bodenreider, Jay Shi, Guo-Qiang Zhang, Feb 2018

Computer Science Faculty Publications

Objective: We introduce a structural-lexical approach for auditing SNOMED CT using a combination of non-lattice subgraphs of the underlying hierarchical relations and enriched lexical attributes of fully specified concept names. Our goal is to develop a scalable and effective approach that automatically identifies missing hierarchical IS-A relations.

Methods: Our approach involves three stages. In stage 1, all non-lattice subgraphs of SNOMED CT’s IS-A hierarchical relations are extracted. In stage 2, lexical attributes of fully specified concept names in these non-lattice subgraphs are extracted. For each concept in a non-lattice subgraph, we enrich its set of attributes with attributes from its ancestor …
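The abstract is truncated here, so the following is only a minimal sketch of what the attribute extraction and enrichment in stage 2 might look like, not the authors' implementation. It assumes that a lexical attribute is a lowercased word of the concept name and that enrichment takes the union over all ancestors reachable through IS-A edges; both are assumptions, as are the function names, the `names` and `parents` mappings, and the toy concepts (which are not real SNOMED CT entries).

```python
import re

def lexical_attributes(name):
    """Return the lowercased word tokens of a concept name (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))

def ancestors(concept, parents):
    """Collect every concept reachable by following IS-A edges upward."""
    seen = set()
    stack = list(parents.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen

def enriched_attributes(concept, names, parents):
    """Union of the concept's own attributes and those of its ancestors (assumed scope)."""
    attrs = lexical_attributes(names[concept])
    for anc in ancestors(concept, parents):
        attrs |= lexical_attributes(names[anc])
    return attrs

# Hypothetical toy hierarchy: c IS-A b IS-A a.
names = {
    "a": "Disorder of lung",
    "b": "Infection of lung",
    "c": "Bacterial infection of lung",
}
parents = {"c": ["b"], "b": ["a"]}

print(sorted(enriched_attributes("c", names, parents)))
```

In this toy case the enriched set for "c" gains the word "disorder" from its ancestor "a", which its own name does not contain.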


Mining Non-Lattice Subgraphs For Detecting Missing Hierarchical Relations And Concepts In SNOMED CT, Licong Cui, Wei Zhu, Shiqiang Tao, James T. Case, Olivier Bodenreider, Guo-Qiang Zhang, Jul 2017

Computer Science Faculty Publications

Objective: Quality assurance of large ontological systems such as SNOMED CT is an indispensable part of the terminology management lifecycle. We introduce a hybrid structural-lexical method for scalable and systematic discovery of missing hierarchical relations and concepts in SNOMED CT.

Material and Methods: All non-lattice subgraphs (the structural part) in SNOMED CT are exhaustively extracted using a scalable MapReduce algorithm. Four lexical patterns (the lexical part) are identified among the extracted non-lattice subgraphs. Non-lattice subgraphs exhibiting such lexical patterns are often indicative of missing hierarchical relations or concepts. Each lexical pattern is associated with a potential specific type of error. …
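To make the structural part concrete, here is a small sketch of how a non-lattice pair could be detected in a toy IS-A hierarchy. The abstract does not define the term; the sketch assumes the usual order-theoretic reading, in which two concepts form a non-lattice pair when they have more than one maximal common descendant. It is a brute-force, single-machine illustration and not the MapReduce algorithm mentioned above; the function names and the toy graph are hypothetical.

```python
from itertools import combinations

def descendants(node, children):
    """Collect every concept strictly below `node` along IS-A edges."""
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children.get(cur, ()))
    return seen

def maximal_common_descendants(a, b, children):
    """Maximal elements among the common lower bounds of a and b.

    Lower bounds are taken reflexively, so if b lies below a the
    only maximal lower bound is b itself.
    """
    lower_a = descendants(a, children) | {a}
    lower_b = descendants(b, children) | {b}
    common = lower_a & lower_b
    return {
        c for c in common
        if not any(c in descendants(o, children) for o in common if o != c)
    }

def non_lattice_pairs(children):
    """Return all concept pairs with more than one maximal common descendant."""
    nodes = set(children) | {c for kids in children.values() for c in kids}
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if len(maximal_common_descendants(a, b, children)) > 1
    ]

# Hypothetical toy hierarchy (parent -> children).
children = {
    "root": ["A", "B"],
    "A": ["C", "D"],
    "B": ["C", "D"],
    "C": ["E"],
    "D": ["E"],
}

print(non_lattice_pairs(children))
```

Here A and B share the descendants C, D, and E, and both C and D are maximal among them, so (A, B) is the only non-lattice pair in the toy graph. Under the assumed definition, the corresponding non-lattice subgraph would be built from such a pair together with its maximal common descendants.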