Using Distributional Similarity to Organise BioMedical Terminology

Weeds, Julie, Dowdall, James, Schneider, Gerold, Keller, Bill and Weir, David (2005) Using Distributional Similarity to Organise BioMedical Terminology. Terminology, 11 (1). pp. 107-141. ISSN 0929-9971

We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked-up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
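
As a rough illustration of the idea described in the abstract, the following sketch compares terms by the syntactic contexts they occur in and assigns each term the semantic type of its most similar neighbour. It is a minimal sketch under stated assumptions, not the authors' implementation: the dependency triples and type labels are invented toy data rather than Pro3Gres output or GENIA ontology classes, and cosine similarity stands in for the "variety of different measures" the abstract mentions without naming.

```python
from collections import Counter
from math import sqrt

# Hypothetical (term, dependency relation, co-occurring word) triples.
# In the paper such data comes from parsing GENIA; these are toy values.
TRIPLES = [
    ("IL-2", "subj_of", "activate"),
    ("IL-2", "obj_of", "secrete"),
    ("IL-4", "subj_of", "activate"),
    ("IL-4", "obj_of", "secrete"),
    ("T cell", "subj_of", "secrete"),
    ("T cell", "obj_of", "activate"),
    ("B cell", "subj_of", "secrete"),
    ("B cell", "obj_of", "activate"),
]

# Hypothetical gold-standard types, standing in for an ontology lookup.
GOLD_TYPES = {"IL-2": "protein", "IL-4": "protein",
              "T cell": "cell_type", "B cell": "cell_type"}


def feature_vectors(triples):
    """Map each term to a count vector over (relation, word) features."""
    vectors = {}
    for term, relation, word in triples:
        vectors.setdefault(term, Counter())[(relation, word)] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[f] * v[f] for f in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def nearest_neighbour(term, vectors):
    """Return the other term whose contexts are most similar to `term`."""
    others = (t for t in vectors if t != term)
    return max(others, key=lambda t: cosine(vectors[term], vectors[t]))


def predict_type(term, vectors, gold_types):
    """Predict a term's semantic type as the type of its nearest neighbour."""
    return gold_types[nearest_neighbour(term, vectors)]


vectors = feature_vectors(TRIPLES)
accuracy = sum(
    predict_type(t, vectors, GOLD_TYPES) == GOLD_TYPES[t] for t in vectors
) / len(vectors)
```

Here `accuracy` is the fraction of terms whose predicted type matches the toy gold standard, which mirrors in miniature the kind of evaluation against an ontology that the abstract describes.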

Item Type: Article
Keywords: distributional similarity, biomedical terminology, semantic proximity, ontology
Schools and Departments: School of Engineering and Informatics > Informatics
Subjects: Q Science > QA Mathematics > QA0075 Electronic computers. Computer science
Depositing User: Chris Keene
Date Deposited: 27 Feb 2008
Last Modified: 07 Mar 2017 06:40
Google Scholar: 16 Citations