University of Sussex

Induction of root and pattern lexicon for unsupervised morphological analysis of Arabic

Presentation posted on 2023-06-08, 20:35; authored by Bilal Khaliq and John Carroll
We propose an unsupervised approach to learning non-concatenative morphology, which we apply to induce a lexicon of Arabic roots and pattern templates. The approach is based on the idea that roots and patterns may be revealed through mutually recursive scoring based on hypothesized pattern and root frequencies. After a further iterative refinement stage, morphological analysis with the induced lexicon achieves a root identification accuracy of over 94%. Our approach differs from previous work on unsupervised learning of Arabic morphology in that it is applicable to naturally written, unvowelled text.
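
The idea of mutually recursive scoring can be illustrated with a small sketch. It is not the scoring function from the paper, which the abstract does not specify. It assumes triliteral roots, words given as transliterated unvowelled strings, and a simple scheme in which a root's score is the sum of the scores of the patterns it co-occurs with, and vice versa, with each table rescaled by its maximum on every pass. The function names (candidate_splits, mutual_scores, analyse) and the toy word list are hypothetical, and the further refinement stage mentioned above is not modelled.

```python
from collections import defaultdict
from itertools import combinations


def candidate_splits(word, root_len=3):
    """Yield every hypothesized (root, pattern) pair for a word.

    A root is any in-order choice of root_len letters; the pattern is the
    word with those letters replaced by '-'. (Assumption: triliteral roots.)
    """
    for idx in combinations(range(len(word)), root_len):
        root = "".join(word[i] for i in idx)
        pattern = "".join("-" if i in idx else ch for i, ch in enumerate(word))
        yield root, pattern


def _rescale(scores):
    """Divide all scores by the largest one so values stay in [0, 1]."""
    top = max(scores.values(), default=0.0)
    return {k: (v / top if top > 0 else 0.0) for k, v in scores.items()}


def mutual_scores(words, iterations=10):
    """Score roots from pattern scores and patterns from root scores, in turn."""
    pairs = {rp for w in set(words) for rp in candidate_splits(w)}
    root_score = {r: 1.0 for r, _ in pairs}
    pattern_score = {p: 1.0 for _, p in pairs}
    for _ in range(iterations):
        # A pattern is credible if it combines with credible roots ...
        new_pattern = defaultdict(float)
        for r, p in pairs:
            new_pattern[p] += root_score[r]
        pattern_score = _rescale(new_pattern)
        # ... and a root is credible if it combines with credible patterns.
        new_root = defaultdict(float)
        for r, p in pairs:
            new_root[r] += pattern_score[p]
        root_score = _rescale(new_root)
    return root_score, pattern_score


def analyse(word, root_score, pattern_score):
    """Return the (root, pattern) split of word with the highest joint score."""
    return max(
        candidate_splits(word),
        key=lambda rp: root_score.get(rp[0], 0.0) * pattern_score.get(rp[1], 0.0),
    )


# Toy, hand-picked transliterated forms (illustrative only, not real data).
toy_words = ["ktb", "kAtb", "mktwb", "ktAb", "drs", "dArs", "mdrws", "drAs"]
roots, patterns = mutual_scores(toy_words)
print(analyse("kAtb", roots, patterns))
```

On this toy list the splits that share both roots and patterns across several words reinforce each other, while accidental splits that occur only once fade over the iterations. A real corpus would need frequency information and affix handling that this sketch leaves out.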

History

Publication status

  • Published

Page range

1012-1016

Presentation Type

  • paper

Event name

6th International Joint Conference on Natural Language Processing (IJCNLP)

Event location

Nagoya, Japan

Event type

conference

Event date

14-18 October 2013

Department affiliated with

  • Informatics Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2015-04-24
