University of Sussex

File(s) not publicly available

Semi-supervised training of a statistical parser from unlabeled partially-bracketed data

presentation
Posted on 2023-06-08, 07:33; authored by Rebecca Watson, Ted Briscoe, and John Carroll
We compare the accuracy of a statistical parse ranking model trained from a fully-annotated portion of the Susanne treebank with one trained from unlabeled partially-bracketed sentences derived from this treebank and from the Penn Treebank. We demonstrate that confidence-based semi-supervised techniques similar to self-training outperform expectation maximization when both are constrained by partial bracketing. Both methods based on partially-bracketed training data outperform the fully supervised technique, and both can, in principle, be applied to any statistical parser whose output is consistent with such partial bracketing. We also explore tuning the model to a different domain and the effect of in-domain data in the semi-supervised training processes.

History

Publication status

  • Published

Page range

23-32

Presentation Type

  • paper

Event name

Tenth International Conference on Parsing Technologies

Event location

Prague, Czech Republic

Event type

conference

ISBN

978-1-932432-90-9

Department affiliated with

  • Informatics Publications

Full text available

  • No

Peer reviewed?

  • Yes

Legacy Posted Date

2012-02-06
