File(s) not publicly available
Semi-supervised training of a statistical parser from unlabeled partially-bracketed data
presentation
posted on 2023-06-08, 07:33 authored by Rebecca Watson, Ted Briscoe, John Carroll
We compare the accuracy of a statistical parse ranking model trained from a fully-annotated portion of the Susanne treebank with one trained from unlabeled partially-bracketed sentences derived from this treebank and from the Penn Treebank. We demonstrate that confidence-based semi-supervised techniques similar to self-training outperform expectation maximization when both are constrained by partial bracketing. Both methods based on partially-bracketed training data outperform the fully supervised technique, and both can, in principle, be applied to any statistical parser whose output is consistent with such partial-bracketing. We also explore tuning the model to a different domain and the effect of in-domain data in the semi-supervised training processes.
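The abstract's notion of a parse being "consistent with" partial bracketing can be illustrated with a minimal sketch. It assumes the common reading that a parse is consistent when none of its constituents crosses a given bracket; the record does not spell out the paper's exact definition. The span representation and the names `crosses` and `is_consistent` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple

# Assumption: a constituent or bracket is a half-open token span (start, end).
Span = Tuple[int, int]


def crosses(a: Span, b: Span) -> bool:
    """Return True if the two spans overlap without either containing the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def is_consistent(parse_spans: Iterable[Span], brackets: Iterable[Span]) -> bool:
    """Return True if no constituent of the parse crosses any partial bracket."""
    brackets = list(brackets)
    return not any(crosses(p, b) for p in parse_spans for b in brackets)


# One unlabeled bracket over tokens 1..2 of a four-token sentence.
partial_brackets = [(1, 3)]
parse_a = [(0, 4), (1, 3), (0, 1)]  # contains the bracketed span as a constituent
parse_b = [(0, 4), (0, 2), (2, 4)]  # (0, 2) crosses the bracket (1, 3)

print(is_consistent(parse_a, partial_brackets))
print(is_consistent(parse_b, partial_brackets))
```

Under this assumption, a filter of this kind could restrict the candidate parses that a training procedure considers for each sentence, independently of how the parser ranks them.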
History
Publication status
- Published
Page range
23-32
Presentation Type
- paper
Event name
Tenth International Conference on Parsing Technologies
Event location
Prague, Czech Republic
Event type
conference
ISBN
978-1-932432-90-9
Department affiliated with
- Informatics Publications
Full text available
- No
Peer reviewed?
- Yes
Legacy Posted Date
2012-02-06