File(s) not publicly available
Improved learning for hidden Markov models using penalized training
presentation
posted on 2023-06-07, 21:41 authored by Bill Keller, Rudi Lutz
In this paper we investigate the performance of penalized variants of the forward-backward algorithm for training hidden Markov models. Maximum likelihood estimation of model parameters can result in over-fitting and poor generalization. We discuss the use of priors to compute maximum a posteriori (MAP) estimates and describe a number of experiments in which models are trained under different conditions. Our results show that MAP estimation can alleviate over-fitting and help learn better parameter estimates.
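The MAP re-estimation discussed in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual code: it assumes an independent Dirichlet prior over each state's emission distribution with all hyperparameters greater than 1, and (mirroring the Notes below) sets the prior pseudo-counts proportional to the observation-symbol frequencies in the training data.

```python
import numpy as np

def map_emission_update(expected_counts, alpha):
    """MAP re-estimation of HMM emission probabilities under a
    Dirichlet prior, replacing the usual Baum-Welch M-step.

    expected_counts: (n_states, n_symbols) expected emission counts
        from the E-step of the forward-backward algorithm.
    alpha: (n_symbols,) Dirichlet hyperparameters, all > 1 so the
        MAP estimate stays strictly positive.
    """
    # MAP estimate of a multinomial with Dirichlet prior:
    # add (alpha - 1) pseudo-counts, then renormalize per state.
    numer = expected_counts + (alpha - 1.0)
    denom = numer.sum(axis=1, keepdims=True)
    return numer / denom

# Hypothetical toy example: 2 hidden states, 3 observation symbols.
counts = np.array([[4.0, 1.0, 0.0],
                   [0.5, 2.5, 3.0]])

# Prior pseudo-counts proportional to normalized symbol frequencies
# in the training data (scale factor 10.0 chosen arbitrarily here).
freqs = counts.sum(axis=0) / counts.sum()
alpha = 1.0 + 10.0 * freqs

b = map_emission_update(counts, alpha)
assert np.allclose(b.sum(axis=1), 1.0)  # each row is a distribution
```

Note that with all alpha equal to 1 the update reduces exactly to the maximum likelihood (standard Baum-Welch) estimate, so the prior strength controls how far the model is pulled away from the raw expected counts.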
History
Publication status
- Published
ISSN
0302-9743
Publisher
Springer-Verlag
Volume
2464
Pages
8
Presentation Type
- paper
Event name
AICS 02: Proceedings of the 13th Irish International Conference on Artificial Intelligence and Cognitive Science
Event location
Limerick, Ireland
Event type
conference
ISBN
3540441840
Department affiliated with
- Informatics Publications
Notes
Originality: This was the first application within NLP of penalised training of hidden Markov models using Dirichlet priors over the emission probabilities of the model.
Rigour: The paper derived the necessary EM update rule incorporating the Dirichlet prior, and described empirical results comparing learning with this prior against several other priors recommended in the literature. The data consisted of the first 5000 POS-tagged sentences from the BNC corpus, split into training and test sets. All results were obtained using 10-fold cross-validation and were shown to be statistically significant.
Significance: The paper showed that the use of Dirichlet priors (with the Dirichlet distribution parameters set proportional to the normalised frequencies of the observation symbols in the training data) consistently enabled the learning of better-performing models. This result was robust across model sizes and variations in initial conditions. Additionally, the results cast doubt on claims by Brand that minimum entropy priors gave good results, suggesting the need for further work in this area. Since this paper was written, the use of Dirichlet priors (and more recently Dirichlet process priors) has become widespread.
Outlet: This was a fully refereed (3 referees) international conference.
Full text available
- No
Peer reviewed?
- Yes
Editors
RFE Sutcliffe, M O'Neill, M Eaton, C Ryan, NJL Griffith
Legacy Posted Date
2012-02-06