File(s) not publicly available
Learning Stochastic Context-Free Grammars from Corpora Using a Genetic Algorithm.
presentation
posted on 2023-06-07, 20:03 authored by Rudi LutzA genetic algorithm for learning stochastic context-free grammars from finite language samples as described. Solutions to the inference problem are evolved by optimising the parameters of a covering grammar for a given language sample. We describe a number of experiments in learning grammars for a range of formal languages. The results of these experiments are encouraging and compare favourably with other approaches to stochastic grammatical inference.
History
Publication status
- Published
Publisher
Smith G Steele NC and Albrecht RF (Eds) Artificial Neural Nets and Genetic Algorithms. Proceedings of the 3rd International Conference on Artificial Neural Networks and Genetic Algorithms (ICANNGA-97) Norwich UK.Page range
210-214Presentation Type
- other
Event location
Smith G Steele NC and Albrecht RF (Eds) Artificial Neural Nets and Genetic Algorithms. Proceedings of the 3rd International Conference on Artificial Neural Networks and Genetic Algorithms (ICANNGA-97) Norwich UK.Event type
conferenceDepartment affiliated with
- Informatics Publications
Notes
Originality: This paper (and its earlier conference version) was one of the first papers to apply a genetic algorithm to the problem of inferring stochastic grammar from a corpus. It also used a novel prior over stochastic context free grammars in order to bias the search towrads simpler grammars. It is also the first to compare this technique to use of the Inside-Outside algorithm. It also made use of an entirely novel crossover operator within the GA, which resulted in faster discovery of good grammars. Rigour: the fitness function for the GA is developed within a Bayesian approach to grammar induction. Results for all language learning tasks were given as averages over 20 runs. Languages learnt were the standard ones for the area, plus 3-symbol palindromes which had previously been considered too hard. Significance: We have demonstrated that a genetic algorithm, making use of a prior over grammars can successfully learn SCFGs from corpora. Good results are obtained on a range of formal languages using a prior that is designed to prefer simpler grammars. Grammars found by the GA are both small and accurate in comparison to those produced by the Inside-Outside algorithm. Impact: This paper, and its earlier conference version, have been cited a total of 20 times (Google Scholar) Outlet: A leading journal in its area. Impact factor 1.822Full text available
- No
Peer reviewed?
- Yes
Legacy Posted Date
2012-02-06Usage metrics
Categories
No categories selectedKeywords
Licence
Exports
RefWorks
BibTeX
Ref. manager
Endnote
DataCite
NLM
DC