Computional Learning of Languages
This thesis focuses on the Gold model of inductive inference from positive data. There are several aspects in which the model appears unsatisfactory for language learning: the class of families of learnable languages is highly restricted; even if a family is learnable, there exists no uniform method to obtain a learner for it, and the learner itself is complex. In this thesis, some such criticisms are being addressed. It is shown that no automatic synthesis of a learner from the description of a learnable family is possible. Nevertheless, in some special cases, this synthesis can be achieved and a general result is developed. In order to make the learner simpler, it is stipulated that the learner can change its guess only when the guess is inconsistent with the input evidence. Such a conservative learner never overgeneralizes. Exactly learnable families are characterized for prudent learners with the following types of constraints: (0) conservative, (1) conservative and consistent, (2) conservative and responsive, and (3) conservative, consistent and responsive. It is also shown that, when exactness is not required, prudence, consistency and responsiveness, even together, do not restrict the power of conservative learners. Conservative learners are simple in only one respect; even though it is easy to determine when to make a new guess, it is still hard to know what this guess should be. Finally, a learner that exploits pattern evident in the input is developed. Absence of a particular string over a suitable interval of the input can be viewed as a kind of "indirect negative evidence". Now, the learning criterion needs to be weakened to allow limited failure. It is shown that any family of languages can be learned with probability 1 from stochastic input, provided something is known about the probability distribution according to which the input is presented. Given the family, the learner is uniformly constructible. Further, the behavior of the learner is simpler in many aspects. It is expected that a variety of other natural constraints can be imposed on this learner without additional cost.
computer science; technical report
Previously Published As