Improving Machine Learning Approaches to Noun Phrase Coreference Resolution
Human speakers generally have no difficulty in determining which noun phrases in a text or dialogue refer to the same real-world entity. This task of identifying co-referring noun phrases, known as noun phrase coreference resolution, can nevertheless present a serious challenge to a natural language processing system. Indeed, it is one of the critical problems that currently limit the performance of many practical natural language processing tasks. State-of-the-art coreference resolution systems rely on a set of hand-crafted heuristics that requires considerable time and linguistic expertise to develop. Recently, machine learning techniques have been used to circumvent both of these requirements by automating the acquisition of coreference resolution heuristics, yielding coreference systems that offer performance comparable to their heuristic-based counterparts. In this dissertation, we present a machine learning-based solution to noun phrase coreference that extends earlier work in the area and outperforms the best existing learning-based coreference engine on a suite of standard coreference data sets. Performance gains accrue from more effective use of the available training data via a set of linguistic and extra-linguistic extensions to the standard machine learning framework for coreference resolution.
coreference; anaphora; coreference resolution; anaphora resolution; machine learning; natural language processing
dissertation or thesis