Modeling Legal Constructs
The rise of language models (LMs) has sparked excitement about their potential to transform interactions with the law, but LMs still face serious challenges, including shortcomings in construct validity, factual accuracy, and explainability. These challenges are heightened in high-risk domains like the law, where precision, explanation, and context are crucial. During model evaluation, quantitative performance on benchmark datasets is often prioritized over qualitative assessment and the incorporation of contextual legal knowledge. This dissertation assesses the use of LMs for annotating legal documents in order to gain insight into the history of legal writing and the law. First, I map the landscape of legal natural language processing (NLP) and highlight how technical challenges are compounded by limited interdisciplinary engagement between NLP and legal scholarship. Then, through two case studies, modeling legal rhetoric and modeling legal reasoning, I demonstrate how LMs can be used effectively to analyze legal texts. The second case study, on legal reasoning, also illustrates the limits of LMs on tasks that are especially abstract and domain-specific. These findings underscore the continued importance of legal expertise in modeling legal constructs, particularly in selecting relevant tasks, building meaningful datasets, and rigorously evaluating construct validity.