Callison-Burch, Christopher
Eck, Douglas
Ippolito, Daphne, Eden
2023-11-22
2023-11-22
2022
2022
https://repository.upenn.edu/handle/20.500.14332/58984
2022
State-of-the-art neural language models are capable of generating incredibly fluent English text. This success provides opportunities for novel forms of interaction, where human writers work collaboratively with a natural-language generation system toward a set of goals. However, it also poses several challenges. Evaluating and comparing the skill of different open-ended text generation systems is difficult, and generated text can have negative societal impact if it proliferates and is not detectable by humans. In this dissertation, I introduce a detection-based evaluation task that can be used to compare different language models and generative configurations. By both asking humans to complete this task and training automatic classifiers to complete it, I investigate how the tradeoff between generating high-quality text and generating diverse text impacts detectability. Through subsequent large-scale user studies, I show that factors such as the model size and the topic of the generation can have significant influence on human detection capability. I show how large neural language models' capability of memorizing large swaths of their training data complicates our ability to evaluate their skill at generating high-quality novel text. I also show how, despite these challenges, neural language models can be successfully employed to support creative writing tasks. In particular, I describe methods for performing style transfer into any user-provided style and for efficiently supporting fill-in-the-blank operations in addition to the more standard continuation operation. Finally, I introduce an interactive writing tool we built which allows creative writers to collaborate with a natural language generation system to craft stories.
User studies with both novice and professional writers provide insights into the strengths and limitations of applying natural language generation systems in real-world settings.
en
Computer Sciences
language modeling
machine learning
natural language generation
natural language processing
Understanding the Limitations of Using Large Language Models for Text Generation
Dissertation/Thesis
2023-11-22