Microsoft Faces Lawsuit Over Alleged Use of Pirated Books in AI Training

July 11, 2025

A group of authors has sued Microsoft, claiming the company used nearly 200,000 pirated books to train its artificial intelligence model, Megatron. The lawsuit, filed in a New York federal court, alleges that Microsoft trained the model on unauthorized digital copies of books by authors including Kai Bird, Jia Tolentino, and Daniel Okrent. The authors are seeking statutory damages of up to $150,000 for each infringed work and a court order blocking further use of the material. According to the complaint, Microsoft used a pirated dataset to develop a model built from the work of thousands of authors and designed to produce a broad range of content that imitates the style, tone, and themes of the copyrighted material it was trained on.

The complaint comes amid a broader wave of lawsuits by authors, news organizations, and entertainment companies challenging the use of copyrighted material in AI model development. Microsoft has yet to comment. The case follows recent court decisions involving Anthropic and Meta, which reflect ongoing legal uncertainty around fair use and AI training under U.S. copyright law.