Microsoft deletes blog telling users to train AI on pirated Harry Potter books
February 20, 2026
Microsoft faced significant backlash after publishing a now-deleted blog post that suggested developers use pirated Harry Potter books to train AI models. Authored by senior product manager Pooja Kamath, the post aimed to promote a new feature for integrating generative AI into applications and linked to a Kaggle dataset that incorrectly labeled the books as public domain. Following criticism on platforms like Hacker News, Microsoft removed the post. The episode illustrates the risks of using copyrighted material without proper rights and the potential for AI to perpetuate intellectual property violations. Legal experts expressed concerns about Microsoft's liability for encouraging such practices, emphasizing that the boundary between AI development and copyright law remains blurred. The incident highlights the need for ethical guidelines on data sourcing in AI development to protect authors and creators from exploitation. As AI systems increasingly rely on vast datasets, understanding copyright law and establishing clear ethical standards become crucial to avoiding legal repercussions and ensuring responsible innovation in the tech industry.