OpenAI unveils model that can summarize books of any length

OpenAI has developed an AI model that can summarize books of arbitrary length. A fine-tuned version of the research lab’s GPT-3, the model works by first summarizing small sections of a book and then summarizing those summaries into higher-level summaries, following a paradigm OpenAI calls “recursive task decomposition.”


Summarizing book-length documents could be valuable in the enterprise, particularly for documentation-heavy industries like software development. A survey by SearchYourCloud found that workers need up to eight searches to find the right document, and McKinsey reports that employees spend 1.8 hours every day, or 9.3 hours per week on average, searching for and gathering job-related information.


“OpenAI believes that this is an effective ‘recipe’ that can be used to help humans supervise many other tasks,” a spokesperson told VentureBeat via email. “A scalable solution to the alignment problem needs to work on tasks that are difficult or time-consuming for humans to evaluate.”


AI-powered summarization


OpenAI is far from the first to apply AI to the problem of summarization. Startups like Primer use machine learning techniques to help parse and collate a large number of documents across several languages. Google has investigated summarization methods that can generate abstract summaries of paragraphs — as has Microsoft. And Facebook is reportedly developing an AI tool that summarizes news articles so that users don’t have to read them.


OpenAI’s new model builds on the company’s previous research, which found that training a model with reinforcement learning from human feedback helped to align model summaries with people’s preferences on short posts and articles. Reinforcement learning entails training a system to perform a task — for example, summarizing text — by rewarding desired behaviors and/or punishing undesired ones.
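
To illustrate the reward signal at the heart of this approach, here is a toy, self-contained sketch, not OpenAI's training code: candidate outputs are sampled from a simple policy, and probability mass shifts toward higher-reward behavior. The hand-written reward table stands in for human feedback; real RLHF learns a reward model from human comparisons and updates a language model with policy-gradient methods.

```python
import math
import random

# Toy "policy": a distribution over three candidate behaviors.
candidates = ["terse summary", "accurate summary", "rambling summary"]
logits = {c: 0.0 for c in candidates}
# Hand-written rewards standing in for human preference judgments.
reward = {"terse summary": 0.3, "accurate summary": 1.0, "rambling summary": -0.5}

def sample(logits):
    # Softmax sampling over the current logits.
    total = sum(math.exp(v) for v in logits.values())
    r = random.uniform(0, total)
    for c, v in logits.items():
        r -= math.exp(v)
        if r <= 0:
            return c
    return c  # numerical fallback

LEARNING_RATE = 0.1
for _ in range(1000):
    choice = sample(logits)
    # Reward desired behavior: raise the logit of a sampled choice in
    # proportion to the reward it receives (a bandit-style update).
    logits[choice] += LEARNING_RATE * reward[choice]

print(max(logits, key=logits.get))  # tends toward "accurate summary"
```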


To create the model, OpenAI combined reinforcement learning with recursive task decomposition, which procedurally breaks up a difficult task (e.g., summarizing a long piece of text) into simpler, individual ones (e.g., summarizing several shorter pieces). This decomposition allows humans to evaluate the model’s summaries quickly by using summaries of smaller parts of books. Moreover, it enables the model to summarize books of any length, from tens of pages to hundreds or thousands.
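
In code, the recursion might look like the following minimal sketch, where summarize() is a hypothetical stand-in for a call to a fine-tuned model and the chunking is deliberately naive; OpenAI's actual system splits books into sections and applies human feedback at each level of the hierarchy.

```python
CHUNK_SIZE = 2000  # characters per chunk; a real system splits on tokens or sections
MAX_LEN = 2500     # text at or below this length is summarized in a single pass

def summarize(text: str) -> str:
    """Hypothetical stand-in for one model call that returns a short summary."""
    raise NotImplementedError("wire this to a summarization model")

def split_into_chunks(text: str, size: int = CHUNK_SIZE) -> list[str]:
    # Naive fixed-size split; a production system would respect chapter boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def summarize_recursively(text: str) -> str:
    # Base case: the text fits in one model call.
    if len(text) <= MAX_LEN:
        return summarize(text)
    # Recursive case: summarize each chunk, then summarize the
    # concatenation of those summaries at the next level up.
    chunk_summaries = [summarize_recursively(chunk) for chunk in split_into_chunks(text)]
    return summarize_recursively("\n\n".join(chunk_summaries))
```

Because each level only ever sees text short enough for a single pass, the same procedure works whether the input is ten pages or a thousand, which is what lets human evaluators check summaries of small parts rather than rereading the whole book.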


Above: OpenAI claims its new model, a fine-tuned version of GPT-3, can summarize books like Alice in Wonderland.

Image Credit: OpenAI


OpenAI trained the model on a subset of the books in GPT-3’s training dataset, which were mostly fiction and contained over 100,000 words on average. To evaluate the model, the lab’s researchers took the 40 most popular books published in 2020, according to Goodreads, and assigned two people to read each book, write a summary, and then rate summaries from both the model and each other.


While the model successfully generated “book-level” summaries containing much of the important information, it also sometimes generated inaccurate statements due to a lack of context, OpenAI concedes in a paper. Moreover, the model’s summaries often read more like a list of events from the book than a coherent summary, revealing a limitation of task decomposition: it assumes that the separate parts of a task can be completed independently, which may not hold for summarizing books. For example, it might be hard to catch cases where details mentioned early in a book are only later revealed to be important, as is true of mystery novels.


“This work is part of our ongoing research into aligning advanced AI systems, which is key to our mission,” OpenAI researchers Jeffrey Wu, Ryan Lowe, and Jan Leike wrote in a blog post. “Our progress on book summarization is the first large-scale empirical work on scaling alignment techniques. Going forward, we are researching better ways to assist humans in evaluating model behavior, with the goal of finding techniques that scale to aligning artificial general intelligence.”


OpenAI hasn’t provided the source code or training dataset for the model. We’ve reached out to the company to see when — or if — it plans to make these public.
