New nonprofit to certify AI models that only use copyrighted training data with permission


A new nonprofit will issue certifications to artificial intelligence companies that don’t include copyrighted materials in their training datasets unless they obtain permission to do so.

The organization, Fairly Trained, launched today. It was founded by Ed Newton-Rex, the former vice president of audio at well-funded generative AI startup Stability AI Ltd. Newton-Rex reportedly left the company in November after disagreeing with its stance on copyrighted training data.

Several generative AI companies, including OpenAI, have argued that training a neural network on copyrighted materials constitutes fair use. That practice has drawn criticism and, in some cases, legal action from copyright holders. Last month, The New York Times sued OpenAI for allegedly training its language models on millions of articles without permission.

Fairly Trained’s work is supported by a four-person advisory committee. The committee’s members include Siri co-creator Tom Gruber, attorney Elizabeth Moody, composer Max Richter and Maria Pallante, the Association of American Publishers’ chief executive officer.

Obtaining a certification from Fairly Trained doesn’t require AI developers to avoid using copyrighted data altogether. However, they must obtain a license from the copyright holder before doing so. A training dataset may also include materials owned by the AI developer as well as content that is distributed under an open-source license or is in the public domain.

There are two other prerequisites to obtaining a certification. First, AI developers must implement a due diligence workflow for ensuring the training data they collect from external sources is not subject to copyright restrictions. Second, they are required to create a database that logs the records used in each AI training project.
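Fairly Trained has not published a schema for that record log, so the following Python snippet is only a rough sketch of what such a log and due diligence check might look like. The `Basis` categories mirror the permitted sources described above, while the class names, the `unverified` helper and the file paths are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    """Hypothetical categories for why a record may be used in training."""
    LICENSED = "licensed"            # license obtained from the copyright holder
    OWNED = "owned"                  # material owned by the AI developer
    OPEN_LICENSE = "open_license"    # distributed under an open license
    PUBLIC_DOMAIN = "public_domain"  # no longer under copyright
    UNKNOWN = "unknown"              # no documented basis yet


@dataclass(frozen=True)
class TrainingRecord:
    source: str   # illustrative path or identifier of the training item
    basis: Basis  # documented permission basis for using it


def unverified(records):
    """Return the records that have no documented permission basis."""
    return [r for r in records if r.basis is Basis.UNKNOWN]


# Illustrative log for one training project (made-up entries).
log = [
    TrainingRecord("in-house-recordings/track-001.wav", Basis.OWNED),
    TrainingRecord("partner-catalog/track-417.wav", Basis.LICENSED),
    TrainingRecord("scraped/unknown-clip.wav", Basis.UNKNOWN),
]

flagged = unverified(log)
print([r.source for r in flagged])
```

In this sketch, any record flagged by `unverified` would need a documented license or other basis, or would have to be dropped, before the dataset is used for training.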

“As our first certification, we don’t expect it to solve all the issues for creators that generative AI training raises,” Fairly Trained wrote in a blog post today. “But we hope that it highlights that there is a meaningful difference between generative AI companies that license training data and those that use data without consent.”

Applying for a Fairly Trained certification costs $500 for companies that generate more than $10 million in annual revenue. If the application is accepted, there’s a $6,000 yearly fee. AI developers with annual revenues of less than $10 million can obtain a certification for as little as $500 per year.

Ahead of its launch today, Fairly Trained issued certifications to nine generative AI startups. Eight provide tools for creating music and other audio assets, while one offers an image generation service.

Fairly Trained plans to roll out additional certifications in the future. According to the nonprofit, those certifications will encompass more aspects of AI developers’ training data sourcing practices, including whether they provide opt-out options for copyright holders. OpenAI, Stability AI and a number of other companies already offer such an option. 

Photo: Unsplash
