06.04.2024 18:36 Uhr, Quelle: Engadget
OpenAI and Google reportedly used transcriptions of YouTube videos to train their AI models
OpenAI and Google trained their AI models on text transcribed from YouTube videos, potentially violating creators’ copyrights, according to The New York Times. The report, which describes the lengths OpenAI, Google and Meta have gone to in order to maximize the amount of data they can feed to their AIs, cites numerous people with knowledge of the companies’ practices. It comes just days after YouTube CEO Neal Mohan said in an interview with Bloomberg Originals that OpenAI’s alleged use of YouTube videos to train its new text-to-video generator, Sora, would go against the platform’s policies.
According to the NYT, OpenAI used its Whisper speech recognition tool to transcribe more than one million hours of YouTube videos, which were then used to train GPT-4. The Information previously reported that OpenAI had used YouTube videos and podcasts to train the two AI systems. OpenAI president Greg Brockman was reportedly among the people on this team. Per Google’s rules, “unauthori
Weiterlesen bei Engadget