Common Crawl Faces Backlash from Publishers Over AI Training Data
Publisher: Dr. Tony Hoang
Media Type: audio
Categories: Technology
Publication Date: Jun 13, 2024
Episode Duration: 00:04:28

Common Crawl, a nonprofit web archive, is facing backlash from publishers, including Danish media outlets, over its role in supplying AI training data. The publishers are demanding that Common Crawl remove copies of their articles from past data sets and stop crawling their websites. Common Crawl plans to comply, saying it cannot afford costly legal battles. The controversy has significant implications for academic research, which relies heavily on Common Crawl's data sets, and raises concerns about the future of innovation in AI.

--- Send in a voice message: https://podcasters.spotify.com/pod/show/tonyphoang/message
