{"id":2831,"date":"2025-06-16T20:38:21","date_gmt":"2025-06-16T20:38:21","guid":{"rendered":"https:\/\/musictechohio.online\/site\/chatgpt-polluted-ruined-ai-development\/"},"modified":"2025-06-16T20:38:21","modified_gmt":"2025-06-16T20:38:21","slug":"chatgpt-polluted-ruined-ai-development","status":"publish","type":"post","link":"https:\/\/musictechohio.online\/site\/chatgpt-polluted-ruined-ai-development\/","title":{"rendered":"ChatGPT Has Already Polluted the Internet So Badly That It&#8217;s Hobbling Future AI Development"},"content":{"rendered":"<div>\n<div><img width=\"2400\" height=\"1260\" src=\"https:\/\/wordpress-assets.futurism.com\/2025\/06\/chatgpt-polluted-ruined-ai-development.jpg\" class=\"attachment-full size-full wp-post-image\" alt=\"There may be no undoing the vast amounts of pollution wreaked by ChatGPT. And that's just tough luck for any AI models that come after it.\" style=\"margin-bottom: 15px;\" decoding=\"async\" fetchpriority=\"high\"><\/div>\n<p><span style=\"font-weight: 400;\">The rapid rise of ChatGPT \u2014 and the cavalcade of competitors&#8217; generative models that followed suit \u2014 has polluted the internet with so much useless slop that it&#8217;s already kneecapping the development of future AI models.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As the AI-generated data clouds the human creations that these models are so heavily dependent on amalgamating, it becomes inevitable that a greater share of what these so-called intelligences learn from and imitate is itself an ersatz AI creation.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Repeat this process <\/span><span style=\"font-weight: 400;\">enough, and AI development begins to resemble a maximalist game of telephone in which not only is the quality of the content being produced diminished, resembling less and less what it&#8217;s originally supposed to be replacing, but in which the participants <a style=\"font-weight: 400;\" 
href=\"https:\/\/futurism.com\/ai-industry-problem-smarter-hallucinating\">actively become stupider<\/a>. The industry likes to describe this scenario as AI &#8220;<\/span><a style=\"cursor: pointer !important; user-select: none !important;\" href=\"https:\/\/futurism.com\/the-byte\/ai-trained-with-ai-generated-data-gibberish\"><span style=\"font-weight: 400;\">model collapse<\/span><\/a><span style=\"font-weight: 400;\">.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As a consequence, the finite amount of data predating ChatGPT&#8217;s rise becomes extremely valuable. In a <\/span><a href=\"https:\/\/www.theregister.com\/2025\/06\/15\/ai_model_collapse_pollution\/\"><span style=\"font-weight: 400;\">new feature<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><i><span style=\"font-weight: 400;\">The Register <\/span><\/i><span style=\"font-weight: 400;\">likens this to the demand for &#8220;low-background steel,&#8221; or steel that was produced before the detonation of the first nuclear bombs, starting in July 1945 with the US&#8217;s Trinity test.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Just as the explosion of AI chatbots has irreversibly polluted the internet, so did the detonation of the atom bomb release radionuclides and other particulates that have seeped into virtually all steel produced thereafter. That makes modern metals unsuitable for use in some highly sensitive scientific and medical equipment. 
And so, what&#8217;s old is new: a major source of low-background steel, even today, is World War I- and World War II-era battleships, including a huge naval fleet that was scuttled by German Admiral Ludwig von Reuter in 1919.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Maurice Chiodo, a research associate at the Centre for the Study of Existential Risk at the University of Cambridge, called the admiral&#8217;s actions the &#8220;greatest contribution to nuclear medicine in the world.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;That enabled us to have this almost infinite supply of low-background steel. If it weren&#8217;t for that, we&#8217;d be kind of stuck,&#8221; he told <\/span><i><span style=\"font-weight: 400;\">The Register<\/span><\/i><span style=\"font-weight: 400;\">. &#8220;So the analogy works here because you need something that happened before a certain date.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;But if you&#8217;re collecting data before 2022 you&#8217;re fairly confident that it has minimal, if any, contamination from generative AI,&#8221; he added. &#8220;Everything before the date is &#8216;safe, fine, clean,&#8217; everything after that is &#8216;dirty.&#8217;&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In 2024, Chiodo co-authored a <a href=\"https:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=5045155\">paper<\/a> arguing that there needs to be a source of &#8220;clean&#8221; data not only to stave off model collapse, but to ensure fair competition between AI developers. Otherwise, the early pioneers of the tech, after ruining the internet for everyone else with their AI&#8217;s refuse, would boast a massive advantage by being the only ones that benefited from a purer source of training data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Whether model collapse, particularly as a result of contaminated data, is an imminent threat is a matter of some debate. 
But many researchers have been <\/span><a href=\"https:\/\/futurism.com\/the-byte\/ai-running-out-data-smarter\"><span style=\"font-weight: 400;\">sounding the alarm<\/span><\/a><span style=\"font-weight: 400;\"> for years now, including Chiodo.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;Now, it&#8217;s not clear to what extent model collapse will be a problem, but if it is a problem, and we&#8217;ve contaminated this data environment, cleaning is going to be prohibitively expensive, probably impossible,&#8221; he told <\/span><i><span style=\"font-weight: 400;\">The Register<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One area <\/span><span style=\"font-weight: 400;\">where the issue has already reared its head is retrieval-augmented generation (RAG), a technique that AI models use to supplement their dated training data with information pulled from the internet in real time. But this new data isn&#8217;t guaranteed to be free of AI tampering, and <\/span><a href=\"https:\/\/futurism.com\/ai-models-falling-apart\"><span style=\"font-weight: 400;\">some research<\/span><\/a><span style=\"font-weight: 400;\"> has shown that this leads chatbots to produce far more &#8220;unsafe&#8221; responses.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The dilemma is also reflective of the broader debate around <a href=\"https:\/\/futurism.com\/ai-researchers-tech-industry-dead-end\">scaling<\/a>, or improving AI models by adding more data and processing power. 
After OpenAI and other developers reported diminishing returns with their <\/span><a href=\"https:\/\/futurism.com\/the-byte\/openai-diminishing-returns\"><span style=\"font-weight: 400;\">newest models in late 2024<\/span><\/a><span style=\"font-weight: 400;\">, some experts proclaimed that scaling had hit a &#8220;wall.&#8221; And if the training data is increasingly slop-laden, <\/span><span style=\"font-weight: 400;\">the wall will become that much more impassable.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Chiodo speculates that stronger regulations, such as requirements to label AI content, could help &#8220;clean up&#8221; some of this pollution, but they would be difficult to enforce. In this regard, the AI industry, which has <a href=\"https:\/\/futurism.com\/the-byte\/openai-eu-sam-altman\">cried foul<\/a> at any <a href=\"https:\/\/futurism.com\/openai-over-copyrighted-work\">government interference<\/a>, may be its own worst enemy.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8220;Currently we are in a first phase of regulation where we are shying away a bit from regulation because we think we have to be innovative,&#8221; Rupprecht Podszun, professor of civil and competition law at Heinrich Heine University D\u00fcsseldorf, who co-authored the 2024 paper with Chiodo, told <\/span><i><span style=\"font-weight: 400;\">The Register<\/span><\/i><span style=\"font-weight: 400;\">. &#8220;And this is very typical for whatever innovation we come up with. 
So AI is the big thing, let it go and fine.&#8221;<\/span><\/p>\n<p><strong>More on AI: <\/strong><em><a href=\"https:\/\/futurism.com\/openai-altman-electricity-ai\">Sam Altman Says &#8220;Significant Fraction&#8221; of Earth&#8217;s Total Electricity Should Go to Running AI<\/a><\/em><\/p>\n<p>The post <a href=\"https:\/\/futurism.com\/chatgpt-polluted-ruined-ai-development\">ChatGPT Has Already Polluted the Internet So Badly That It&#8217;s Hobbling Future AI Development<\/a> appeared first on <a href=\"https:\/\/futurism.com\/\">Futurism<\/a>.<\/p>\n<\/div>\n<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" ><\/div>","protected":false},"excerpt":{"rendered":"<p>The rapid rise of ChatGPT \u2014 and the cavalcade of competitors&#8217; generative models that followed suit \u2014 has polluted the internet with so much useless slop that it&#8217;s already kneecapping&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1256,177,196,183,179],"tags":[],"class_list":["post-2831","post","type-post","status-publish","format-standard","hentry","category-ai-models","category-artificial-intelligence","category-chatgpt","category-generative-ai","category-openai"],"_links":{"self":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts\/2831","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/comments?post=2831"}],"version-history":[{"count":0,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts\/2831\/revision
s"}],"wp:attachment":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/media?parent=2831"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/categories?post=2831"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/tags?post=2831"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}