
Digital Event Horizon

The Digital Preservation of Human Creativity: A Catalog for a Post-AI Era



As artificial intelligence (AI) generates a growing share of online content, preserving a record of human-made work has become a pressing concern. A new website, lowbackgroundsteel.ai, catalogs archives of content created before the rise of AI-generated media.


  • lowbackgroundsteel.ai catalogs sources of human-made content that predate the rise of AI-generated media.
  • The catalog points to existing archives, including Wikipedia dumps, Project Gutenberg, and GitHub's Arctic Code Vault.
  • The name draws a parallel between salvaging "low-background steel" made before nuclear testing and salvaging human-created media made before generative AI.
  • Researchers worry that AI models trained on their own outputs degrade over time ("model collapse"), but recent research suggests the effect can be mitigated when synthetic data accumulates alongside real data.
  • The project documents pre-AI human creativity as a resource for studying how language and communication change, not as a statement against AI.
  • Such archives may grow more valuable as distinguishing between human and machine output becomes increasingly difficult.


    Former Cloudflare executive John Graham-Cumming recently launched lowbackgroundsteel.ai, a website that catalogs archives of content created before the rise of AI-generated media.

    The name comes from a problem that emerged with the nuclear age. Atmospheric nuclear testing from 1945 onward spread radioactive particles that ended up in newly produced steel worldwide. To obtain metal with low enough background radiation for sensitive instruments, scientists salvaged steel from pre-war shipwrecks, known as "low-background steel." Graham-Cumming draws a parallel with today's web, where AI-generated content has become ubiquitous and makes authentic human-created media harder to identify.

    The wordfreq project, created by researcher Robyn Speer, was one of the casualties of this new landscape. This Python library tracked word frequencies across more than 40 languages by analyzing millions of sources, including Wikipedia, movie subtitles, news articles, and social media. The tool was widely used by academics and developers to study language evolution and build natural language processing applications. However, the project has since announced that it will no longer be updated, citing the proliferation of "slop" generated by large language models following ChatGPT's emergence in 2022.
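
    As a rough illustration of what word frequency tracking involves, the following Python sketch counts words in a tiny in-memory corpus and converts proportions to the Zipf scale (log10 of occurrences per billion words). It is not wordfreq's implementation; the function names, the simple tokenizer, and the sample sentences are assumptions made for this example.

```python
import math
import re
from collections import Counter


def word_frequencies(texts):
    """Return each word's share of all tokens across the given texts."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def zipf_scale(frequency):
    """Convert a proportion to the Zipf scale: log10 of occurrences per billion words."""
    return math.log10(frequency * 1e9)


# Hypothetical sample corpus; a real study would use millions of documents.
corpus = ["The cat sat on the mat.", "The dog sat."]
freqs = word_frequencies(corpus)

for word in sorted(freqs, key=freqs.get, reverse=True)[:3]:
    print(word, round(freqs[word], 3), round(zipf_scale(freqs[word]), 2))
```

    Frequencies measured this way are only as trustworthy as the corpus behind them, which is why machine-generated text mixed into the sources undermines the results.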

    Researchers have expressed concerns about AI models training on their own outputs, potentially leading to quality degradation over time, a phenomenon known as "model collapse." However, recent research suggests that model collapse can be mitigated when synthetic data accumulates alongside real data rather than replacing it. When properly curated and combined with real data, synthetic data from AI models can even help train newer, more capable models.
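
    The difference between replacing and accumulating data can be sketched with a toy simulation. This is not the setup used in the research mentioned above; it assumes a one-dimensional Gaussian "model" that is repeatedly refit to its training pool, and the function name, parameters, and seed are all hypothetical.

```python
import random
import statistics


def simulate(generations, n, accumulate, seed=0):
    """Refit a Gaussian each generation and record its fitted standard deviation.

    accumulate=False: each generation trains only on the previous model's samples.
    accumulate=True: synthetic samples are added to a pool that keeps the real data.
    """
    rng = random.Random(seed)
    pool = [rng.gauss(0.0, 1.0) for _ in range(n)]  # the original "real" data
    stdevs = []
    for _ in range(generations):
        mu = statistics.fmean(pool)
        sigma = statistics.stdev(pool)
        stdevs.append(sigma)
        synthetic = [rng.gauss(mu, sigma) for _ in range(n)]
        pool = pool + synthetic if accumulate else synthetic
    return stdevs


replaced = simulate(generations=50, n=100, accumulate=False)
accumulated = simulate(generations=50, n=100, accumulate=True)
print("replace:    first %.3f, last %.3f" % (replaced[0], replaced[-1]))
print("accumulate: first %.3f, last %.3f" % (accumulated[0], accumulated[-1]))
```

    In toy setups like this, the replace variant tends to drift away from the original distribution as estimation errors compound, while the accumulate variant stays anchored by the real data that remains in the pool. Exact numbers depend on the seed and sample size.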

    Graham-Cumming's website serves as a catalog of pre-AI content, pointing to sources that predate ChatGPT and AI-generated contamination. These include Wikipedia dumps from August 2022, Project Gutenberg's collection of public domain books, the Library of Congress photo archive, and GitHub's Arctic Code Vault, a snapshot of open source code buried in a former coal mine near the North Pole in February 2020. The site also accepts submissions of other pre-AI content sources through its Tumblr page.

    Graham-Cumming emphasizes that the project aims to document human creativity from before the AI era, not to make a statement against AI itself. Rather, it seeks to preserve sources of human expression now, as they may become valuable in the future for understanding how language evolves and how human communication has changed over time. For instance, Graham-Cumming proposed the concept of a "cryptographic ark," a timestamped archive of pre-AI media that future historians could verify as authentic.
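
    The article does not describe how such an ark would be built. One possible ingredient, sketched here under stated assumptions, is a manifest of SHA-256 digests whose combined digest is published at a known time. The function names and sample documents are hypothetical, and the trusted timestamping step itself (for example, a third-party timestamp service) is outside the scope of this sketch.

```python
import hashlib
import json


def build_manifest(documents):
    """Map each document name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(documents.items())}


def manifest_digest(manifest):
    """Hash a canonical JSON encoding of the manifest into a single digest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(documents, expected_digest):
    """Check that the documents still match a previously published digest."""
    return manifest_digest(build_manifest(documents)) == expected_digest


# Hypothetical archive contents; real archives would read files from disk.
archive = {
    "essay.txt": b"Written by a person in 2019.",
    "poem.txt": b"Roses are red, violets are blue.",
}
published = manifest_digest(build_manifest(archive))

tampered = dict(archive)
tampered["poem.txt"] = b"Roses are red, rewritten later."

print(verify(archive, published))   # unchanged archive matches
print(verify(tampered, published))  # altered archive does not match
```

    A digest alone only proves that content is unchanged since the digest was recorded. Proving when it was recorded requires an independent, trusted record of the publication time.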

    As AI continues to transform the web, preserving records of human creativity and expression grows more important. Lowbackgroundsteel.ai stands as a modest catalog of human expression from what may someday be seen as the last pre-AI era. It marks the boundary between human-generated and hybrid human-AI cultures and serves as a digital archaeology project for understanding how human communication evolved before AI entered the conversation.

    In an age where distinguishing between human and machine output grows increasingly difficult, these archives may prove invaluable for researchers, historians, and scholars seeking to understand the impact of AI on human creativity. As we move forward into a post-AI era, it is essential that we prioritize the preservation of human expression, ensuring that our digital legacy remains authentic and meaningful.



    Related Information:
  • https://www.digitaleventhorizon.com/articles/The-Digital-Preservation-of-Human-Creativity-A-Catalog-for-a-Post-AI-Era-deh.shtml

  • https://arstechnica.com/ai/2025/06/why-one-man-is-archiving-human-made-content-from-before-the-ai-explosion/


  • Published: Wed Jun 18 12:36:04 2025 by llama3.2 3B Q4_K_M










