There’s a running joke that the majority of entertainment news websites now are, basically, screenshots of one of five other sites that continue to pump out stories. Everyone else passes it on second hand, ideally at least with some contribution or opinion on it. But usually with 400 words of Google-teasing guff before they get to the embedded Tweet. It was not always so, but you could be forgiven for thinking otherwise. For whilst the world wide web was originally underpinned by ideals that had amongst them democratising writing and material, what’s actually happening is a slow erosion of its archive.
Don’t believe me? Go looking for a movie news story that was published in and around the turn of the millennium. The Wikipedia page for the superb animated film The Iron Giant is actually a good starting place. Head to the bottom of the page, to the long list of references. How many of those links are still active? Quite a few. How many lead to a page that’s not the specific article that was originally there? Quite a few also. Notwithstanding attempts to archive old articles – that I’ll come to – quite a lot of the articles about the film have simply disappeared. Sometimes behind a paywall, but also, oftentimes, they’ve just gone. Kaput. Not to be found anywhere else.
This has been happening a lot, but barely anybody is talking about it.
It’s no secret that websites, just like other publications and outlets, come and go. Furthermore, websites go through sizeable restructures and redesigns, the technology underpinning them evolves, and back material is sacrificed as it’s ‘not compatible’. These are the accidental erasures, when a redesign loses years of comments or material, with few discernible ways of getting any of it back.
The sites that vanish altogether also take a lot of work with them, without leaving even an archive behind. Digital publishing simply doesn’t have the permanence of print, and anyone who tells you otherwise should be looked at with untrusting eyes.
In my past, I edited a weekly computer magazine for just over a decade. Most of the material ended up on its website. Said magazine closed down a few years ago, and the entire archive of articles has gone with it. Were it not for the print copies in my garage, and some unreliable-looking archive discs, the material would be gone forever. Even so, without an appointment at my house, you won’t be able to read it. Even then, I’d want a nice coffee.
For web journalists, this is now the perilous nature of the beast. One relatively prominent American film reporter I know has lost work several times when different websites have merged, gone under or gone in a different direction. With the click of a button, the result of some decision way, way out of his hands, the online presence of his work has gone, and he’s not been able to get a lot of it back.
Much of this, to a degree, is known. What’s less known is that some websites are increasingly deleting older material, sacrificing it at the altar of Google. The ever-changing Google search algorithm – surely the rotten heart in the midst of online journalism – prioritises fast-loading websites. One way to get a website shifting quicker is to shed material from it. To give a smaller database of articles for it to host.
You’ll be heartened to know there’s a term for this material: “thin content”.
In this case, it might be duplicated material. It might be stuff that’s been syndicated from elsewhere. But it also might be news stories about long-released films. Of little immediate value to anyone, but also, nobody goes through old newspapers and deletes what’s on page 23 to save a bit of space. As archivists and historians will tell you, sometimes it’s what’s in the margins, or tucked away in small pieces that otherwise seem irrelevant, that leads to gold. At the very least, it’s part of the story of something. Yet generally algorithms are deciding what’s worth keeping and what isn’t. Even on sites that have been up and running for years, therefore, there are articles disappearing, to keep everything running that little bit quicker.
A fair trade off, some would argue. Heck, who hasn’t been cheesed off at the speed a site loads? Conversely, surely the reason to visit a website isn’t the pop-up banner, the request to send you alerts, the auto-playing video, the newsletter to sign up to, the advert that overlays your screen or the array of clickable ads inviting you to see what a woman who was once 18 now looks like now she’s 50. It’s for the article or video you wanted.
Is the answer to cut back on any of that? Is it heck. Instead, editorial archives are being pummelled, streamlined and stripped back, because that’s the stuff that’s harder to – here it comes – ‘monetise’.

The frontpage of the website archive.org
