"Content" betrays our trust
The tyranny of the SEO keyword and the weird relationship between Google and reddit
“How can you even cheat at chess?”
That was my first thought when I heard about the cheating scandal surrounding Hans Niemann, a teenage chess player who defeated Magnus Carlsen, one of the most feted chess players of all time. Chess is a game of complete information: No one can hide their pieces or moves. It’s not like Yu-Gi-Oh!, a card game in which you can cheat by marking powerful cards so that you know where they are in the deck, or by bypassing or repeating certain actions governed by the game’s complex, multi-phase rules.
It turns out that you can cheat, by some combination of:
Memorizing the moves that a computer would make given the same scenario.
Using various aids such as a phone or computer to get a signal about what move to make.
Toggling to a second screen during play to more easily see how different moves may look and/or play out.
However, these approaches are all complicated and often difficult to either prove or even explain. That makes them bad candidates for #clickable #content.
Click this content, please
But you know what’s way easier to explain and to get people to click on? The theory that Niemann was using vibrating, wirelessly networked anal beads to get Morse code signals about what moves to make. Yes.
The reams of eminently disposable content about these entirely theoretical anal beads had a lot of fun with the subject matter—and got a lot of engagement—while offering no proof of Niemann actually using them.
The original source for the accusation was fittingly flimsy: a joke in someone’s Twitch stream. Yet the story became an overnight SEO sensation, as well as a bottomless well of inspiration for the types of lengthy, pseudoscientific comments that have become Reddit’s trademark.
The first Reddit post to really run with this theory treads the line between parody and serious, condescending explanation, beginning characteristically with “the real answer is actually elementary.”
Others went further with the earnestness and acted like the inherent silliness of Niemann’s theoretical tactic was beside the point, as though the beads were obviously just another iteration of the cheating in chess that the post author assumes has always been happening (again, no specific proof is offered).
Overall, this story is a good example of something being simply “too good to check.” It was so sensational that journalists, or rather content writers, couldn’t pass up the chance to write about it. But it’s best understood as the symptom of something more sinister: the tyranny of the SEO keyword, and how SEO content interacts with reddit.
SEO keywords and the interchangeability of “content”
Think of how you search for something on Google. What do you type?
Let’s say you’re looking for instructions on how to fix a kitchen window. You know what the window looks like, but know the name of neither the window type nor the window parts you need.
So you try out “fix windows,” but quickly find that all the top results for that are about Microsoft Windows. You pivot to “fix window” and get bombarded with ads and then some window repair company homepage links. A lone Home Depot guide is the only result on page one that offers any practical advice.
After some scouring around, you finally figure out that what you need to know is how to fix a “double hung window.” From there, you query “double hung window diagram” and get to know all about shoes and pivot rods.
Along the way, each one of these search queries returns its own set of ads and strangely similar “content.” Even for the highly specific search string “how to fix double hung window that won’t stay up” (Google autosuggests this if you try “fix double hung…”), the three top-ranking articles all start with some form of “hey so it’s really annoying when your double hung window won’t stay up, isn’t it?!” and then go into almost identical lists of required steps.
What are keywords, anyway?
They also share SEO keywords, the technical term for the search queries that you enter into a search engine such as Google. “Won’t stay up,” “balancing mechanisms,” “pivot bars” and “double-hung window” are all pivotal here.
Keyword research is a core part of SEO and by extension the entire act of writing for a content mill. When it’s not being forced on you by an opaque algorithm such as on TikTok, “content” is often discovered by entering keywords. So as a “content creator,” you want to ensure that what you’re making can rank highly for these words.
Unfortunately for your readers (but perhaps fortunately for you), your content may get surfaced even if the search term doesn’t appear anywhere in your content. That’s because:
Search engines “know” which keywords are semantically related to each other. They may serve you content about cloud access security brokers (CASBs) when you searched instead for “private cloud broker.” In the image below, “private cloud” appears nowhere on the page of the Netskope search result.
Google is too smart for its own good. In addition to aggressive autocomplete suggestions, it “thinks” it knows what you want even if you try to be specific. When I tried to figure out a niche issue about whether the LG C1 TV could pass through 5.1 PCM audio from the Nintendo Switch to a sound bar, most of the results were instead for the more common LG CX, a totally different model.
I was eventually able to figure out the answer to this question, but only by trying some different configurations in real life.
SEO keyword research software helps you find the semantically related queries and easily determine how many times you should mention them in your content. Some content mills assign tiers to their keywords, advising 10+ mentions for Tier 1, 5-10 for Tier 2, and 1-4 for Tier 3.
Even though Google’s algorithms are proprietary, the SEO industry has excelled at reverse-engineering how they work, and with the proliferation of SEO-specific software, it’s no wonder that a lot of content looks the same. It’s interchangeable, and literally formulaic: Find the keywords, add them, and then score the piece before it gets published to ensure it can rank. Rinse and repeat as needed when updating the piece to be more timely, for example, if a key fact changed.
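To show just how formulaic that scoring step can be, here is a minimal Python sketch of it. This is my own illustration, not how any particular SEO product works: the function names (count_mentions, score_draft) are hypothetical, and the only thing taken from real life is the tier ranges quoted above (10+ for Tier 1, 5-10 for Tier 2, 1-4 for Tier 3). It assumes exact phrase matching, so “double-hung” and “double hung” would count as different keywords.

```python
import re

# Mention ranges per tier as quoted in the text: (minimum, maximum), None = no cap.
TIER_RANGES = {1: (10, None), 2: (5, 10), 3: (1, 4)}


def count_mentions(text, keyword):
    """Count case-insensitive, whole-phrase occurrences of keyword in text."""
    # Lookarounds stop "window" from matching inside "windows".
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score_draft(text, keywords):
    """Map each keyword to (mention count, whether the count fits its tier's range).

    keywords is a dict of {keyword phrase: tier number}; this shape is an
    assumption made for the sketch.
    """
    report = {}
    for keyword, tier in keywords.items():
        low, high = TIER_RANGES[tier]
        n = count_mentions(text, keyword)
        report[keyword] = (n, n >= low and (high is None or n <= high))
    return report


# Hypothetical draft and keyword list, reusing the window example.
draft = "Is your double hung window stuck? A double hung window relies on pivot bars."
keywords = {"double hung window": 3, "pivot bars": 3, "balancing mechanisms": 3}
for keyword, (n, ok) in score_draft(draft, keywords).items():
    print(f"{keyword}: {n} mention(s), {'in range' if ok else 'out of range'}")
```

A draft that fails the check gets another pass until every keyword lands in its range, which goes some way toward explaining why the resulting articles read so much alike.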
Those similar-sounding and almost identically structured double hung window articles, like the reams of same-y stories about the chess scandal, are a symptom of the overall interchangeability of “content.” Going back to the chess example, a search for “chess scandal” as of Nov. 4, 2022 gives you lots of stuff with “beads” in it even if you didn’t type that; the Google search interface also suggests adding “beads” and “reddit” to the query. It wants to serve you something from that enormous cache of “can you use anal beads to cheat at chess? Hmm, maybe…let’s find out after three paragraphs of keyword-stuffed introductions” posts.
The interchangeability on display here is perhaps even more evident in the case of streaming video.
In my inaugural post, I had lamented “the gray sludge of almighty ‘content,’” and later I harped on how “streaming services are built to overwhelm.” Nate Levy expressed similar sentiments when he talked about how movies and TV shows had become interchangeable cogs in a content machine rather than pieces of art with individual identities.
The creation of this Great Content Heap inevitably requires lots of recycling.
As discussed on our podcast about managerial coping strategies at content mills, content writers are under immense pressure to churn out multiple stories per day, which means scouring Google for press releases and industry blogs that can be reworded without having to add any original research or reporting. Similarly, the TV and film industry that Levy is discussing has fallen back on franchise-related content that doesn’t require much in the way of new characters or approaches to cinematography. Everything on Netflix literally looks the same, from the types of cameras it’s shot on and the common compression artifacts you’ll notice, to the franchise-y and revivalist nature of it all.
So where do oddball ideas that would otherwise never make it through the content machine come from? Often, as is the case with this chess cheating scandal scenario, the answer is reddit.
Reddit: The first draft of the internet
If you’re not familiar, reddit is essentially a megaforum, covering every imaginable topic. It began in the mid-2000s and has remained remarkably resilient despite the huge changes in how people use the internet between then and now.
Reddit is often billed as “the front page of the internet,” a metaphor that hearkens back to a time when newspapers were still culturally and economically powerful enough to set talking points for the day. But to me it’s always seemed more like the first draft of the internet. Stories that pop up later in publications such as Gizmodo, The New Republic, and The Verge, to say nothing of numerous SEO tips articles that rely on questions and answers in reddit discussions, often began with someone’s idea or tip on Reddit.
This Verge story on how Xbox Cloud Gaming runs worse on Linux, and what you can do to work around it, would’ve been tough to write without the examples from a gaming subreddit. And the chess #beads content wouldn’t have achieved escape velocity without the credibility bestowed on it by lengthy reddit diatribes, or rather comments, on the subject.
Why is reddit so instrumental in how these stories, and so much of the SEO content, are shaped? My theory is that reddit is one of the only high-traffic sites remaining where you can get someone’s direct opinion or knowledge, unadulterated with any SEO markers or fake-sounding LinkedIn hustle language.
Why reddit is only going to grow stronger
Reddit launched in 2005, when I was just beginning college. It had nowhere near the breakthrough popularity that Facebook enjoyed at that time (my campus was one of the first to get access). Over the years, it waxed and waned in the public eye—gradually becoming a major source of referral links, but also repelling plenty of people with its reputation as a toxic haven for know-it-alls.
For every spurious SEO seedling like the beads story that it planted into the SEO seedbed, reddit (and smaller forums with similar structure) also serves up invaluable tips that are tough to find through search engines. Reddit was by far the best source I found when trying to figure out how to get surround sound from a Nintendo Switch (watch for a very in-the-weeds post on this coming up soon), and it’s also the most reliable forum for advice on topics such as hair loss, which on Google are by contrast just a haven of SEO spam and advertising garbage.
SEO as well as emerging tools such as “AI”-powered chatbots both exist in part because of the illusion that machines always have the answers that humans want. But they don’t, and as such, why should we waste our time “talking” to them and trying to “please” them? I mean, here’s Google’s featured snippet for “Does Nintendo Switch work with eARC?” (It absolutely does.)
All of the SEO that got this post into this box, plus all of the computer science underpinning Google, served up an answer that’s worse than what any random human with internet access and no real SEO knowledge of their own could’ve given. And lots of those people are on reddit.
So I’m bullish on #human #content, to use a phrase that makes me cringe. The likes of reddit et al. do indirectly give us some terrible, untrustworthy stories, but that’s only because of the bad incentives and overall dumbness of machine-dominated SEO infrastructure. Well-curated and moderated forums are far superior sources of knowledge to searching Google—before Elon Musk bought it, Twitter was (and still is, barely) the world’s greatest open-ended chat room/forum, because it was a stream of human thoughts unmediated by SEO.
With SEO, you get misleading garbage like the chess beads scandal. Without it, you get to learn why waiting on a hair loss cure-all probably isn’t a good idea.