My Infinite Website

I have a grudge against spammers, SEO practitioners, and similar low-lifes. Wherever I run Internet-facing web servers, I send “SEO search engines” down an infinite rabbit hole of web pages and images.

Spammers have ruined many valuable communications networks (FIDO, Usenet, PSTN, fax, email) for small personal gain. They’re basically guilty of crimes against humanity.

“Search Engine Optimization”, or SEO, is the practice of gaming Google and Bing through more-or-less unethical means to get them to tell searchers about whatever rubbish you want to sell, or your scams, instead of what they’re looking for. SEO people are the main reason why Google and Bing search has become less worthwhile. That and Google letting advertisers dictate search results.

That’s why I have a grudge.

The thumbnail for this post is a randomly-generated PNG image from a program I use to mess with low-life internet bottom-feeders.

Source code here

The program, written in the dodgy open source language PHP, can produce text files (like robots.txt), structured text (HTML and JSON), images (in PNG, GIF and JPEG formats), and random bytes. It can produce randomly-generated websites when run from a properly-configured web server. The HTML it randomly generates contains links to other “web pages” that can also be randomly generated when the time comes to serve them.
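The "infinite" part comes from every generated page linking to more URLs that don't exist until somebody asks for them. Here is a rough Python sketch of that idea. It is not the PHP in bork.php; the function names, the word generator, and the list of extensions are all made up for illustration.

```python
import random
import string


def random_word(rng, min_len=3, max_len=10):
    """Return a lowercase nonsense word (hypothetical helper)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_path(rng, extensions=(".html", ".png", ".gif", ".jpg", ".json")):
    """Return a made-up URL path such as /qwe/rtyuio.html."""
    depth = rng.randint(1, 4)
    segments = [random_word(rng) for _ in range(depth)]
    return "/" + "/".join(segments) + rng.choice(extensions)


def random_page(rng, link_count=10):
    """Build an HTML page whose links all point at other made-up paths.

    Nothing is stored: each linked path gets its own random page
    only if and when a crawler requests it.
    """
    title = " ".join(random_word(rng) for _ in range(3)).title()
    links = "\n".join(
        f'<li><a href="{random_path(rng)}">{random_word(rng)}</a></li>'
        for _ in range(link_count)
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n<ul>\n{links}\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    print(random_page(random.Random()))
```

Because nothing is saved between requests, the "website" costs no disk space, and a crawler that follows every link never runs out of new pages.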

Just to be a wiseacre, it also tries to set some random cookies in whatever browser accesses it.

Try it

These should work from the command line on a Linux (of course!) machine that has PHP and a few very common PHP packages installed, mainly GD.

$  SCRIPT_URL=something.gif php ./bork.php > random.gif
$  SCRIPT_URL=/robots.txt php ./bork.php > random.robots.txt
$  SCRIPT_URL=random.html php ./bork.php > random.html
$  SCRIPT_URL=garbage.json php ./bork.php > random.json

Invoked under its own name, it produces randomly-generated HTML that includes some Latin, but also references a lot of B- and C-list celebs of the 2010s, and mentions condiments and underwear.
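The commands above suggest that the requested name, passed in SCRIPT_URL, decides what kind of garbage comes out. A minimal Python sketch of that kind of dispatch follows. The table, the function name, and the fallback to a generic byte stream are assumptions, not a description of how bork.php actually decides.

```python
import os

# Hypothetical mapping from file extension to the kind of content to fake.
CONTENT_TYPES = {
    ".html": "text/html",
    ".php": "text/html",   # mirrors "invoked under its own name" producing HTML
    ".txt": "text/plain",
    ".json": "application/json",
    ".png": "image/png",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def content_type_for(script_url, default="application/octet-stream"):
    """Pick a content type from the extension of the requested URL.

    Anything unrecognized falls back to `default`, which here stands
    in for "random bytes" (an assumption for this sketch).
    """
    path = script_url.split("?", 1)[0]          # ignore any query string
    extension = os.path.splitext(path)[1].lower()
    return CONTENT_TYPES.get(extension, default)


if __name__ == "__main__":
    for url in ("something.gif", "/robots.txt", "random.html", "garbage.json"):
        print(url, "->", content_type_for(url))
```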

Try it!

Use it

Somewhere, you’ll have to load the mod_rewrite module:

LoadModule rewrite_module modules/mod_rewrite.so

Somewhere, you’ll have to turn on the rewrite module:

RewriteEngine on

I found the mod_rewrite install and config to be confusing on CentOS installations. You’re kind of on your own there.

After that, you can have your Apache web server answer requests from particular user agents with random crap, like this:

RewriteCond %{HTTP_USER_AGENT} ^.*Bytespider.*
RewriteRule ^(.*)$ /bork.php [L]

That RewriteCond and RewriteRule pair sends every request from the notoriously ill-mannered “Bytespider” crawler, no matter what URL, to the bork.php script.

Requests for robots.txt, index.html, and just about any image format, from Bytespider will all get randomly-generated garbage.

Here’s another way to treat some low-lifes correctly:

RewriteCond %{QUERY_STRING} .*XDEBUG_SESSION_START=phpstorm
RewriteRule ^(.*)$ /bork.php [L]

Lots of vulnerability scanners ask for something containing the string “XDEBUG_SESSION_START=phpstorm”. That rewrite condition causes bork.php to send back a randomly-chosen number of randomly-chosen bytes. I have no idea what some program that asks for “phpstorm” expects, but it’s going to get gibberish.
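"A randomly-chosen number of randomly-chosen bytes" is about as simple as a response gets. A Python sketch of the idea is below. The size limits and the function name are invented for the example, and this is not the PHP that bork.php runs.

```python
import os
import random


def random_blob(min_size=16, max_size=4096, rng=None):
    """Return a random-length run of random bytes.

    The size bounds are arbitrary assumptions for this sketch.
    """
    rng = rng or random.SystemRandom()
    size = rng.randint(min_size, max_size)   # inclusive on both ends
    return os.urandom(size)


if __name__ == "__main__":
    blob = random_blob()
    print(len(blob), "bytes of gibberish")
```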

Effects

BLEXbot requests per day

This is how BLEXbot reacted on one of my websites. About 2018-12-01, I set up my web server to give any request from BLEXbot a randomly-generated file. BLEXbot went from about 10 requests a day to over 1000 requests a day, all of them completely useless crap.

Sometime after 2020-01-01, BLEXbot requests dropped back to the 10-a-day range. My belief is that somebody noticed that one web server offered a nearly infinite profusion of info, took a look, and instituted some kind of fix.
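If you want to watch a crawler's appetite on your own server, requests-per-day numbers can be pulled out of the access log. The Python sketch below assumes Apache's common/combined log format, where the timestamp looks like `[01/Dec/2018:10:00:00 +0000]`. It matches the crawler name anywhere in the line, which is a simplification, and the function name is made up. It is not necessarily how the graph for this post was produced.

```python
import re
from collections import Counter

# Timestamp as written in Apache common/combined logs: [01/Dec/2018:10:00:00 +0000]
TIMESTAMP = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")
MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def requests_per_day(log_lines, agent_substring):
    """Count log lines per day that mention the given crawler name.

    The match is case-insensitive and looks at the whole line, so a URL
    containing the name would also count (a known simplification).
    """
    needle = agent_substring.lower()
    counts = Counter()
    for line in log_lines:
        if needle not in line.lower():
            continue
        match = TIMESTAMP.search(line)
        if match is None or match.group(2) not in MONTHS:
            continue  # skip lines without a recognizable timestamp
        day, month, year = match.group(1), MONTHS[match.group(2)], match.group(3)
        counts[f"{year}-{month:02d}-{day}"] += 1
    return dict(sorted(counts.items()))
```

Feed it an open log file, for example `requests_per_day(open(path), "BLEXbot")`, where `path` is wherever your server keeps its access log.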

Consequences

Your web server log files will fill up with long, horrible URLs. Some poorly-written crawlers (Ahrefs, BLEXbot, Yandex) will just go berserk, asking for tens of thousands of faked images, .torrent files and gibberish HTML pages.

This is dual-use technology, so you have to expect some blow-back.

Someone else is doing something similar

John Levine, one of the OG Morlocks of the internet, notes that he has [the world’s lamest content farm](https://mailman.nanog.org/pipermail/nanog/2024-April/225407.html).

The world’s lamest content farm looks like it has a template for a web page with 9 links. Each link shows a randomly-generated human name. The URLs linked to are based on those human names.

When I opened [the world’s lamest content farm](https://mailman.nanog.org/pipermail/nanog/2024-April/225407.html), I got some links including one to Petra Cody Carlene. Petra Cody Carlene’s server is at “petra-cody-carlene.web.sp.am”. That is, the Fully Qualified Domain Name is based on the human name. Hey, that’s a lot like this blog.

John Levine runs his own DNS servers, because he’s exceptionally connected and knowledgeable. He can provide IP addresses for his lame content farm’s many, many, many domain names.
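Going by the one example above, the host name looks like the human name lowercased and joined with hyphens. A Python sketch of that mapping follows. It is a guess from a single data point, not ChurnWare's actual rule, and `hostname_for` and its default parent domain are hypothetical. On the DNS side, one way to answer for every such name would be a wildcard record, though the post doesn't say how it's actually done.

```python
import re


def hostname_for(name, parent_domain="example.com"):
    """Turn a human name into a host name under parent_domain.

    Rule inferred from one example: lowercase the name and replace
    runs of anything that isn't a letter or digit with a hyphen.
    """
    label = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    label = label[:63].rstrip("-")   # a single DNS label is at most 63 octets
    if not label:
        raise ValueError("name contains no usable characters")
    return f"{label}.{parent_domain}"


if __name__ == "__main__":
    print(hostname_for("Petra Cody Carlene", "web.sp.am"))
```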

The note on the content farm’s web pages seems to indicate that they are generated by “IECC ChurnWare 0.3”.