The programmer of Cursebird, @richardhenry, has released a preliminary leaderboard (That would have saved me a lot of time three weeks ago!). One interesting thing I see on the list is that @bollocks gets every one of his tweets counted because of his name. He doesn’t even have to bother cussing.
Most people have wanted to see a leaderboard, but they haven’t necessarily thought through what that means. As I was working with @ThinkingStiff, I thought a lot about the problems of creating one. Most people want bots excluded, and that seems easy at first. Bots like @ThinkingStiff and @Fuckbot are obviously there just to get to the top. But since Twitter doesn’t forbid bots, the job of excluding them is left to the programmer. A first-pass cleaning would be easy enough, but they will keep coming, I imagine, and it will become a daily task.
You also have bots that are reposting articles, scanning websites for keywords, or aggregating data. These are useful outside of the Cursebird universe, and many of them are interesting and have many followers. Should they be excluded? After that would come real people who are cussing their hearts out just to make it to the top. They have real accounts, real friends, and are really cussing. As soon as you have a leaderboard, people are going to compete to get to the top. That is just human nature. So pretty soon, you can be sure, the leaderboard will be made up entirely of people who are trying to be there. It’s not easy to tell who is just trying to be on the list and who just cusses a lot.
And once Cursebird has a list, with rules about who can be on it, there will inevitably be “good citizens” who want to rid the world of wrongdoers. They will scan everyone’s tweets looking for “fakes” and then email the programmer to complain. Keeping the leaderboard clean is not a job I would want to tackle manually on a daily basis.
Another problem I thought of is the effect of a leaderboard on website performance. Once there is a leaderboard, everyone will be clicking on these people to see who they are. Currently, a tweeter’s page displays every swear that tweeter has ever said. For someone like @ThinkingStiff, at over 7000 swears, the page takes a long time to load. Only displaying the most recent 500 might help with that.
It seems like all the manual solutions suck. I thought about a few automated ways to make bots less likely, but all of them depend on processing time on the database, and I don’t know how much time Cursebird has free in a day.
1. Limit the total daily swears to something more human, say 50. Twitter’s limit is 1000, of which about 500 will get counted (explanation). Only counting the first 50 swears would make bots much less useful.
2. Require an account to have at least one tweet over a month old. When someone starts to write a bot, they usually use a new account. If they have to wait a month to see any results, they will probably lose interest.
3. Require an account to have at least 10 followers. Very few people want to follow a bot. They may start, but will quickly stop following them.
4. Exclude duplicate swears. This is a big one and would eliminate most common bots. A bot tweets either random words, which people quickly discover, or a finite list of tweets that get repeated.
Suggestion #1 could be done in real time or as a batch on everyone. Suggestions #2, #3 and #4 would only need to be done on maybe 100 of the top cussers when building the leaderboard. Of course, a bot writer has ways around #3 and #4, but #1 is a big roadblock.
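To make the four suggestions concrete, here is a rough Python sketch of how they might fit together. I don’t know how Cursebird actually stores its data, so everything here is an assumption: the `Swear` and `Account` records, the field names, and the functions are all hypothetical, and "a month" is approximated as 30 days. For simplicity it runs every check on every account rather than only on the top 100.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical records; Cursebird's real schema is unknown.
@dataclass
class Swear:
    text: str
    tweeted_at: datetime

@dataclass
class Account:
    name: str
    followers: int
    oldest_tweet_at: datetime
    swears: list

DAILY_CAP = 50                        # suggestion 1
MIN_ACCOUNT_AGE = timedelta(days=30)  # suggestion 2, "a month" assumed to be 30 days
MIN_FOLLOWERS = 10                    # suggestion 3

def score(swears):
    """Count swears, skipping duplicates (4) and capping each day at DAILY_CAP (1)."""
    seen = set()
    per_day = defaultdict(int)
    for s in sorted(swears, key=lambda s: s.tweeted_at):
        key = s.text.strip().lower()
        if key in seen:
            continue  # a repeated tweet is never counted twice
        seen.add(key)
        day = s.tweeted_at.date()
        if per_day[day] < DAILY_CAP:
            per_day[day] += 1
    return sum(per_day.values())

def is_eligible(account, now):
    """Suggestions 2 and 3: old enough and followed by enough people."""
    old_enough = now - account.oldest_tweet_at >= MIN_ACCOUNT_AGE
    return old_enough and account.followers >= MIN_FOLLOWERS

def leaderboard(accounts, now, top_n=100):
    """Return (name, score) pairs for eligible accounts, highest score first."""
    scored = [(a.name, score(a.swears)) for a in accounts if is_eligible(a, now)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

With these rules, a week-old bot that repeats one tweet 200 times never appears on the list, and even on an old account those 200 tweets would count as a single swear.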
I’ll admit that a bot-free list would be interesting. Tweeters like @mollena are amazing. She just really cusses (and tweets) all the time.