Proposal for the reduction of trackback spam

Started by on Jan 18, 2006 – Contents updated: Jan 18, 2006

Jan 18, 2006 18:38    

I was just sitting here and doing some thinking. Right now I have a renamed HTSRV directory which has practically eliminated trackback spam. Sooner or later the trackback spammers are going to get as smart and annoying as the comment spammers (especially as comment spamming becomes more difficult). My suggestion is to entirely rethink how trackbacks work. This is something that b2evolution could take the lead on as I don't think anyone else has done this yet.

I've been thinking about how we could use captchas or other antispam techniques on trackbacks. Instead of having one trackback URL for each post that's listed on the site, how about dynamically generating trackback URLs as requested by users? Don't list the trackback URL on the post page; instead, give a form for requesting a private trackback URL. This would give us a place to use a captcha to verify a real user. Each user would get a unique, randomized trackback URL that would be valid for only one trackback. The temporary URLs and the articles they are associated with could be stored in a new database table, and a function could catch incoming trackbacks and post them to the appropriate article.
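As a rough illustration of the one-time URL idea, here is a minimal Python sketch (not b2evolution code). The class name `TrackbackTokenStore`, the `tb_token` query parameter, and the one-hour lifetime are all assumptions made up for the example, and an in-memory dict stands in for the new database table.

```python
import secrets
import time


class TrackbackTokenStore:
    """In-memory stand-in for the proposed table of temporary trackback URLs."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds  # assumed lifetime of a token
        self._tokens = {}  # token -> (post_id, expiry timestamp)

    def issue(self, post_id, now=None):
        """Create a random token for one post (call this after the captcha passes)."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (post_id, now + self.ttl_seconds)
        return token

    def redeem(self, token, now=None):
        """Return the post ID for a valid token and invalidate it, else None."""
        now = time.time() if now is None else now
        entry = self._tokens.pop(token, None)  # pop() makes the token single-use
        if entry is None:
            return None
        post_id, expires_at = entry
        return post_id if now <= expires_at else None


def trackback_url(base_url, token):
    """Build the private trackback URL (the URL layout here is hypothetical)."""
    return f"{base_url}/trackback.php?tb_token={token}"
```

The form handler would call `issue()` once the user has been verified, and the trackback receiver would call `redeem()` and discard the ping whenever it returns `None`.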

This would have to be a core feature change because it means multiple trackback URLs per article and a database schema change. I know that a lot of people think the blacklist is the end-all, be-all solution to spam, and it helps a lot, but it can't solve everything. The HTSRV rename solution is temporary at best, and this plan allows for a lot of future flexibility, including the possible future ability to track how many trackback URLs are requested by a given IP in a given amount of time and to rate limit that. The idea of having temporary, user-specific trackback URLs is a much more permanent solution to the trackback spam problem than what I have seen in the past. The current focus on reducing comment spam will certainly have the long-term effect of pushing more spammers to trackback spamming, and if b2evolution has a flexible framework in place for dealing with that threat, we'll be in a much better position to meet the challenge.
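The per-IP rate limiting mentioned above could look something like the following sliding-window sketch in Python. The class name `RequestRateLimiter` and the default limits (5 requests per hour) are assumptions chosen for illustration, not anything b2evolution provides.

```python
import time
from collections import defaultdict, deque


class RequestRateLimiter:
    """Sliding-window limit on trackback URL requests per IP address."""

    def __init__(self, max_requests=5, window_seconds=3600):
        self.max_requests = max_requests      # assumed limit per window
        self.window_seconds = window_seconds  # assumed window length
        self._history = defaultdict(deque)    # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        """Record a request and return True, or return False if the IP is over the limit."""
        now = time.time() if now is None else now
        recent = self._history[ip]
        # Forget requests that have fallen out of the window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

The check would sit in front of the URL request form, so an IP that asks for too many URLs in a short time simply gets no more of them until the window has passed.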

Jan 19, 2006 02:50

BenFranske wrote:

Instead of having one trackback URL for each post that's listed on the site, how about dynamically generating trackback URLs as requested by users? Don't list the trackback URL on the post page; instead, give a form for requesting a private trackback URL. This would give us a place to use a captcha to verify a real user. Each user would get a unique, randomized trackback URL that would be valid for only one trackback. The temporary URLs and the articles they are associated with could be stored in a new database table, and a function could catch incoming trackbacks and post them to the appropriate article.

Not a unique thought, it's been done: see my blog... http://www.village-idiot.org/archives/2006/01/17/the-duke-of-cleveland/

It uses a user-generated trackback with a unique and expiring key, and I should tell you that I get NO trackback spam. I do see occasional spam comments that get nuked immediately (I get the email, by choice), but absolutely NO TB spam.

This wouldn't be a difficult thing to code for b2evolution, and it's even simpler than you suggested.

Jan 19, 2006 02:54

Darn, thought I might have had a new idea. I guess I would try to do it without the JavaScript requirement, without the textarea, and with a shorter key, but I am glad to see that the idea is workable.

Jan 19, 2006 03:14

Well, for what it's worth, the JS and "click a link" requirements are two more assurances that it's actually a human. It's unbeatable, I assure you :)

It is possible to use the plain old TB URL, but guess what, those trackbacks go straight to /dev/null :lol:

In fact, just requiring the user to take that one extra step of clicking a link to get the correct URL is more than what any blog app has right now.

Jan 19, 2006 03:37

The click-a-link/button step (easily changed to submitting a captcha) is great, but I'm not crazy about the JS requirement because a lot of savvy users run with JS turned off for security purposes.

Jan 19, 2006 03:57

I will see if I can put together something similar for b2evo this weekend -- no promises though, and truthfully, the weekend after this next one is better (I have a three-day weekend; Friday the 27th is my birthday).

However, I have to comment on something...

Requiring a user to click on a captcha to get a trackback URL is not exactly friendly when it comes to accessibility.

On the other hand, TELLING a user that they need to enable JavaScript in order to use the form/click/whatever to get a link is "nicer" IMO. In nearly every (non) text-only browser it's only a couple more clicks to enable JS, reload the page, and voila, click the link -- there's the TB URL.

Purely internet-philosophy/webmaster choice off-topic talk below here:

Despite the fact that there are plenty of malicious things on the 'net, I do not surf with Java applets or scripts disabled. That isn't to say that there are not those that do. But honestly, those people watch too much CNN if you ask me :P

My own site has some fairly large images on it, as well as a good deal of Flash, and personally, I expect that people surfing my site are not using a crippled browser to view it, and in some respects I require that they are not using a crippled browser to enjoy it. (Note that those are two different things, viewing and enjoying.)

Jan 19, 2006 04:11

Understanding the accessibility issues surrounding graphical captchas, I have decided as a webmaster that they are the best method of further reducing spam and keeping my inbox from overflowing with comment spam. My preferred method of dealing with the accessibility issue is to have people who are unable to use the captcha email me their comment or trackback directly, and I will manually add it to the site. I find this far less time-consuming than going in and removing spam constantly, but I also seem to get more spam sooner than the average b2 user.

Regarding JS/Flash/etc., I really dislike gratuitous use of these tools. They're fine when they are the best or only solution, but most of the time that is not the case. Of course, I completely despise all HTML email too, which puts me in the weird category. As my mother would say, "why can't I use all those pretty colors in my email?"

I guess that's enough off-topic web philosophy though. I'd look into hacking this out myself, but I'm in the middle of rewriting a captcha class that has a good image generation core but is otherwise really bad. If you do get around to hacking something out, I would appreciate it if you used a submitted form (just make it a submit button) to get the custom trackback URL. That way people can very easily add a captcha if they so desire. If it's just a link, it becomes a bit harder for the average user/code hacker.

Jan 19, 2006 04:22

Your request is noted, Ben. I'll see what I can pull off.

---
Ack! You're in Minnesota! Edina, even... what IS with the traffic at Southdale????? :>

Jan 25, 2006 02:53

Any comment from an official developer?

Jan 25, 2006 21:07

I'm not quite sure how to implement this without a significant change in how trackbacks are handled by the core. I'm thinking about changing how trackbacks are handled and making them more similar to comments, where they can be individually tracked, karma can be applied, etc. This would also require a database schema change to support a table for creating temporary trackback URLs and linking them to a specific post. Any ideas about how to go about this are appreciated!
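One possible shape for such a table, sketched in Python with SQLite purely for illustration. The table name `trackback_tokens`, its columns, and the helper functions are assumptions, not an existing b2evolution schema.

```python
import sqlite3

# Hypothetical table linking each temporary trackback token to one post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS trackback_tokens (
    token      TEXT PRIMARY KEY,
    post_id    INTEGER NOT NULL,
    expires_at INTEGER NOT NULL,
    used       INTEGER NOT NULL DEFAULT 0
)
"""


def open_db(path=":memory:"):
    """Open the database and make sure the token table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_token(conn, token, post_id, expires_at):
    """Store a freshly issued token for a post."""
    with conn:
        conn.execute(
            "INSERT INTO trackback_tokens (token, post_id, expires_at) VALUES (?, ?, ?)",
            (token, post_id, expires_at),
        )


def claim_token(conn, token, now):
    """Mark an unused, unexpired token as used and return its post ID, else None."""
    with conn:
        cur = conn.execute(
            "UPDATE trackback_tokens SET used = 1 "
            "WHERE token = ? AND used = 0 AND expires_at >= ?",
            (token, now),
        )
        if cur.rowcount != 1:
            return None
        row = conn.execute(
            "SELECT post_id FROM trackback_tokens WHERE token = ?", (token,)
        ).fetchone()
    return row[0]
```

Doing the validity check inside the `UPDATE` itself means a token cannot be claimed twice, even if two pings arrive at nearly the same time.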

Jan 26, 2006 01:42

A plugin could create its own DB tables if needed. I've done this (locally) for the DNSBL plugin to save statistics.
Just take a look at that plugin once I've committed it.

And for the rest: as far as I know, trackbacks are currently handled very similarly to comments. In fact, both are handled as feedback in the same DB table (T_comments).

We'll also add hooks to the trackback-receiving part of htsrv, and plugins that do content filtering in general can simply provide those events as well to add/remove karma.

In this particular case you'd register an event to generate a time-based URL/ID through JavaScript (which would be easy with /htsrv/call_plugin.php, which I'm also working on) and look for that in the dbinsert() hook for the trackback.
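To make the time-based ID idea a bit more concrete, here is a hedged server-side sketch in Python (the JavaScript call and the plugin hooks themselves are not shown). The secret key, the 10-minute slot length, and the function names are all assumptions for the example. Unlike a stored one-time token, an ID like this needs no database table, but it can be reused until it expires.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical per-site secret, never sent to the client


def _sign(post_id, time_slot):
    """Derive a short ID from the post ID and a coarse time slot."""
    message = f"{post_id}:{time_slot}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:20]


def make_time_based_id(post_id, now=None, slot_seconds=600):
    """ID handed out when the user asks for the trackback URL."""
    now = time.time() if now is None else now
    return _sign(post_id, int(now // slot_seconds))


def check_time_based_id(post_id, candidate, now=None, slot_seconds=600):
    """Accept IDs from the current or the previous time slot only."""
    now = time.time() if now is None else now
    current_slot = int(now // slot_seconds)
    return any(
        hmac.compare_digest(_sign(post_id, slot), candidate)
        for slot in (current_slot, current_slot - 1)
    )
```

The check would run where the incoming trackback is about to be inserted, and anything without a valid ID would be dropped.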

Jan 26, 2006 04:35

Sounds like a bit too involved a project for me for the time being. Maybe if I can get some of the other projects cleared off my todo list, I'll look into it further. My goal here is really to document the idea in case someone else wants to run with it before I get a chance to do so.

Apr 18, 2006 22:58

... an attempt to revive the excellent ideas in this thread ...

Apr 19, 2006 16:36

Coincidentally, I got hit with my first bout of meaningless trackback spam overnight. Between 3 and 9 AM someone hit me with about 130 trackbacks saying "this is a very good site". Luckily, they only used msn.com, yahoo.com and google.com as the URLs, so I could do a mass deletion, but the problem will not be going away.

If someone is willing to write a plugin for tokenizing the trackback system as described above, and allowing it to be extended by comment spam protection plugins such as the captcha plugin, I would love to help test it.

