1 Aug 22, 2008 22:55    

Hey,

here is my new plugin to protect you from spiders and robots which ignore your robots.txt and may search for email addresses to send you spam.

How it works

The widget adds an invisible link to your blog's source code. Human users don't see this link, but robots will follow it because they only search the source code for links.
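
A minimal sketch of what such a trap link could look like (the markup and trap URL here are illustrative, not necessarily the plugin's exact output):

[code]
<?php
// Hypothetical sketch of the invisible trap link the widget might render.
// Humans never see or click it; naive crawlers parsing the raw HTML
// will follow it anyway.
$trap_url = '/blogs/?crawler=yes';  // the same URL robots.txt disallows
echo '<a href="'.$trap_url.'" style="display:none" rel="nofollow">&nbsp;</a>';
?>
[/code]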

If someone follows the link, their IP address is saved. Every time this IP tries to access your blog, the output is aborted and a form with a little addition question is displayed instead. Human users can solve it and get access to the blog again, but robots get stuck.
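
Roughly, the logic could look like this (an illustrative sketch only, not the plugin's real code; the trapped_ips table and the show_captcha_form() helper are made up):

[code]
<?php
$ip = $_SERVER['REMOTE_ADDR'];

// 1) The visitor followed the trap link: record the IP
//    (assuming ip is the table's primary key).
if( isset($_GET['crawler']) && $_GET['crawler'] == 'yes' )
{
	mysql_query( "REPLACE INTO trapped_ips ( ip, blocked_at )
	              VALUES ( '".mysql_real_escape_string($ip)."', NOW() )" );
}

// 2) On every request: if the IP is on the list, abort the blog's
//    output and show the unblock form instead.
$res = mysql_query( "SELECT 1 FROM trapped_ips
                     WHERE ip = '".mysql_real_escape_string($ip)."'" );
if( mysql_num_rows($res) > 0 )
{
	show_captcha_form(); // prints the addition question and exit()s
}
?>
[/code]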

Features

  • Automatically blocks bad robots.

  • Robots stay blocked; humans can unblock themselves.

  • List of blocked IPs (Go to: Tools -> Bottrap X)

Installation

  1. Download the plugin [url=https://sourceforge.net/projects/bottrapx-bottra]here[/url].

  2. Upload the bottrap_x_plugin folder into the "plugins" folder.

  3. Log in to the back office.

  4. Install it (Go to: Settings -> Plug-ins -> "Available plugins", click the [Install] link).

  5. Check the plugin's settings to make sure your blog's home is set correctly.

  6. Open the widgets menu of the blogs which should use the plugin (Go to: Blogs -> Widgets) and add the "Bottrap X" widget to your blog's header.

  7. Open your robots.txt (if it doesn't exist, create it), add the following two lines, and save the file:

     User-agent: *
     Disallow: /blogs/?crawler=yes

Feedback would be nice! Please remember: this is an early alpha, so I can't promise it works really well.

Tears

UPDATE 24.08.08:
- fixed a bug where no robots were displayed in the list
- cleaned up the code
- added a form field to add IPs manually

2 Aug 28, 2008 00:25

New version released:
Version 0.8 / 2008-08-28
- date and time of blocking are displayed in the admin area
- admins can block, unblock or permanently block IPs (permanently blocked IPs get no captcha to unblock themselves)
- IPs stay in the list, even if they unblock themselves
- only admins can delete an IP completely from the database (to clean up the list)

[url=https://sourceforge.net/projects/bottrapx-bottra]Download[/url]

It would be nice if someone could give me some feedback.

What do you think about a central list of bad IPs, like the central antispam blacklist, where you can report the bots you caught and get a list of bots others caught?

Tears

3 Aug 28, 2008 09:02

The problem with blocking by IP indefinitely, even with a "free me" *shudders* captcha */shudders*, is that not everyone has a static IP, so you could be blocking innocent people who have been allocated an IP that previously belonged to a spammer.

The problem with a centralised blacklist is twofold:
1) somebody has to maintain it and decide who gets on and who doesn't
2) it's very simple to flood it with bogus reports

Personally, the way we work is with a "ban period" that scales upwards: spam us once and you get banned for two hours, spam us again and it's 4 hours, again and it's 8 hours ... all the way up to a 12-month ban.
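
As a sketch, that scaling rule is just a doubling ban time with a cap (the function name and storage are hypothetical, not our actual code):

[code]
<?php
// Ban duration for the n-th offence: 2h, 4h, 8h, ... capped at ~12 months.
function ban_seconds( $offence_count )
{
	$hours = 2 * pow( 2, $offence_count - 1 );
	$max_hours = 365 * 24; // the 12-month ceiling
	return min( $hours, $max_hours ) * 3600;
}
// 1st offence: 7200s (2h), 2nd: 14400s (4h), 3rd: 28800s (8h), ...
?>
[/code]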

We also have a push system for our banning (rather than a pull system), and all of our "reporters" are trusted sources.

If you *really* want to save system resources then you're better off running your checks before the whole of the core kicks in (we spark off our spamhound in _basic_config.php); that way you save a shedload more CPU/bandwidth than using the plugin system/hooks ... although we do use plugin hooks for laying a few nasty traps for stupid bots ;)
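
The early-check idea, sketched (the file path is a placeholder; the point is that nothing else has loaded yet, so a banned request costs almost nothing):

[code]
<?php
// Placed before the core loads, e.g. at the top of _basic_config.php.
// A flat file is used here because no DB connection exists yet.
$banned = @file( '/path/to/banned_ips.txt', FILE_IGNORE_NEW_LINES );
if( $banned && in_array( $_SERVER['REMOTE_ADDR'], $banned ) )
{
	header( 'HTTP/1.1 403 Forbidden' );
	exit; // core never runs: no DB, no templates, no plugins
}
?>
[/code]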

¥

4 Aug 28, 2008 14:00

Thanks for the feedback. I checked your blog but couldn't find any information about how your "spamhound" works - is this a plugin, and where can we get it? :D

I don't see any problem with dynamic IPs. How big is the chance that another user with the same IP lands on your blog again? Besides, I believe most of the "bad bots" use IPs from countries like Russia - not the typical location for most normal users.

You're right about the central blacklist. I thought about reporting IPs only after they have stayed blocked for 7 or more days. In the central blocklist they would be saved together with the reporter's server IP. Only if an IP is reported more than 5 times from different servers would it make it onto the public blacklist. But I think this isn't worth the work ;)
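
The threshold rule could be as simple as one query (table and column names are invented for the sketch):

[code]
<?php
// An IP only becomes public once more than 5 different reporter
// servers have reported it.
$res = mysql_query( "SELECT bad_ip
                     FROM reports
                     GROUP BY bad_ip
                     HAVING COUNT( DISTINCT reporter_ip ) > 5" );
// Everything this returns would go onto the public blacklist.
?>
[/code]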

5 Aug 28, 2008 15:38

Um... given that most cities in the US have ONE option for an ISP, most folk will end up on an IP address shared with someone who sucks.

Blocking by IP is done, but - to me - it isn't a good idea based on a hit or two. 40 hits in half an hour is cause to block an IP for a few weeks, but one episode of alleged spamming is surely not cause to block an IP for an unknown amount of time.
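
A "40 in 30" rule could be sketched like this (the hits table and the block_ip() helper are hypothetical):

[code]
<?php
$ip  = mysql_real_escape_string( $_SERVER['REMOTE_ADDR'] );
$res = mysql_query( "SELECT COUNT(*) FROM hits
                     WHERE ip = '$ip'
                       AND hit_time > DATE_SUB( NOW(), INTERVAL 30 MINUTE )" );
$row = mysql_fetch_row( $res );
if( $row[0] >= 40 )
{	// block for a few weeks rather than indefinitely
	block_ip( $ip, 21 * 24 * 3600 );
}
?>
[/code]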

Even so, I look forward to installing this plugin and seeing if maybe it can help me with the occasional IP that throws up "40 in 30" ;)

6 Aug 28, 2008 16:44

Sorry, my English sucks: am I right that you want the plugin to block an IP if there are too many hits in a short time? Well, I think I can handle this, but I don't think making many hits is "more evil" than following a hidden link.

Besides, human users are able to unblock themselves, even if someone else with the same IP got trapped. But I'm thinking about a better way to detect bots - like the time they take to insert a comment etc.
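
The timing check could look like this (the field name is made up; a real version would sign the value so bots can't fake it):

[code]
<?php
// When rendering the comment form, embed the render time:
echo '<input type="hidden" name="form_rendered" value="'.time().'" />';

// When processing the submitted comment:
if( time() - (int)$_POST['form_rendered'] < 3 )
{	// filled in under 3 seconds - almost certainly a bot
	die( 'Comment rejected.' );
}
?>
[/code]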

7 Aug 28, 2008 17:39

I'm afraid our hound isn't publicly available; it's about the only thing that isn't ... mainly because it's a tad more effort than "upload this and click install".

I agree that the chance of a spammer's IP being reallocated to an innocent person who then visits your blog is slim, but it has already happened to us on more than one occasion ;)

Easier than timing the delay between "post called" and "comment sent" is to have a fake comment form; bots are stupid when it comes to it ;)
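
One way to sketch that idea is a decoy field humans never see but bots blindly fill in (a complete fake form works the same way; the field name here is illustrative):

[code]
<?php
// In the comment form - hidden from humans via CSS:
echo '<p style="display:none">
	<label for="email2">Leave this field empty:</label>
	<input type="text" name="email2" id="email2" />
</p>';

// When processing the comment:
if( !empty( $_POST['email2'] ) )
{	// no human could have filled this in
	die( 'Comment rejected.' );
}
?>
[/code]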

I'm not even gonna begin to mention my thoughts on captcha in any way, shape or form ... ([url=http://waffleson.co.uk/2008/06/do-you-captcha-your-target-audience]Do you captcha your target audience?[/url]) :roll:

¥

8 Aug 28, 2008 18:09

Comment or trackback spam isn't the only problem. For example, here in Germany you need an imprint on your website, so you have to publish your email address, phone number, postal address...

My problems are email crawlers and search engines which ignore robots.txt. I was very upset when I found my website on gigablast.com by searching for my name and town. Any idea how to block these bots?

9 Aug 29, 2008 11:56

You could move to France? :roll:

If you *have* to put your email and address on there, then about the only way you can protect them (without requiring JavaScript) is the way you're doing it now.

Either that, or convert them into an image and display the image ... but that would fail for people with text browsers.
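
For the image route, a rough GD sketch (assumes PHP's GD extension is enabled; the address is a placeholder):

[code]
<?php
header( 'Content-type: image/png' );
$im = imagecreate( 220, 20 );
imagecolorallocate( $im, 255, 255, 255 );  // first allocation = background
$fg = imagecolorallocate( $im, 0, 0, 0 );
imagestring( $im, 3, 4, 3, 'name@example.com', $fg );
imagepng( $im );
imagedestroy( $im );
?>
[/code]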

Although, I suppose you could protect your imprint page with htaccess (you'd have to tell your visitors to type "user" / "password" in when the dialogue box comes up). Not exactly user friendly, but it would pretty much kill the bots.
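
Something like this in the imprint directory's .htaccess would do it (paths and names are placeholders; the password file is created with the htpasswd utility):

[code]
AuthType Basic
AuthName "Imprint"
AuthUserFile /path/to/.htpasswd
Require valid-user
[/code]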

¥

10 Jan 03, 2010 12:59

I see the problems with a centralized blacklist, but it seems like a decent idea. At least the plugin caught a couple of bots for me and hasn't caused any problems.

