Request to delete some blacklisted keywords

Started by on Feb 14, 2009 – Contents updated: Feb 14, 2009

Feb 14, 2009 03:00    

The following keywords should be removed from the b2evo antispam list

mail.ru - http://en.wikipedia.org/wiki/Mail.ru
rambler.ru - http://en.wikipedia.org/wiki/Rambler_(portal)

It's not a good idea to ban the whole domain (even if it's .RU), why don't you ban yahoo.com, hotmail.com or aol.com?

Feb 14, 2009 03:06

I haven't played in the antispam space in a long long time, but I can tell you this: EVERY keyword is driven by user submissions. We used to ban specific blogspot domains until they "took over" the antispam list. It was decided to ban ".blogspot.com" so that users didn't have to have over 5000 entries.

I'm supposed to go back into that space but holy cow time-sensitive projects have me buried. If/when I do it'll be to turn it over to someone else, and they will (hopefully) follow the same criteria: if enough users report a domain frequently enough in a short enough time span it gets added. Eventually it'll get removed, but only once no one is reporting it anymore.

Feb 14, 2009 03:51

If you are an admin you can ban any domain or keyword in 2 clicks. You can ban google.com or even all .ru domains; who cares, as long as it's only a local b2evo install.
But such keywords shouldn't be included in the global list.

Feb 14, 2009 04:14

Sorry friend, but that is not how the master list works/worked. Very simple: enough users in a short enough time report something as spam = it gets listed.
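
A minimal Python sketch of that rule, for illustration only. This is not the actual b2evolution code; the class name `ReportTracker` and both thresholds are hypothetical, since the thread never states the real numbers. It counts distinct reporting users inside a sliding time window and assumes reports arrive in time order.

```python
from collections import defaultdict, deque

# Hypothetical thresholds: the real values are not given in this thread.
MIN_REPORTERS = 5
WINDOW_SECONDS = 7 * 24 * 3600


class ReportTracker:
    """Tracks spam reports per keyword inside a sliding time window."""

    def __init__(self, min_reporters=MIN_REPORTERS, window=WINDOW_SECONDS):
        self.min_reporters = min_reporters
        self.window = window
        self._reports = defaultdict(deque)  # keyword -> deque of (timestamp, user)

    def report(self, keyword, user, timestamp):
        """Record a report; return True if the keyword now qualifies for review."""
        queue = self._reports[keyword]
        queue.append((timestamp, user))
        # Drop reports that have fallen out of the time window.
        while queue and queue[0][0] <= timestamp - self.window:
            queue.popleft()
        # Repeated reports from the same user count only once.
        distinct_users = {u for _, u in queue}
        return len(distinct_users) >= self.min_reporters
```

Note that crossing the threshold only makes a keyword a candidate. As described elsewhere in this thread, a human still reviews it before it is published.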

Sometimes exceptions have been made. This could be the fourth eh?

I'm trying to get access again. Stupid FileZilla exports hashed (or otherwise scrambled) passwords, which are useless after the fact. So I'm waiting for FTP access again before I'll wade into that muck.

BTW you can't ban all .ru domains due to a 5 character minimum.

Anyway I'll be having a poke around in there to see what I can see regarding these two you've identified.

And NO I will NOT waste my time with every freaking keyword anyone has an issue with!!! REMOVE IT FROM YOUR LOCAL TABLE!!! The master list is based entirely on reports by users. Not opinions. Not beliefs. Not concepts. Reports by users, reviewed by humans against a set of rules. Nothing else counts.

Feb 14, 2009 04:23

Okay, mail.ru will be removed and others will take its place. Most notably, .hotmail.ru is pure spam ACCORDING TO THE USERS WHO REPORT.

mail.ru is not in accordance with what we know about keywords. Specifically, it is a bit on the shorter side and doesn't contain two non-letter characters.
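
As a rough illustration of those keyword rules, here is a small Python sketch. It is not the real list-maintenance code. The 5 character minimum and the "two non-letter characters" rule come from this thread; the function name `check_keyword` and the `PREFERRED_MIN_LENGTH` value are assumptions, since the only thing said about length is that mail.ru is a bit short.

```python
HARD_MIN_LENGTH = 5       # from the thread: 5 character minimum
PREFERRED_MIN_LENGTH = 8  # assumption: the thread only says mail.ru is "a bit short"
MIN_NON_LETTERS = 2       # from the thread: "two non-letter characters"


def check_keyword(keyword):
    """Return a list of rule violations; an empty list means the keyword passes."""
    problems = []
    if len(keyword) < HARD_MIN_LENGTH:
        problems.append("below 5 character minimum")
    elif len(keyword) < PREFERRED_MIN_LENGTH:
        problems.append("on the short side")
    # Dots, digits, hyphens and so on all count as non-letters here.
    non_letters = sum(1 for ch in keyword if not ch.isalpha())
    if non_letters < MIN_NON_LETTERS:
        problems.append("fewer than two non-letter characters")
    return problems
```

Under these assumed thresholds, `check_keyword("mail.ru")` reports both problems, while `check_keyword(".hotmail.ru")` returns an empty list.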

Again if anyone out there thinks this is an open invitation to have the master list tweaked to their liking: you're wrong.

Feb 14, 2009 05:30

The master list is based entirely on reports by users

And this is sad :(

Let's say 70% of b2evo users are English-speaking and for the keyword mail.ru they get 9 spam hits out of 10. Of course they will report it and the word will be blacklisted.
But the 5% of users who are Russian-speaking may get only 1 spammer out of 40 hits for the same keyword.

The majority rules...

Anyway, thank you for fixing it ( I hope you will ;) )

Feb 14, 2009 06:45

I'll play with rambler.ru later, but no it is not sad. The majority OF USERS is who matters. No offense there of course.

Oh and the fact that a bunch of simple-minded idiots have reported variations of "youtube.com" doesn't mean it turns into a keyword.

We, which used to be more than one person but turned into me and for 8 months has been no one, LOOK at stuff before we ban it. WAY back in the days of 0.* nobody looked or cared. Admins with powers would get a spam hit and instantly ban it, even though it never got another report. Heck someone way back then decided if someone accidentally reported their own domain name they would publish it. ha ha right? I spent DAYS trying to undo that damage. "mail.ru" has been on the list since 05. That goes back to the wild wild days of nobody thinking about what they were doing. Before we learned simple stuff like "cialis" blocks "specialist".
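
The "cialis" blocks "specialist" problem is easy to reproduce. A minimal Python sketch, assuming keywords are matched as plain case-insensitive substrings (the helper name `matching_keywords` is made up for this example):

```python
def matching_keywords(text, keywords):
    """Return every keyword that occurs in text as a case-insensitive substring."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


blacklist = ["cialis", "mail.ru"]

# A perfectly innocent comment gets caught by the "cialis" keyword.
print(matching_keywords("Please ask a specialist about this.", blacklist))

# Any address or link at mail.ru is caught, spam or not.
print(matching_keywords("Write to me at someone@mail.ru", blacklist))
```

With substring matching, a short keyword blocks everything that merely contains it, which is why length and specificity matter.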

So should the majority of users rule, or should the majority of users suffer butt loads of spam because somewhere someone thinks it isn't a spammy source? And if .ru domains are spamming english websites would the fact that they are not spamming russian websites mean they are not spammers? Wow I recall someone trying to tell me I must NEVER add a french domain to the list because France has laws against spamming. Uh... yeah. Right.

Anyway I gave up because I got sick of it. Someone else is willing to pick up this absolutely thankless job, so I'm back in trying to get a good handoff to the new player. As part of that I will absolutely NOT imply in any way that antispam can be bargained with.

1. Lots of reports from users means lots of users are suffering.
2. The admin/admins will look to see what the keyword they are banning will block.
3. The admin/admins will seek the most effective blocking keyword that causes the minimum collateral damage.
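
Steps 2 and 3 above could be sketched like this in Python. It is only an illustration under assumptions: the function `score_candidates`, the sample data, and the idea of testing candidates against a set of known-legitimate text are all hypothetical, and in practice a human does this review.

```python
def score_candidates(candidates, spam_samples, legit_samples):
    """Rank candidate keywords: least collateral damage first, then most spam blocked."""
    results = []
    for kw in candidates:
        needle = kw.lower()
        spam_hits = sum(needle in s.lower() for s in spam_samples)
        collateral = sum(needle in s.lower() for s in legit_samples)
        results.append((kw, spam_hits, collateral))
    # Sort by collateral ascending, then by spam hits descending.
    results.sort(key=lambda r: (r[2], -r[1]))
    return results


spam = ["buy-cialis now at http://buy-cialis.example", "cheap buy-cialis here"]
legit = ["talk to a specialist first"]

for kw, spam_hits, collateral in score_candidates(["cialis", "buy-cialis"], spam, legit):
    print(kw, "blocks", spam_hits, "spam samples and", collateral, "legitimate ones")
```

Here the longer keyword blocks the same spam without touching the legitimate sample, so it ranks first.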

Pretty simple eh? Right now there are over 167000 posts waiting to be reviewed. One Hundred Sixty Seven Thousand. Some are drafts that should be deprecated because an existing keyword blocks them. Some are drafts that will never be published because they're old enough to be considered non-problems. Some are deprecated posts that have gone long enough without a new report as to be deletable. Some are published keywords that need to be considered for deletion when the database no longer receives reports that they would have blocked. When I last played I was deleting roughly 1000 posts a month from the database.

And every now and then *STILL* we have keywords that don't meet the rules we learned to live by. Like mail.ru for example.

It is not simple and it is not subject to negotiation.

The serious players in here, like you sam2kb, get more respect than a door in their face. Everyone else can wade through 10000 posts for me before telling me "this shouldn't be" ;)

