[Finished: 31-MAR-2012] Calling all mySQL / PHP programmers to help with spam control

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

Update: 31-MAR-2012

Done!

I don't really think most of our legitimate users will want to make more than 1 post per minute. But if it gets in your way please let me know and I can set up an automatic bypass of the limit for accounts that have been active for more than a week. This will require a bit of time to set up though.

Thanks again to everyone for their support! Have fun.

 


 

Update: 30-MAR-2012

Hi everyone,

A quick update to this issue: I finally decided to pay a Drupal developer to write a quick custom module to limit posts per minute. I'm convinced that this will drastically slow down the spammers, the worst of which post anywhere from 5 to 20 posts per minute. With a 1 post/minute limit in place, they will create much less havoc, and if we all remain vigilant to mark spam posts, they will be blocked in short order.

I now just need to resolve the issue where the spam post gets unpublished but the post remains on top of the lists. I think I might have a workaround.

Thanks to everyone for their patience!
Have fun.

 


 

Hi guys,

As you know we are still experiencing some problems with spammers. The new system that we have in place is definitely working as intended and it is preventing the spam problem from being much worse here. But we do have some determined spammers that hit the site harder and create more of a mess before they get blocked.

Sorry I haven't yet replied to the threads and complaints by worried users about spam on BLF. During the past week I have yet again been heavily researching the available options for spam control on Drupal and thinking about their viability on BLF. Here are my conclusions:

  1. Automated 3rd party anti-spam services like Mollom or Akismet are absolutely out of the question. I use Mollom on another website that I maintain, and I am very disappointed in its performance, mainly because of the false positives. Another site that I frequently visit uses Akismet, and virtually all of their users despise it because it blocks their legitimate comments. And the rate of false positives would be even higher here on BLF since our legitimate users post so many links. These services are expensive and overrated and they annoy legitimate users even more than the spammers.
  2. CAPTCHAs are a good first line of defense, but they can't be relied on as the primary method and they must be used sparingly to avoid penalizing legitimate users. Most spam attacks are a combination of humans and robots. So a human solves the CAPTCHA and creates the user account, and then sets the robot to post from then on. Obviously we could prevent that by putting a CAPTCHA on every single page for all users before all posts, but that would be unfair to all of our legitimate users. Even if we were to do this only to users that have not yet proven their reputation, it would still be unfair to penalize new users, most of whom are legitimate.
  3. IP blocking is utterly worthless. Spammers use proxies, so they simply switch to a different proxy and avoid the ban.
  4. Adding more moderators isn't a solution. We can't expect them to be awake or not travel on vacation. The spammers are bound to come when the moderators are off guard. As it is, all of our users around the world are moderators, so they're much more likely to be able to collectively stop a spam attack.

In view of all this, I want to keep using the same basic system that we have in place. It is working very well, with one exception: when the spammers set a robot on the site, it goes through all the existing threads and posts the spam comment in different nodes (threads) at the rate of almost 20 per minute. If it were to post in just one node (thread), then an existing mechanism would kick in to prevent repeated posts. But unfortunately, after extensive research and testing, I've found that there is no currently available functionality in Drupal to limit the rate of posts by any given user across the entire site. I'm actually really surprised that this functionality doesn't exist; it seems like such a fundamental necessity for reducing spam. If we could simply limit the number of posts by any given user to, let's say, 2 per minute, then spammers would manage a maximum of 5 or 6 posts and then get blocked.

So here's my request: Do we have any users here who are skilled mySQL / PHP programmers, or better yet with experience in Drupal? I have a basic understanding of the principles of databases and programming, but I'm a terrible coder and I need some help to create a custom module for BLF. The module would simply be a few lines of code that would query the database every time a user tries to post and get the last time he posted a new comment or new thread, and make the form POST fail if his last post was less than 30 seconds ago. I have some example modules that I can show you as a basic skeleton for creating this module, and I have some basic instructions from a Drupal expert for the SQL query.
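The check described above could look roughly like the following. This is only a sketch in Python with SQLite for illustration; the real module would be PHP against Drupal's MySQL tables, presumably hooked into form validation. The `posts` table, its columns, and both function names are hypothetical and are not Drupal's actual schema or API. A single table stands in for both comments and new threads.

```python
import sqlite3

MIN_INTERVAL = 30  # seconds required between posts by the same user


def seconds_until_allowed(conn, uid, now):
    """Return 0 if the user may post now, else the seconds left to wait."""
    row = conn.execute(
        "SELECT MAX(created) FROM posts WHERE uid = ?", (uid,)
    ).fetchone()
    last = row[0]
    if last is None:  # this user has never posted
        return 0
    return max(0, MIN_INTERVAL - (now - last))


def try_post(conn, uid, body, now):
    """Store the post only if the rate limit allows it."""
    wait = seconds_until_allowed(conn, uid, now)
    if wait > 0:
        # Nothing is written, so the user can simply resubmit later.
        return f"Please try again in {wait} seconds."
    conn.execute(
        "INSERT INTO posts (uid, body, created) VALUES (?, ?, ?)",
        (uid, body, now),
    )
    return "Saved."


# Hypothetical stand-in for the forum database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (uid INTEGER, body TEXT, created INTEGER)")
```

Because the check runs before anything is written, a rejected post never reaches the database and never shows up in any listing.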

Additionally, as a side request, I need a workaround to the current function that un-publishes posts that have been marked as SPAM. For some reason, the module that un-publishes posts that have been marked too many times as spam apparently uses a non-standard, non-Drupal hack to mark the post in the DB as un-published, which is why the "Recent Posts" list still shows those threads on the top even when the post is un-published. Again, this is just a one-liner that I can simply replace in the admin interface where I have the un-publish rules defined.

Any takers? Thanks very much in advance for your patience and willingness to help!

Budget Light Forum ...where Frugal meets with Flashlight!

Edited by: sb56637 on 03/30/2012 - 23:19
ruffles
Joined: 07/09/2011 - 10:18
Posts: 1020
Location: California

I don't have coding skills, but I do have a willingness to donate a few bucks to offset the cost of development. I suspect that others would feel the same.


 

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

ruffles wrote:

I don't have coding skills, but I do have a willingness to donate a few bucks to offset the cost of development. I suspect that others would feel the same.

Thanks ruffles! It's actually an incredibly simple function that can be accomplished in just a few lines of code, so no major development time should be required. The only problem is that the Admin on this forum is incredibly bad at coding... But it's no hill for a climber. ;)

Budget Light Forum ...where Frugal meets with Flashlight!

hank
Joined: 09/04/2011 - 21:52
Posts: 9638
Location: Berkeley, California

> a CAPTCHA on every single page

Some annoyance and wasted time -- but until you get the programming you hope for, a CAPTCHA would be better than what's happening now.

Mainly -- it'd increase the cost to the spam company, maybe enough that they'd quit hammering this site.

This site gets hammered because to them, it's wide open to dump crap in, and being cleaned up often so their new crap is kept fresh and floats on top, seen by everyone every time.

 

Rezolution
Joined: 10/04/2011 - 13:57
Posts: 545
Location: Pennsylvania

Can you make a feature request to the creator of whatever forum software you're using?  It sounds like a great idea and they should be able to implement it rather easily since they're the creator of the code.   Maybe you could get in touch with them and see if they could add this "limit amount of posts - per user - per minute" feature.

Langcjl
Joined: 03/05/2011 - 05:36
Posts: 2162
Location: Wisconsin USA

Good Job on all of this Mr.Admin but really I wouldn't mind the MATH captcha on all posts. It only takes a second, no big deal.

Piers said " ....but who wants enough light, when you have the option for far too much "

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

Rezolution wrote:

Can you make a feature request to the creator of whatever forum software you're using?  It sounds like a great idea and they should be able to implement it rather easily since they're the creator of the code.   Maybe you could get in touch with them and see if they could add this "limit amount of posts - per user - per minute" feature.

I wish I could, but Drupal is a huge project with hundreds of coders who are pretty much uninterested in the needs of individual users.

Budget Light Forum ...where Frugal meets with Flashlight!

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

hank wrote:
This site gets hammered because to them, it's wide open to dump crap in, and being cleaned up often so their new crap is kept fresh and floats on top, seen by everyone every time.

Hmm, yes and no. The spammers from some flashlight company that want our users to buy their products do want their posts on top so that we click them. But the sort of spammer that posts 50 or 1000 posts at once is doing it to skew the Google search rankings so that when people search for "Coach bags" their site will come out on top because it has more links to it. They don't really expect actual users to click on their links, they just want the Google crawlers to see them.

Budget Light Forum ...where Frugal meets with Flashlight!

Rezolution
Joined: 10/04/2011 - 13:57
Posts: 545
Location: Pennsylvania

Isn't this what you want to do?  Perhaps you just need to write a "rule"?

http://drupal.org/project/node_limitnumber

-----------------------------------------------------------------------------

Limit the amount of nodes or comments your users create over a given time period. This module has been rewritten to integrate with Rules. Instead of going to a page to assign limits you now just need to create rules. A default rule has been provided as an example.

When creating your rules there are now many hundreds of ways to implement your limits. Limits can be applied to roles, users, dates, or anything that can be accessed using PHP.

-----------------------------------------------------------------------------

I don't know much about this, just trying to help, but it looks like what you're trying to implement.

 

Rezolution
Joined: 10/04/2011 - 13:57
Posts: 545
Location: Pennsylvania

Also, I don't know what version you're using...

---------------------------------------------------

Drupal 7

No plans are in place for porting this to Drupal 7. If you need a module like this for Drupal 7 you can try out Node Limit.

-----------------------------------------------------

 

 

http://drupal.org/project/node_limit

 

The Node Limit module allows administrators to restrict the number of nodes of a specific type that roles or users may create. For example, if a site has an "Advertiser" role that can create "advertisement" nodes, then the node limit administrator can restrict all users in that role to a specific number of nodes. He may also restrict users on a per-user basis.

Although other node limitation modules exist (such as create quota (Abandoned), user quota (D6), and node limitnumber (D6)), Node Limit offers features not available in all of those modules, such as:

  • Per-role node limits
  • Per-user node limits
  • Per-organic group node limits (Dropped)
  • Per-time interval node limits
  • Per-time frame node limits (Dropped)
  • Per-taxonomy term node limits (Coming soon)
  • Any combination of the above
  • Drupal 6 & 7 compatibility
  • Requires no programming on the part of the administrator
sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

Rezolution wrote:

Isn't this what you want to do?  Perhaps you just need to write a "rule"?

http://drupal.org/project/node_limitnumber

-----------------------------------------------------------------------------

Limit the amount of nodes or comments your users create over a given time period. This module has been rewritten to integrate with Rules. Instead of going to a page to assign limits you now just need to create rules. A default rule has been provided as an example.

When creating your rules there are now many hundreds of ways to implement your limits. Limits can be applied to roles, users, dates, or anything that can be accessed using PHP.

-----------------------------------------------------------------------------

I don't know much about this, just trying to help, but it looks like what you're trying to implement.

 

Wow, I'm impressed you found that one. :) I looked for days and finally stumbled across it. But unfortunately it doesn't work. The problem is that it will accept the post but then un-publish it, so if legitimate users post 3 posts in one minute (which some do) their post will be accepted but then immediately un-published. What we need is a method where it checks before accepting the post, so that it fails at the POST step and the user can simply wait a bit longer and hit SAVE again.

Budget Light Forum ...where Frugal meets with Flashlight!

Rezolution
Joined: 10/04/2011 - 13:57
Posts: 545
Location: Pennsylvania

Like I said, I'm not an expert on it in any way, but it may be possible to set up some type of rules-based system. There has to be a way to set it up for newer users, or not have it get applied to users with x amount of posts or something.

I'm very sorry they aren't being more helpful with you.  Perhaps you could speak with the designers of the plugin module?


I won't add any more comments to this thread because I really have no experience in the matter and I'm not sure I'm helping. :)

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

Rezolution wrote:
There has to be a way to set it up for newer users, or not have it get applied to users with x amount of posts or something.

You're right, this part is possible. I have thought about doing that, but I don't really want to invest time if we can quickly implement a superior solution.

Thanks a lot for your comments, I do appreciate them.

Budget Light Forum ...where Frugal meets with Flashlight!

newbie74
Joined: 10/30/2010 - 11:16
Posts: 138

Mr Admin,

I do believe you want a trigger. Do you know the table the posts get saved to?

Could you please run the command:

show create table tbl_posts and PM me the result? I could try to write a trigger that would do just that - fail the insert command.

edit - You must replace tbl_posts with the correct table name in the command above. 
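For illustration, here is roughly what such a trigger does, sketched in Python against SQLite rather than MySQL. The `tbl_posts` placeholder table, its columns, and the 30-second window are assumptions; a MySQL version would be written differently (for example, raising the error with `SIGNAL` instead of SQLite's `RAISE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_posts (uid INTEGER, body TEXT, created INTEGER);

-- Abort the insert when the same user already has a post
-- that is less than 30 seconds older than the new one.
CREATE TRIGGER limit_post_rate
BEFORE INSERT ON tbl_posts
BEGIN
    SELECT RAISE(ABORT, 'posting too fast')
    WHERE EXISTS (
        SELECT 1 FROM tbl_posts
        WHERE uid = NEW.uid AND NEW.created - created < 30
    );
END;
""")


def insert_post(uid, body, created):
    """Return True if the row was stored, False if the trigger rejected it."""
    try:
        conn.execute(
            "INSERT INTO tbl_posts VALUES (?, ?, ?)", (uid, body, created)
        )
        return True
    except sqlite3.DatabaseError:
        # The trigger's RAISE(ABORT, ...) surfaces here as a database error.
        return False
```

The application only sees a failed insert; what the user sees depends entirely on how the forum software handles that error.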

 

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

newbie74 wrote:

Mr Admin,

I do believe you want a trigger. Do you know the table the posts get saved to?

Could you please run the command:

show create table tbl_posts and PM me the result? I could try to write a trigger that would do just that - fail the insert command.

edit - You must replace tbl_posts with the correct table name in the command above. 

Thanks very much! With this method would it be possible to return the user back to the post editor with a "Please try again in 30 seconds" error message and preserve the form contents?

Budget Light Forum ...where Frugal meets with Flashlight!

brted
Joined: 01/12/2010 - 19:44
Posts: 2371
Location: Atlanta

Have you tried question and answer based CAPTCHAs? The blurry text can be hacked with OCR and the Math is pretty easy for computers. Q and A can be pretty effective because even if humans set up an account they generally might not understand English.

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

brted wrote:

Have you tried question and answer based CAPTCHAs? The blurry text can be hacked with OCR and the Math is pretty easy for computers. Q and A can be pretty effective because even if humans set up an account they generally might not understand English.

Hmm, not a bad idea. I think I might try that.

Budget Light Forum ...where Frugal meets with Flashlight!

administrator
Joined: 01/10/2010 - 01:29
Posts: 25

Ok, here's a test of the new CAPTCHA.

Don
Joined: 01/12/2010 - 16:32
Posts: 6617
Location: Scotland

I already like the new captcha!

 

The numbers from my light tests are always to be found here.

https://spreadsheets.google.com/ccc?key=0ApkFM37n_QnRdDU5MDNzOURjYllmZHI...

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

Ratpie wrote:

What you could try is to run the site through cloudflare ( https://www.cloudflare.com/overview ).

It's basically a large CDN that recognizes and learns threats based on actions done against cloudflare users.

Thanks for the link! I am considering using Bad Behavior, which doesn't analyze post content but rather post method and post origin. But it's only a secondary line of defense. The most important will still be our community moderated anti-spam system.

Budget Light Forum ...where Frugal meets with Flashlight!

keltex78
Joined: 03/18/2011 - 10:15
Posts: 3705
Location: Texas

Works... An extra step, but better than Spam...

Should help slow them down at least...


Keepin’ the “B” in BLF

Don wrote:
It sounds like the XM LEDs won’t really be suitable for flashlight use. Pity…

newbie74
Joined: 10/30/2010 - 11:16
Posts: 138

With the trigger in place the database would return the insert error to the forum.

I could not control the message displayed by the forum. You could even get an "internal error", depending on how Drupal handles DB errors (and I'm not an expert in Drupal) 
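One way around that, sketched in Python with entirely hypothetical names: the code that calls the database catches the rejection and hands the form contents back with a readable message. Whether Drupal exposes a suitable place to do this is not established in this thread, so treat it as an outline of the idea only.

```python
class PostRejected(Exception):
    """Stands in for the database error raised by a rate-limit trigger."""


def submit(form, save):
    """Try to save; on rejection, return the form contents with a message."""
    try:
        save(form)
    except PostRejected:
        return {
            "saved": False,
            "message": "Please try again in 30 seconds.",
            "form": form,  # hand the text back so nothing is lost
        }
    return {"saved": True, "message": "Post saved.", "form": None}


def always_reject(form):
    """Hypothetical save callback that simulates the trigger firing."""
    raise PostRejected("posting too fast")
```

Calling `submit({"body": "hello"}, always_reject)` returns the original form dictionary together with the retry message instead of a bare internal error.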

sb56637
Joined: 01/08/2010 - 09:29
Posts: 7230
Location: The Light

newbie74 wrote:

With the trigger in place the database would return the insert error to the forum.

I could not control the message displayed by the forum. You could even get an "internal error", depending on how Drupal handles DB errors (and I'm not an expert in Drupal) 

Hmmm, any other suggestions to work around that with a bit of PHP code?

Thanks a ton for looking into this.

Budget Light Forum ...where Frugal meets with Flashlight!

garrybunk
Joined: 10/31/2011 - 09:25
Posts: 6100
Location: Johnstown, PA

I'm looking at one of the new CAPTCHAs - "Do you hate spam?"  What if a user does like spam?  What if a user (i.e. a newbie) thinks "spam" is referring to the food?  Also, I guess uppercase/lowercase doesn't matter on the answers, right?

Great work SB! 

-Garry

My Bike Lights Thread, Optics (TIR) Comparison Beamshots, Diffusion Techniques, MTBR’s Lights & Night Riding Forum
NOTE: Now hosting my photos from my Google account. Post up if you can’t see them. Older photos hosted on Photobucket or Flickr may disappear (PM me if you want access to them).
newbie74
Joined: 10/30/2010 - 11:16
Posts: 138

I can do it with PHP but unfortunately I know next to nothing about Drupal.

The problem is getting the right event to trigger the function that will, for example, return true if the user has posted in the last time interval. Then you could try to display an error yourself - but I cannot help you there.

Back to my original solution, I don't see a big problem with displaying "internal error" messages, especially if all human users are aware of the motive. If someone hits F5 and the time interval is set to 30 seconds it will probably refresh with the comment correctly posted. And it will probably cause a robot to give up its attack.

Anyway, the trigger could be set as a last line of defense - say it only blocks posts less than 10 seconds apart. The good news is that triggers are usually pretty fast. Bad news is there will be an extra select on a large table for every comment posted.
 
Feel free to contact me if the new captcha fails...
brted
Joined: 01/12/2010 - 19:44
Posts: 2371
Location: Atlanta

I just wanted to see if I could answer the spam question

Volk
Joined: 03/07/2011 - 20:21
Posts: 265
Location: Sweden

I mentioned this in another thread: would it be possible to set the need of "solving" the CAPTCHA to every 10th (or a random number) post instead of one at the beginning of each session? By doing so the robots might be halted in their spamming.

If it's easy to do it could be a quick fix until your plan works out.
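The "every 10th post" idea comes down to a counter check. A tiny Python sketch, where the function name and the way the post count is tracked are assumptions and not an existing CAPTCHA module setting:

```python
CAPTCHA_EVERY = 10  # the interval suggested above


def needs_captcha(posts_so_far, every=CAPTCHA_EVERY):
    """True when the post about to be made is the 10th, 20th, 30th, ..."""
    return (posts_so_far + 1) % every == 0
```

A randomized interval would work the same way, with `every` drawn fresh after each solved challenge so that a robot cannot predict when the next one appears.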

brted
Joined: 01/12/2010 - 19:44
Posts: 2371
Location: Atlanta

I like the easy question CAPTCHA, though not for every post. The Captcha Riddler module lets you come up with a few different questions to ask and have them rotate. Other easy questions:

Write the word blank in the box below.

In BLF's motto at the top of the page, what meets with flashlight? (variations for meets and flashlight)
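To make the rotation idea concrete, here is a small Python sketch of a question-and-answer check. It does not show how the Captcha Riddler module works internally; the question bank and function names are made up for illustration, and answers are compared case-insensitively.

```python
import random

# Hypothetical question bank; accepted answers are stored in lowercase.
QUESTIONS = {
    "Write the word blank in the box below.": {"blank"},
    "In BLF's motto, what meets with flashlight?": {"frugal"},
}


def pick_question(rng=random):
    """Choose one question at random so the challenge rotates."""
    return rng.choice(sorted(QUESTIONS))


def check_answer(question, answer):
    """Compare ignoring case and surrounding whitespace."""
    return answer.strip().lower() in QUESTIONS[question]
```

Storing a set of accepted answers per question leaves room for the variations mentioned above, such as alternative spellings.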

Don
Joined: 01/12/2010 - 16:32
Posts: 6617
Location: Scotland

I've only seen two - the spam one (seen once) and the "b" one (seen every time apart from once).

 

I'd think you need more though maybe you don't. I have no problem with filling the box for every post though.

 

I've always liked the picture ones where you have to choose which one isn't a kitten or I suppose for here, "Which of these is not a flashlight"

 

But what I know about implementing such stuff could easily be printed out in a large font and inserted in my eye with no discomfort....

 

The numbers from my light tests are always to be found here.

https://spreadsheets.google.com/ccc?key=0ApkFM37n_QnRdDU5MDNzOURjYllmZHI...

astroteckid7
Joined: 02/21/2012 - 16:44
Posts: 5
Location: CN84

One of the things you can do as well is to moderate new users for an "X" number of posts, allowing them to prove themselves, instead of relying on CAPTCHAs to check their posts before posting... spammers will even find ways around those. FnF has a unique way: new users have to send an email to the owner, and then he makes the call whether it is a bot or a human.

my two cents

astroteckid7

brted
Joined: 01/12/2010 - 19:44
Posts: 2371
Location: Atlanta

No attacks so far tonight . . . fingers crossed.
