How to block direct access to blog?

elpie
Posted: Thu Jun 23, 2005 07:26

Hi all,
I need help with two things please.

Visitors to my site can view my blog from within a wrapper (iframe) - I am using Mambo and want the blog to "appear" as integrated as possible.
(Though I am hoping it won't be long before someone comes up with a way to truly integrate the two!)

So, I want people to be able to see the blog from the wrapper, but not to be able to view it by going to domain/blog/.

I guess it's probably an .htaccess thing, but I have tried to get an allow,deny rule working with no luck.

In blocking direct access though, I don't want to block my own access to logging on.

Any ideas?

personman
Posted: Thu Jun 23, 2005 15:06

I'm no .htaccess expert, but here's an idea. You could write a rule that checks the referrer of every hit. If it matches your Mambo site, the requested page is displayed as normal. If it doesn't, the hit is redirected to your Mambo-based site. But I don't know anything about iframes and how they would affect this setup.

elpie
Posted: Thu Jun 23, 2005 15:17

Thanks, but I don't want to redirect anyone. I just want www.domain.com/blog/ to be inaccessible to everyone except for the call to it from the wrapper.
And except for my admin login of course ;)

The blog was getting hit so hard and fast by spammers that I decided to rename it and do the wrapper thing. But anyone who finds it can still access it directly. It means the spammers can find it again and any legitimate visitor gets a blog with no navigation or anything. Much better if I can hide that from view!

personman
Posted: Thu Jun 23, 2005 15:33

That's exactly what I'm talking about. I was saying that if someone types www.domain.com/blog/ they just get redirected to your mambo site, but if you don't like that, then you could write a rule that just stops them from getting anything. Here's some code so you can see what I'm talking about. This isn't tested and I have no idea whether it will work for you.

Code

RewriteCond %{HTTP_REFERER} "!^http://(.*\.)?yoursite\.com/" [NC]
RewriteCond %{REQUEST_URI} "/blog/"
RewriteRule .* - [F]

The idea is that (line 1) if someone hits your site with an HTTP referer that is not yoursite.com, and (line 2) they are asking for something in the blog directory, then (line 3) the URL is left unchanged and the [F] flag makes the server refuse the request with a 403 Forbidden response. Like I said, I don't know if this will work or not, but you can give it a try.

You may need to add a rule that excludes the /blogs/admin/ directory from this, unless you can access that through the iframe, too.
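
A minimal, untested sketch of that exclusion, assuming the rules go in the site root's .htaccess, the blog lives under /blog/ and its admin pages under /blog/admin/ (both paths, like yoursite.com, are placeholders):

```apacheconf
RewriteEngine On

# Leave the admin area alone so logging in still works (path is an assumption)
RewriteCond %{REQUEST_URI} !^/blog/admin/ [NC]
# Only act on requests for the blog itself
RewriteCond %{REQUEST_URI} ^/blog/ [NC]
# Refuse hits that did not come from a page on the main site
RewriteCond %{HTTP_REFERER} !^http://([^/]+\.)?yoursite\.com/ [NC]
RewriteRule .* - [F]
```

Keep in mind that the Referer header is supplied by the browser and can be omitted or faked, so this hides the blog from casual visitors rather than securing it.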

ralphy
Posted: Thu Jun 23, 2005 21:54

That rule might keep search engine bots from crawling your blog's pages, and it also blocks people arriving from any other page (including search engine results, if any engine manages to index your blog).

It would be better to write a smarter rule that redirects direct visitors to the wrapper while letting search engine bots access the blog directly.

Try something like:

Code

# Activate rewrite rules
RewriteEngine On

# Redirect direct accesses to the blog to the wrapper
RewriteCond %{HTTP_REFERER} !^http://.*yoursite\.com/ [NC]
RewriteCond %{REQUEST_URI} ^/blog/ [NC]
RewriteCond %{HTTP_USER_AGENT} !googlebot [NC]
RewriteCond %{HTTP_USER_AGENT} !slurp [NC]
RewriteCond %{HTTP_USER_AGENT} !msnbot [NC]
# RewriteRule patterns are matched against the URL path, not the host name
RewriteRule .* http://full_url_of_your_wrapper [R=302,L]

However, search engines dislike it when you show their robots different pages than you show your users. You might be banned from a search engine or penalized for that.

More information about identifying robots by user agent can be found in The Web Robots Database: http://www.robotstxt.org/wc/active.html

personman
Posted: Thu Jun 23, 2005 22:00

I was hoping that someone who knows mod_rewrite better would come along.

elpie
Posted: Thu Jun 30, 2005 04:35

Ok, I have a working block now which gives a 403 Forbidden error if someone tries to access the blog directly.

# Blocking direct access
RewriteCond %{HTTP_REFERER} !^http://www.mydomain.com/.*$ [NC]
RewriteCond %{REQUEST_URI} ^.*index\.php$
RewriteRule .* - [F]

It works in that nobody can access the blog directly, but I can still get full access from within my Mambo wrapper. Problem is, it shows the directory which is forbidden and I would rather have it so anyone coming to it doesn't get any clue that there is anything there to find.

Suggestions???
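
One option, sketched under the assumption that the server runs Apache 2.2 or later: give the R flag a status code outside the 300-399 range, so that mod_rewrite stops and returns that status instead of redirecting. The domain is the same placeholder used in the block above.

```apacheconf
RewriteEngine On

# Answer direct hits with 404 Not Found instead of 403 Forbidden
RewriteCond %{HTTP_REFERER} !^http://www\.mydomain\.com/ [NC]
RewriteCond %{REQUEST_URI} index\.php$
RewriteRule .* - [R=404,L]
```

The visitor then gets the server's ordinary "not found" page, the same response a path that does not exist would produce.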

capnrob
Posted: Fri Jul 8, 2005 02:44

I'm having the same issues within Mambo - but that htaccess doesn't work for me...

I'm trying to block access to the index.php as well...

I've been trying to solve this one for a year and no good solutions yet.

Is there another .php file I can call in an iframe to display the blog?

elpie
Posted: Fri Jul 8, 2005 03:06

I am no expert on .htaccess, but FWIW, here is my file, which works.
It is in the blog folder, not in the MOS htaccess.

# Apache configuration for the blog folder

# this will select the default blog template to be displayed
# if the URL is just .../blogs/
<IfModule mod_dir.c>
DirectoryIndex index.php
</IfModule>

# this will make register globals off in b2's directory
# just put a '#' sign before these three lines if you don't want that
<IfModule mod_php4.c>
php_flag register_globals off
</IfModule>

# this is used to make b2 produce links like http://example.com/archives/m/200209
# if you renamed the file 'archives' to another name, please change it here too
<Files archives>
ForceType application/x-httpd-php
</Files>

# Last updated 30th June 05

RewriteEngine On
RewriteBase /

# Blocking direct access
RewriteCond %{HTTP_REFERER} !^http://www.domain.com/.*$ [NC]
RewriteCond %{REQUEST_URI} ^.*index\.php$
RewriteRule .* - [F]

# Fix for comments
RewriteCond %{HTTP_REFERER} !^http://www.domain.com/.*$ [NC]
RewriteCond %{REQUEST_URI} ^.*comment_post\.php$
RewriteRule .* - [F]

# Bad referers and pinapple start here
# Get the pinapple proxy first
RewriteCond %{HTTP:VIA} ^.+pinappleproxy [NC,OR]

# Atrivo Technologies
#deny from 69.50.160.0/19

#Eli Net
#deny from 209.63.0.0/16

#Norwegian spammer
#deny from 62.128.224.0/19

deny from 216.32.82.50
deny from 80.77.86.213
deny from 66.230.190.5
deny from 64.124.222.176

# Bad TLDs not covered above
RewriteCond %{HTTP_REFERER} \.biz [NC,OR]
RewriteCond %{HTTP_REFERER} \.ru [NC,OR]

# Try to prevent referrer spam
RewriteCond %{HTTP_REFERER} "!^http://www.domain.com/.*$" [NC]
RewriteCond %{QUERY_STRING} "disp=stats"
#RewriteRule .* - [F]
RewriteRule ^(.*) http://%{REMOTE_ADDR}/ [R=301,L]

You will note that I have blocked all .biz and .ru domains - this is because I had a lot of spam coming from those two and no genuine visitors.
This seems to have worked for me because I have had no comment spam or referrer spam since (mind you, I got rid of referrers on my skin so that may have something to do with it too! )
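
One thing worth checking in the file above: RewriteCond lines apply to the next RewriteRule that follows them, so as written the pinappleproxy, .biz and .ru tests appear to share the final rule with the disp=stats test and would only fire on stats requests. A sketch of a standalone referrer block, assuming the goal is to refuse those referrers on every request:

```apacheconf
RewriteEngine On

# Refuse any request whose referring host ends in .biz or .ru
RewriteCond %{HTTP_REFERER} ^https?://[^/]+\.(biz|ru)(:[0-9]+)?(/|$) [NC]
RewriteRule .* - [F]
```

Anchoring the pattern to the host part avoids matching ".biz" or ".ru" somewhere later in the referring URL's path or query string.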

Hope this helps!

capnrob
Posted: Fri Jul 8, 2005 15:37

Well, after blocking everyone from the index - (including me) I tried another approach (I don't think I'm cut out for htaccess.)

I made a new file -say myindex.php and copied the contents of /blog/index.php to it.

I then put a simple redirect on the original index.php to my wrapper.
I linked Mambo to the myindex.php file and so far it's working fine.

Then I disallowed myindex.php in robots.txt so the search engines wouldn't find it.

*** Update that idea was moronic for obvious reasons. - ugh. Maybe if I used a stub file to call the blog ?

isaac
Posted: Fri Jul 8, 2005 15:59

How about putting this in your skin up in the head section? It's ugly, but it works.

Code

<script type="text/javascript">
<!--
// If the blog page is not inside a frame, send the visitor to the wrapper
window.onload=function() {
  if( this == top ) window.location.href='http://my.site.com/frame_wrapper.html';
}
// -->
</script>

If you put this in the head of the wrapper page, it can prevent duplicated frames (where the frameset holds another copy of the frameset).

Code

<script type="text/javascript">
<!--
window.onload=function() {
  if( this != top ) top.location.href=window.location.href;
}
// -->
</script>
elpie
Posted: Sat Jul 9, 2005 01:48

CapnRob wrote:

Well, after blocking everyone from the index - (including me) I tried another approach (I don't think I'm cut out for htaccess.)

Sorry Rob, I thought I had explained what my .htaccess does. It does block all direct access to the blog - for everyone - BUT leaves the access from the Mambo wrapper intact.

So, for me, if someone goes to domain.com/blog/ they are blocked. If they go into my MOS site and click the menu item that calls the blog inside the Mambo wrapper, they get access as usual.

capnrob
Posted: Sat Jul 9, 2005 14:04

That's ok - I think my .htaccess hates me. When I put in something like yours it even blocked Mambo - maybe I had the syntax a bit off or I was a path off from the root directory - but I found a workaround. I renamed my index using a stub file and then forwarded all calls to blog/index.php to the Mambo wrapper (then in robots.txt I disallowed all crawlers from my new index so I wouldn't have the same problem).

Now that I think of it - when I ping, that sends out the stub file call, doesn't it?
Sigh, I hadn't thought of it until this morning... maybe I can hard code my pings...

Other than that, though, everyone pretty much ends up at the right place.

I wish someone would make a b2evo component for Mambo. That would make my life complete.

That and maybe a few dozen beers ;)

elpie
Posted: Mon Jul 11, 2005 14:27

In case it might help, I will tell you exactly what I did.
I was running Mambo from the root and the blog from a directory called, "blog". I hate iframes so for some time I had the blog link hardcoded into my MOS nav and I did a template that looked almost exactly the same, but with a link which made it clear that that particular link went back to the main site.
Then I started getting hit too hard by spammers and decided to make a change. I brought the blog "into" MOS with the wrapper, and decided I wanted to stop direct access - hence the post here.
The htaccess I posted here is in the blog folder, leaving the main htaccess, with all the MOS stuff in it, untouched.
Once I had established that this was working for me, I changed the folder name from "blog" to something else.
And sat back and watched who and how often spammers were hitting on the "old blog". That's when I decided to send them to hell (or, at least 301 them to oblivion).
My bandwidth use has dropped amazingly, there has been no comment or referral spam, but the genuine visitors to my site are still visiting my blog.

I sure hope you can get your htaccess working because you are in for a whole lot of work otherwise.
If you have everything in root and are using just the one htaccess you will have to be very careful about the order.

Hope this helps somewhat.

elpie
Posted: Mon Jul 11, 2005 14:33

Capn - it just occurred to me...
If you do have Mambo and b2evo in the same directory you are very likely to run into problems.
If you haven't, then ignore the rest.
If you have, you should really move your blog into a blog folder. Makes management and htaccess easier and reduces the risks of conflict with Mambo.
Just a thought :-)

I agree with you about bridging/integrating b2evo and Mambo - that's been on my wish list for some time now.

capnrob
Posted: Mon Jul 11, 2005 16:35

Nope, my blog's a subdirectory lower than the Mambo install.

BUT - after screwing around for half the morning I was able to get a .htaccess file to work! It's probably overkill and sloppy but it does the job.

W00T! !! Thanks everyone for helping me out here -- After such a long quest I feel like I should get an endscreen :D :D :D

elpie
Posted: Mon Jul 11, 2005 16:40

Great news! I'm so glad you finally got it working :-)

capnrob
Posted: Mon Jul 11, 2005 18:12

Sadly, though, there are consequences: all search engine strings will now fail no matter what, because the link for the search will not pass through Mambo to b2, so every search string returns the current front page.

I guess I can live with that....

you couldn't even fix that with a re-write ....

I'll have to get all famous without using the search engines I guess.

capnrob
Posted: Mon Jul 11, 2005 20:50

I came up with a middle-ground solution for this that may help. I let the search bots in, and if someone follows a link from a search engine they get through as well. All others go to the wrapper. Here's the .htaccess, for what it's worth:

RewriteCond %{HTTP_REFERER} !^http://technicallyoverboard.com/cms.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.technicallyoverboard.com/cms.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://search.msn.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.google.com/search.*$ [NC]
RewriteCond %{HTTP_USER_AGENT} !googlebot|askjeeves|msnbot|^inktomi|^yahoo [NC]

RewriteCond %{REQUEST_URI} ^/blog/.* [NC]

RewriteRule ^index\.php$ /
