Allowing/using relative URLs

Started by on Aug 03, 2006 – Contents updated: Aug 03, 2006

Aug 03, 2006 04:22    

Not sure if this is related to the original problem, but I noticed that relative URLs are marked as bad URLs. For instance:

<img src="/homepage/images/scale3.jpg" alt="scale" width="71" height="71" align="right" />
<img src="../homepage/images/scale3.jpg" alt="scale" width="71" height="71" align="right" />
<img src="./homepage/images/scale3.jpg" alt="scale" width="71" height="71" align="right" />
<img src="homepage/images/scale3.jpg" alt="scale" width="71" height="71" align="right" />

Those all generate a bad URL error. I had a fix in my 1.6 version which basically added '' to the list of allowed URI schemes, and that seemed to do the trick. Is this something that could be added to trunk? I'll try to dig up all of my fixes.

Aug 02, 2006 20:31

Here are the changes I implemented in 1.6. I haven't merged them into my copy of 1.8 yet, so the file names may have changed in the new version.

conf/_formatting.php

$allowed_uri_scheme = array
(
	'',      # for relative URLs
	'http',
	'https',
	'ftp',
	'gopher',
	'nntp',
	'news',
	'mailto',
	'irc',
	'aim',
	'icq'
);

...

/**
 * URI schemes allowed for URLs in comments and user profiles:
 * @global array
 */
$comments_allowed_uri_scheme = array
(
	'',        # allow URLs relative to site root
	'http',
	'https',
	'ftp',
	'gopher',
	'nntp',
	'news',
	'mailto',
	'irc',
	'aim',
	'icq'
);

evocore/_misc.funcs.php

/**
 * Check the validity of a given URL
 *
 * Checks allowed URI schemes and URL ban list.
 * URL can be empty.
 *
 * Note: We have a problem when trying to "antispam" a keyword which is already blacklisted
 * If that keyword appears in the URL... then the next page has a bad referer! :/
 *
 * @param string Url to validate
 * @param array Allowed URI schemes (see /conf/_formatting.php)
 * @return mixed false (which means OK) or error message
 */
function validate_url( $url, & $allowed_uri_scheme )
{
	global $debug, $Debuglog;

	if( empty($url) )
	{ // Empty URL, no problem
		return false;
	}

	// minimum length: http://az.fr/
	if( strlen($url) < 13 )
	{ // URL too short!
		$Debuglog->add( 'URL &laquo;'.$url.'&raquo; is too short!', 'error' );
		return T_('Invalid URL');
	}

	
	$pattern_url = '<^                             # start 
		(                                          # there may not be a scheme if the URL is relative to the site root which is OK 
			(                                      # 
				([a-z][a-z0-9+.\-]*)               # scheme 
				:[0-9]*                            # port optionally 
				//                                 # allow absolute URLs only 
			)                                      # 
			|                                      #
			/                                      # or URLs relative to site root 
		)                                          # 
		[a-z0-9][a-z0-9~+.\-_,:;/\\\\*]*           # Don t allow anything too funky like entities 
		([?#][a-z0-9~+.\-_,:;/\\\\%&=?#*\ \[\]]*)? # 
		$>ix';
	if( ! preg_match($pattern_url, $url, $matches) )
	{ // Cannot validate URL structure
		$Debuglog->add( 'URL &laquo;'.$url.'&raquo; does not match url pattern!', 'error' );
		return T_('Invalid URL &laquo;'.$url.'&raquo; ');
	}

	$scheme = ( ( isset( $matches[3] ) ) ? strtolower($matches[3]) : '' );
	if( !in_array( $scheme, $allowed_uri_scheme ) )
	{ // Scheme not allowed
		return T_('URI scheme &laquo;'.$scheme.'&raquo; not allowed');
	}

	// Search for blocked URLs:
	if( $block = antispam_check($url) )
	{
		if( $debug ) return 'URL &laquo;'.$url.'&raquo; refused. Debug info: blacklisted word: ['.$block.']';
		return T_('URL &laquo;'.$url.'&raquo; not allowed');
	}

	return false; // OK
}
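
To see which of the four example paths from the first post this pattern accepts, it can be run on its own outside b2evolution. This is only a sketch: the pattern is copied from the validate_url() posted above with the inline comments shortened, and $samples and $results are names made up for this check. As posted, only the path starting with a slash matches, which fits the "relative to site root" comments; the other three forms are still rejected.

```php
<?php
// Same pattern as in the posted validate_url(), inline comments shortened.
$pattern_url = '<^
	(
		(
			([a-z][a-z0-9+.\-]*)   # scheme
			:[0-9]*                # port optionally
			//
		)
		|
		/                          # or URL relative to site root
	)
	[a-z0-9][a-z0-9~+.\-_,:;/\\\\*]*
	([?#][a-z0-9~+.\-_,:;/\\\\%&=?#*\ \[\]]*)?
	$>ix';

// The four image paths from the first post (hypothetical test list)
$samples = array(
	'/homepage/images/scale3.jpg',
	'../homepage/images/scale3.jpg',
	'./homepage/images/scale3.jpg',
	'homepage/images/scale3.jpg',
);

$results = array();
foreach( $samples as $url )
{
	$results[$url] = (bool) preg_match( $pattern_url, $url );
	echo $url, ' => ', ( $results[$url] ? 'matches' : 'no match' ), "\n";
}
```

Note that the strlen() check in validate_url() runs before the pattern, so a root-relative path shorter than 13 characters, such as /a.jpg, would still be refused as too short.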

Aug 03, 2006 04:21

It's planned to allow relative URLs, but at the same time we must make sure that they are made absolute again in the XML/RSS feeds.

Basically, it should get fixed in validate_url() and where the content gets displayed for/through the feeds.

I think that in the feeds we should prepend the URL of the item's blog to any links (href and src attributes) that do not start with a protocol ("\w+://"). If the first character of the URI is not a slash, the slash should be added or kept from the blog URL.

However, what about blog URLs like http://example.com/index.php?blog=1 (as opposed to a blog on its own domain or subdomain)? I think we should use $baseurl if the blog has no absolute URL without parameters (detected by a question mark).
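
A literal reading of the proposal above could look roughly like the following. It is a sketch, not b2evolution code: absolutize_url() and absolutize_links() are hypothetical names, the closure assumes PHP 5.3 or later, and only double-quoted href and src attributes are handled. It joins the base and the link with a single slash and does not do full RFC 3986 resolution, so "../" segments are not collapsed, and links without "//" such as mailto: would need a separate check.

```php
<?php
// Hypothetical helpers, not part of b2evolution.

/**
 * Make a single URL absolute for use in a feed.
 */
function absolutize_url( $url, $blog_url, $baseurl )
{
	if( preg_match( '~^\w+://~', $url ) )
	{ // Already starts with a protocol, leave it untouched
		return $url;
	}

	// A blog URL with parameters (e.g. index.php?blog=1) cannot serve as a prefix
	$base = ( strpos( $blog_url, '?' ) === false ) ? $blog_url : $baseurl;

	// Join with exactly one slash between the base and the relative part
	return rtrim( $base, '/' ).'/'.ltrim( $url, '/' );
}

/**
 * Rewrite every double-quoted href/src attribute in a block of HTML.
 */
function absolutize_links( $content, $blog_url, $baseurl )
{
	return preg_replace_callback(
		'~\b(href|src)="([^"]*)"~i',
		function( $m ) use ( $blog_url, $baseurl )
		{
			return $m[1].'="'.absolutize_url( $m[2], $blog_url, $baseurl ).'"';
		},
		$content );
}
```

For example, absolutize_links( '<img src="/a.jpg" alt="a" />', 'http://blog.example.com', 'http://example.com/' ) would rewrite the src to http://blog.example.com/a.jpg, while a blog URL of http://example.com/index.php?blog=1 would fall back to $baseurl.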

More thoughts? :)

(I'll split this thread)

Oct 17, 2006 05:18

This fix should allow for links to "#some_internal_page_anchor" as well. Currently, you have to use a fully qualified URL and add the anchor to the end, which may not be what you want to happen.

For instance, I was trying to add a link in my blog's long description text to an anchor in my skin. The long description shows up everywhere that the skin is used, and the anchor will always be valid. But having to hard code the URL means a user could be on a specific post when they click the link, which now redirects them to the site root rather than just an internal anchor.
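
One way to let such links through would be a separate check for anchor-only URLs that runs before the length and pattern checks in validate_url(). The helper below is hypothetical (is_anchor_only_url() is not a b2evolution function, and the set of characters allowed in an anchor name is an assumption), so take it as a sketch of the idea only.

```php
<?php
// Hypothetical helper, not part of b2evolution: detects links that consist
// only of a fragment, e.g. "#some_internal_page_anchor".
function is_anchor_only_url( $url )
{
	// "#" followed by at least one letter, digit, underscore, dot, colon or hyphen
	return (bool) preg_match( '~^#[a-z0-9_.:\-]+$~i', $url );
}

// Possible use at the top of validate_url(), before the strlen() check:
// if( is_anchor_only_url( $url ) ) return false; // false means OK
```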

Oct 17, 2006 19:29

Very good point. Just committed for 1.9. Thanks for reporting.

