Mods: Automatic Base Detection
Summary
Moving an application from one subfolder to another, or into the root folder, is sometimes difficult. Moving your application to another domain is a similar problem. It would be much simpler if the application used only relative URIs and everything else were handled automatically.
In most cases there are at least two places you have to remember to modify whenever you move your application. If you use redirections with mod_rewrite, you have to modify at least the RewriteBase, and most frameworks need extra configuration and special code to generate a relative URL.
CodeIgniter, for example, has functions like base_url() and site_url() that help you build links without knowing where the application currently lives. This is not very handy, though: you have to remember to use these functions in every link and to set the configuration value $config['base_url'] correctly. Automatic base URL detection is built in, but for security reasons the IP address of your server is used to compose the URI instead of your host name (read more about security concerns of detecting the host name automatically). In most cases this will not work, of course. Furthermore, these functions produce absolute URLs, which is not necessary in my opinion.
Example for HTML:
<a href="<?php echo base_url();?>controller/method/parameter">...</a>
Example for JSON-Objects:
{
url: "<?php echo base_url();?>controller/method/parameter"
}
Using only relative URIs
CodeCoupler includes an htaccess file that detects the current base folder and saves it in an environment variable. This variable is used to set an absolute path in the CodeIgniter configuration variable base_url, including the base folder of your application and excluding any host name. From then on, CodeIgniter functions like base_url() or site_url() produce an absolute URI without a host name and with the correct subfolder if needed.
In an HTML page you can set this base in your header and use only relative links:
<base href="<?= base_url() ?>">
By the way: if you are using the template html, the base is already set in every page automatically.
From now on you no longer have to worry about how to point to resources. Just always start with the name of the controller:
Example for HTML:
<a href="controller/method/parameter">...</a>
Example for JSON-Objects:
{
url: "controller/method/parameter"
}
Some Notes
With this modification, all libraries that use base_url() or site_url(), and helpers like redirect() that call these functions, will in most cases work as before. There are only a few points you should note:
1.) There is a limitation in the use of the second parameter $protocol of both functions. As we use only absolute paths without any host name or IP address, we cannot set a protocol. The functions check whether the given protocol is the current protocol (http or https) and show an error if it is not. As the limitation only applies if you switch the protocol from http to https (or vice versa), you should consider serving all your pages over a single protocol.
2.) Do not use the function current_url() for building links to the current controller. The function will return an absolute path to this controller, but the base tag in your header will prepend the absolute path to your application again. Using this function in redirect(current_url()) will work, but for links you should use get_instance()->uri->uri_string().
If you nevertheless really need this functionality, you must set your domain name in the variable base_url in config.php. You can of course use the following code to set your domain name and still benefit from the automatic base detection. The variable {CC_BASE_URL} will be automatically replaced with the detected base:
$config['base_url'] = 'http://YOUR_FIXED_DOMAIN/{CC_BASE_URL}';
Relative Links vs Absolute Links
If you search the internet for relative links vs absolute links, you will find essentially one opinion from SEO people, copied and translated many times. They say it is bad to use relative links for two reasons:
1. Duplicate Content
2. Scraping
Some of them mention that developing a big website needs staging, and therefore moving the application to different locations, domains and sometimes subfolders, but they conclude that this should be solved by the programmers "somehow". One website mentioned WordPress as exemplary because it uses only absolute URLs. I do not think that the author ever had to move a WordPress website from one domain to another. It is hell.
But let us see what we can do to make our SEO friends happy without making our application ugly.
The first point is that websites often are not configured well and they are reachable under four URLs:
1. http://www.domain.tld
2. http://domain.tld
3. https://www.domain.tld
4. https://domain.tld
The simplest solution is to redirect three of these addresses to the one final address with the search-engine-friendly 301 status code. This can be done with a few simple lines in the htaccess file.
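Such a redirect could look roughly like the following sketch. It assumes that https://www.domain.tld is the address you have chosen as the final one and that mod_rewrite is available; the host name is a placeholder, and the rules would have to be placed before the base detection rules explained further down.

```apache
# Assumption: https://www.domain.tld is the chosen final address.
RewriteEngine On

# Redirect if the request is not HTTPS or the host is not the final one.
RewriteCond %{HTTPS} !=on [OR]
RewriteCond %{HTTP_HOST} !^www\.domain\.tld$ [NC]
RewriteRule ^ https://www.domain.tld%{REQUEST_URI} [R=301,L]
```

If your server sits behind a proxy that terminates HTTPS, the HTTPS check may need to be adapted, otherwise the redirect could loop.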
The second point is nothing you should worry about. If a scraper copies your site, it makes hardly any difference whether you have used relative or absolute links. I think it is even easier for a scraper to modify an application that uses absolute links, because he only has to search and replace the domain name. It is like telling him: "Hey, look here. This is the place you have to change to make my application work under your domain!"
How this works
Every line of the htaccess file explained:
RewriteCond %{ENV:CC_TMP_URI} ^$
RewriteRule ^(.*)$ - [ENV=CC_TMP_URI:$1]
The first line checks whether the variable CC_TMP_URI is empty. If it is, the second line stores the path from the location of the htaccess file to the requested file in the variable CC_TMP_URI.
- Example Request: http://host/subdir/request-dir/request-file?var=val
- If this htaccess is placed in subdir the stored value will be: request-dir/request-file
RewriteCond %{ENV:CC_BASE_URL} ^$
RewriteCond %{ENV:CC_TMP_URI}::%{REQUEST_URI} ^(.*)::(.*?)\1$
RewriteRule ^ - [ENV=CC_BASE_URL:%2]
The first line checks whether the variable CC_BASE_URL is empty. The condition in the second line is always true; its only purpose is to capture some values in two groups. It works as follows:
Remember:
- Our example Request: http://host/subdir/request-dir/request-file?var=val
- Because of the rule above, the variable CC_TMP_URI has the value: request-dir/request-file
Now the condition is:
Test-String: request-dir/request-file::/subdir/request-dir/request-file
Condition  : (       group 1        )::(group2)request-dir/request-file
                                               ^-    This is the     -^
                                               ^-  Backreference \1  -^
                                               ^-  to the value of   -^
                                               ^-     "group 1"      -^
Now group 2 contains the base that we need. The third line saves this value in the variable CC_BASE_URL. And we are ready for our final rewrite:
RewriteCond $1 !^index\.php
RewriteRule ^(.*)$ %{ENV:CC_BASE_URL}index.php [L,QSA]
If you need to exclude some requests from being redirected to index.php, you can add them to the rewrite condition like this:
RewriteCond $1 !^(index\.php|assets)
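Put together, the rules described in this section could form a file like the following. This is only a sketch assembled from the snippets above, not necessarily the exact file shipped with CodeCoupler; the RewriteEngine On line and the excluded assets folder are assumptions.

```apache
# Assumption: mod_rewrite is available and has to be switched on here.
RewriteEngine On

# Step 1: remember the path relative to the location of this htaccess file.
RewriteCond %{ENV:CC_TMP_URI} ^$
RewriteRule ^(.*)$ - [ENV=CC_TMP_URI:$1]

# Step 2: strip that path from the end of the request URI to get the base.
RewriteCond %{ENV:CC_BASE_URL} ^$
RewriteCond %{ENV:CC_TMP_URI}::%{REQUEST_URI} ^(.*)::(.*?)\1$
RewriteRule ^ - [ENV=CC_BASE_URL:%2]

# Step 3: send everything except index.php and assets to index.php.
RewriteCond $1 !^(index\.php|assets)
RewriteRule ^(.*)$ %{ENV:CC_BASE_URL}index.php [L,QSA]
```

Because no RewriteBase is set anywhere, the same file should work unchanged whether it sits in the root folder or in a subfolder.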