Non-existent subdirectories of index.php do not return "404 not found"

Problem :


My website usually correctly returns 404 not found errors if I enter names of documents or folders that do not exist.



However, I have recently noticed something strange. The homepage of my website is called index.php and does not have any subdirectories because it is simply a file and I do not use a CMS.



But when I enter random subdirectories, such as /index.php/asdfghjk, the homepage is displayed instead of 404 not found as I would have expected. I've checked my .htaccess file and found nothing that would explain this behaviour.



The previous owner of the domain used Joomla, and the old site was structured in a way that all pages were subdirectories of /index.php. Could this have something to do with my problem, even though I am no longer using a CMS?


Solution :

What you are seeing is default behaviour on Apache.




"when I enter random subdirectories, such as /index.php/asdfghjk"




/asdfghjk is an additional path-segment in the URL. It's not strictly a "subdirectory". (Directories and subdirectories relate to a filesystem. The URL does not necessarily map directly to the filesystem.)



When additional path segments occur after a valid file (that maps to the filesystem) in the URL, it is called additional pathname information (or path-info), and is accessible with the PHP superglobal $_SERVER['PATH_INFO'] in your PHP script.



On Apache, whether path-info is valid in the URL is (by default) dependent on the handler that handles the requested resource. In this case, the PHP handler allows path-info by default, so there is no 404 and /asdfghjk is passed to index.php to be handled by your script. (Some CMSs use this URL pattern to create "pretty URLs" without having to resort to URL rewriting.) On the other hand, the default handler for static files such as text/html documents does not allow path-info, so /index.html/asdfghjk would result in a 404 by default, unless you explicitly enable it.
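As a minimal sketch of that last point, path-info could be enabled for a single static file from .htaccess. The file name index.html is only an illustration, and this assumes the host permits the directive in .htaccess (AllowOverride FileInfo):

```apache
# Hypothetical example: let /index.html/anything serve index.html instead of a 404
<Files "index.html">
    AcceptPathInfo On
</Files>
```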



To disable path-info for all requests, you can set the following at the top of the .htaccess file.



AcceptPathInfo Off
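If you would rather reject path-info only for the homepage and leave other scripts at their default behaviour, the directive can be scoped with a <Files> container. This is a sketch that assumes index.php is the only file name you want to restrict:

```apache
# Return 404 for /index.php/asdfghjk; files with other names keep the handler default
<Files "index.php">
    AcceptPathInfo Off
</Files>
```

With either form in place, a request for /index.php/asdfghjk should return a 404, while /index.php itself still serves the homepage.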



"The previous owner of the domain used Joomla and the old site was structured in a way that all pages were subdirectories of /index.php. Could this have something to do with my problem, even though I am no longer using a CMS?"




No, this has nothing to do with Joomla. The "problem" exists regardless. However, if you are seeing many such requests in your logs, then this will no doubt be because of the old URL structure. Any old URLs should be 301 redirected to the corresponding new URL (if any) in order to preserve SEO.
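A redirect of that kind might look like the following in the root .htaccess file. This is only a sketch: it assumes mod_rewrite is available, and both /index.php/old-page and /new-page.php are hypothetical placeholders for a real old URL and its replacement:

```apache
RewriteEngine On

# Hypothetical mapping: permanently redirect one old path-info style URL to its new location
RewriteRule ^index\.php/old-page$ /new-page.php [R=301,L]
```

Each old URL that still has an equivalent page would need its own rule (or a pattern, if the old and new structures correspond in a regular way).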



If they used Joomla, then they didn't need to use index.php in the URL at all (unless there was a restriction with the web host).


