Difference Between .HTACCESS and ROBOTS.txt File

.htaccess

By Ralf Appelt (cc)

Running your own blog or website gets tough when you are a total beginner on the technical side. I remember it took me a while to figure out the difference between the .htaccess file and robots.txt. I have always loved anything to do with IT and computers, but the first time you meet a new concept, your mind needs a bit of time to absorb how it works.

.htaccess and robots.txt are two files in the root folder of your website that you really can’t live without, so learning what they are and how they work is critical.

In this post I will explain what they are, why they matter and how you can use them to your advantage, with a few examples to make learning easier and faster.

The Difference Between .HTACCESS and ROBOTS.txt File

A first, general way to put the difference is that .htaccess is used mostly for internal access, whereas robots.txt manages external access.

“Internal” because .htaccess tells your Apache server how to handle page and file names, URLs and the way a user reaches these resources; it is how your site manages its own features.

Robots.txt, by contrast, regulates “external” access, because it tells search engines and other web tools what they may crawl and what they should skip (a human user can still browse and read everything). Well-behaved crawlers honor it, but it is a request, not a lock.

Some examples:

.htaccess for a WordPress installation with “nice” permalinks:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
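To make the logic of these rules easier to follow, here is a rough Python sketch of the decision they describe. It is only a simplified model, not how Apache actually works internally: the function name `resolve` and the sets `files` and `dirs` (standing in for what exists on disk) are made up for illustration.

```python
import re

def resolve(request_path, files, dirs):
    """Simplified model of the WordPress rewrite rules (hypothetical helper)."""
    # In .htaccess context the leading slash is stripped before matching.
    relative = request_path[1:] if request_path.startswith("/") else request_path

    # RewriteRule ^index\.php$ - [L]: direct requests for index.php pass through.
    if re.search(r"^index\.php$", relative):
        return request_path

    # RewriteCond ... !-f and !-d: is there a real file or directory?
    is_file = request_path in files
    is_dir = request_path.rstrip("/") in dirs

    # RewriteRule . /index.php [L]: any other non-empty path goes to WordPress.
    if re.search(r".", relative) and not is_file and not is_dir:
        return "/index.php"
    return request_path

# Assumed example site contents, not taken from a real installation.
files = {"/style.css"}
dirs = {"/wp-content"}

print(resolve("/my-first-post/", files, dirs))  # /index.php
print(resolve("/style.css", files, dirs))       # /style.css
```

So a “nice” URL that matches no real file ends up at index.php, where WordPress looks up the post, while real files such as stylesheets and images are served directly.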

Robots.txt for a WordPress blog that blocks Googlebot from the whole site and lets other crawlers in, but keeps those other crawlers out of the /familypics folder:

User-agent: *
Disallow: /familypics

User-agent: Googlebot
Disallow: /
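If you want to check how a crawler would read these rules, Python's standard library ships a robots.txt parser. The sketch below feeds it the example above; the helper name `allowed`, the crawler name "Bingbot" and the example.com URLs are only assumptions for the demonstration, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /familypics

User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent, url):
    """Return True if the parsed rules let `agent` fetch `url` (hypothetical helper)."""
    return parser.can_fetch(agent, url)

print(allowed("Googlebot", "http://example.com/"))                     # False
print(allowed("Bingbot", "http://example.com/familypics/photo.jpg"))   # False
print(allowed("Bingbot", "http://example.com/2013/hello-world/"))      # True
```

Googlebot matches its own group and is blocked everywhere, while any other crawler falls back to the `*` group and is only kept out of /familypics.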

In other words, .htaccess controls how your server responds to the people visiting your site, while robots.txt tells machines where they are welcome.

Why You Can’t Live Without .HTACCESS and ROBOTS.txt

Without .htaccess, your site will behave in its default way or may not work at all, depending on the software you use. For example, the standard installation of WordPress doesn’t include an .htaccess file: WordPress creates the file automatically when you configure your permalinks under Settings > Permalinks. If you don’t set your permalinks, your URLs will keep the default “unfriendly” form http://example.com/?p=203, with “203” being the post ID in the MySQL database.

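Without robots.txt, search engine crawlers and other tools are free to crawl your entire website without exceptions, including files you would rather keep out of search results. See the robots.txt example in the previous section. Keep in mind that robots.txt is only a request: scrapers can simply ignore it, so anything truly private needs real access control, such as password protection set up through .htaccess.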

The modern Web user doesn’t like “unfriendly” URLs, and search engines don’t want to find “junk” on your website when their spiders crawl it, so having your .htaccess and robots.txt files correctly configured only works to your advantage.

How You Can Benefit From These Two Files

The technical side of these two files is only a small part of what they bring to the table. There are also benefits connected with UX (User Experience) and SEO.

With .htaccess you can:

With robots.txt you can:

Takeaway:

Configure your .htaccess and robots.txt files as soon as you set up your website or blog. They are critical to the wellbeing of your site, both in the search index and from a user’s viewpoint.


About Luana Spinetti

Luana Spinetti is a freelance blogger and copywriter based in Italy. When she's not writing, she is drawing artwork and making websites. Web Marketing and SEO are in her basket for work-enhancement and for fun (but it still earned her a gig as an SEO consultant in 2012). Find her at LuanaSpinetti.com or at her Twitter account @luanatf.
