Archive for the ‘PDF’ Category

How to Redirect PDF Files with PHP, ASP or .htaccess for SEO

Sunday, October 4th, 2009

This is the follow-up to my previous post “SEO for PDFs - Optimizing PDF Files for Search Engines”, in which I recommended avoiding PDF files whenever possible. In this post, I will explain how to go about redirecting PDF files with .htaccess, PHP and ASP.

301 Redirects
When it comes to SEO, the best redirect to use is a 301 permanent redirect. A 301 tells search engines that the page has moved for good, so they can replace the old URL with the new one in their index, whereas search engines still have trouble handling 302 temporary redirects. And don’t even think about using a meta refresh redirect or a redirect generated by a client-side language such as JavaScript if you care about search engine optimization.

.htaccess 301 Redirect
This is the easiest and most straightforward option to use, but it’s only available on Apache web servers (typically running on Linux) that allow .htaccess overrides. Simply add the line below to your .htaccess file.

Redirect 301 /oldfile.pdf http://www.example.com/newpage.html
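
If there are many PDFs to redirect, the Redirect lines can be generated instead of typed by hand. The following is only a minimal Python sketch under stated assumptions: the function name htaccess_redirects and the example mapping are hypothetical, and the output still has to be pasted into your .htaccess file yourself.

```python
def htaccess_redirects(mapping):
    """Build one 'Redirect 301' line per old PDF path.

    `mapping` maps an old URL path (starting with "/") to the full new URL.
    """
    lines = []
    for old_path, new_url in sorted(mapping.items()):
        if not old_path.startswith("/"):
            raise ValueError("old path must start with '/': %r" % old_path)
        # Apache splits directive arguments on whitespace, so a path
        # containing spaces has to be wrapped in double quotes.
        if " " in old_path:
            old_path = '"%s"' % old_path
        lines.append("Redirect 301 %s %s" % (old_path, new_url))
    return "\n".join(lines)


# Hypothetical mapping, for illustration only.
print(htaccess_redirects({
    "/oldfile.pdf": "http://www.example.com/newpage.html",
    "/docs/price list.pdf": "http://www.example.com/prices.html",
}))
```

Each generated line has the same form as the single Redirect line above, so the result can be reviewed by eye before it goes live.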

PHP 301 Redirect
This is also normally a pretty easy way to redirect pages: you add a couple of lines of PHP to each page you want to redirect. But since PHP code cannot be inserted into a PDF file, we have to treat it a bit differently. Follow the steps below:

1. Rename the oldfile.pdf file to filename2.pdf, since a file and a directory cannot share the same name.
2. Create a new directory named “oldfile.pdf” in the same directory that the PDF is in.
3. Add an index.php file in the new oldfile.pdf directory.
4. Add the following PHP code to the top of the index.php file:

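<?php
// Send the permanent-redirect status, then the new location
header( "HTTP/1.1 301 Moved Permanently" );
header( "Location: http://www.example.com/new-page-to-redirect-to.php" );
exit; // stop here so nothing else is sent after the redirect headers
?>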

Now when the search engines access http://www.example.com/oldfile.pdf they will actually be served the http://www.example.com/oldfile.pdf/index.php file which contains the 301 redirect to the new page.

ASP 301 Redirect
This is very similar to the steps for the PHP 301 redirect above, but it is for websites that are hosted on Windows servers and use ASP.

1. Rename the oldfile.pdf file to filename2.pdf, since a file and a directory cannot share the same name.
2. Create a new directory named “oldfile.pdf” in the same directory that the PDF is in.
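3. Add an index.asp file in the new oldfile.pdf directory. (This only works if index.asp is set up as a default document on the server; IIS typically lists Default.asp out of the box, so you may need to add index.asp to the default documents.)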
4. Add the following ASP code to the top of the index.asp file:

<%@ Language=VBScript %>
<%
' Send the permanent-redirect status and the new location, then stop
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.example.com/new-page-to-redirect-to.asp"
Response.End
%>

Now when the search engines access http://www.example.com/oldfile.pdf they will actually be served the http://www.example.com/oldfile.pdf/index.asp file which contains the 301 redirect to the new page.
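
Whichever of the three methods you use, it is worth confirming that the old PDF URL really answers with a 301 and not a 302. Below is a minimal Python sketch for doing that check; the function names fetch_redirect and describe_redirect are hypothetical helpers, and the URLs are the placeholder ones used in this post.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10.0):
    """Send a HEAD request without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status, location, expected_target):
    """Summarize whether a response is the 301 redirect we expect."""
    if status == 301 and location == expected_target:
        return "ok: 301 permanent redirect to the expected page"
    if status == 301:
        return "check: 301, but Location is %r" % (location,)
    if 300 <= status < 400:
        return "problem: redirect uses status %d, not 301" % status
    return "problem: no redirect (status %d)" % status


# Example usage (performs a real HTTP request, so it is left commented out):
# status, location = fetch_redirect("http://www.example.com/oldfile.pdf")
# print(describe_redirect(status, location,
#                         "http://www.example.com/newpage.html"))
```

One caveat for the directory-based PHP and ASP methods: some servers (Apache with its default directory handling, for example) first redirect http://www.example.com/oldfile.pdf to http://www.example.com/oldfile.pdf/ with a trailing slash. In that case the sketch reports a "check" result, and the trailing-slash URL should be tested as well.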

I just realized I haven’t explained how to identify PDFs which are already ranking in the search engine results and thus are prime candidates for redirecting, so I’ll do that in another follow-up post shortly!

SEO for PDFs - Optimizing PDF Files for Search Engines

Monday, September 21st, 2009

There has long been a myth that “search engines can’t read PDFs” so it is better to put all content on an indexable HTML page. This may have been true a few years ago, but nowadays most of the major search engines have no trouble crawling and indexing PDF files. There are several fantastic guides out there about how to optimize a PDF for the search engines, such as this one from 2007 on Search Engine Land.

However, even though I just clearly stated that search engines can crawl and index PDF files, I still recommend putting text-rich content on an HTML page over a PDF file (whenever possible) for a few reasons:

1. No website navigation in PDFs. More often than not, the PDF does not maintain the same look and feel as the website, let alone provide any navigational elements. While it is true that PDFs can include clickable links, the vast majority of them do not have the site’s global navigation, and thus users will be left with nowhere to go but back to the search results.

2. Not able to track user behavior on PDFs. Sure, we can analyze server log files to see how many times a PDF file has been accessed, but we are not able to track visitors with JavaScript-based tools such as Google Analytics. Accurate tracking is absolutely essential to the success of any online marketing campaign.

3. Users may not be expecting PDFs. This may be just me, but I personally hate clicking through a search result and not immediately viewing a web page, but rather waiting for my browser to unfreeze while Adobe Acrobat takes its sweet time launching to load a PDF file. By the time the PDF is finally loaded, oftentimes I am already regretting that I clicked to view it while directing my cursor to the Back button.
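
The log-file analysis mentioned in point 2 can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: it assumes the server writes the Apache Common or Combined Log Format, the function name count_pdf_hits is hypothetical, and it counts only GET requests answered with status 200, so partial (206) downloads are ignored.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common/Combined Log Format
# line, e.g.:  "GET /docs/guide.pdf HTTP/1.1" 200 51234
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\b'
)


def count_pdf_hits(log_lines):
    """Count successful GET requests per PDF path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        # Drop any query string so /a.pdf?x=1 and /a.pdf count together.
        path = match.group("path").split("?", 1)[0]
        if (
            match.group("method") == "GET"
            and match.group("status") == "200"
            and path.lower().endswith(".pdf")
        ):
            hits[path] += 1
    return hits


# Example usage, assuming a log file named access.log (hypothetical path):
# with open("access.log", encoding="utf-8", errors="replace") as log:
#     for path, count in count_pdf_hits(log).most_common(10):
#         print(count, path)
```

This shows how often each PDF was requested, but nothing about what visitors did before or after, which is the limitation described in point 2.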

There are some cases in which PDFs should remain as PDFs, such as brochures and other print material, but articles and technical papers certainly can be converted to HTML pages. I will follow up shortly with another post on how to go about doing so.