MySQLTalk.com

Semantic Web, Spam on Steroids and the Importance of an Authenticated Crawl

November 1, 2009

Robots.txt is old. While there have been extensions to the protocol, we will need a major update in short order.

Why you ask?

Because if you thought spam was a problem with Web 1.0, when rogue crawlers just grabbed huge chunks of your site and republished it, competing with you in search engines on your own content, welcome to the Semantic Web aka Web 3.0.

Spiders that can crawl your content and UNDERSTAND it will bring about a whole new world of pain.

I agree with one of the top experts in blocking bad bots, Incredibill, when he says he’s all for an authenticated crawl. This topic has been brought up before, to mediocre public support, mostly, I think, because people don’t understand how much unauthenticated crawling hurts site owners.

If you wonder how bad it is, ask yourself this. Why do robots spoof Googlebot if they aren’t walking away with something valuable of yours?

You can’t just give away your merchandise (i.e. content) to anyone who asks without at least asking the robbers to identify themselves.

The Search Engines supposedly do have a method to prove the crawler on your site is for real. It’s called forward/reverse DNS.

Last I checked, Yahoo! and Microsoft agreed to support it. But some people have found legitimate robots from one of them that weren’t on the list, so you can find yourself blocking a legitimate bot. In other words, this solution isn’t trustworthy or serious… Maybe Google will introduce a new bot tomorrow and the engineer working on it never knew about Matt Cutts’s post… This happens when there are thousands of employees in your organization.
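For readers who haven’t seen the forward/reverse DNS check, here is a minimal sketch of the idea in PHP. The function name and the list of allowed domain suffixes are my own assumptions for illustration; each engine documents its own crawler domains. The two resolver arguments default to PHP’s built-in lookups and only exist so the logic can be tried without touching the network.

```php
<?php
/**
 * Forward/reverse DNS check for a visiting crawler (illustrative sketch).
 *
 * 1. Reverse: resolve the visitor's IP to a host name.
 * 2. The host name must end in a domain the engine publishes for its bots
 *    ($allowedSuffixes is an assumption; fill it from the engine's docs).
 * 3. Forward: resolve that host name back and require the original IP.
 *
 * Note: gethostbynamel() only returns IPv4 addresses.
 */
function isVerifiedCrawler($ip, array $allowedSuffixes,
                           $reverse = 'gethostbyaddr',
                           $forward = 'gethostbynamel')
{
    $host = call_user_func($reverse, $ip);
    // gethostbyaddr() returns the IP unchanged when the lookup fails.
    if ($host === false || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    $suffixOk = false;
    foreach ($allowedSuffixes as $suffix) {
        $suffix = '.' . ltrim(strtolower($suffix), '.');
        if (substr($host, -strlen($suffix)) === $suffix) {
            $suffixOk = true;
            break;
        }
    }
    if (!$suffixOk) {
        return false;
    }

    // The forward lookup must lead back to the IP that visited us.
    $ips = call_user_func($forward, $host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

A spoofed user agent fails step 1 or step 3, because the spoofer doesn’t control the engine’s DNS. The weak spot is the one described above: the suffix list has to be kept current by hand.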

What’s needed is a stronger protocol with authentication using RESTful principles of content negotiation. You want my robots.txt? When you visit, give me a unique key along with your URI so I can check you out.

We can then build backend tools that will identify the robot that visited, how many pages they took and how often they visited. Perhaps I can state that bots I haven’t explicitly given permission to can visit, but they only get N pages. I trust you. But only a little.
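To make the “I trust you, but only a little” idea concrete, here is a rough sketch of the decision a server could make per request. No such protocol exists yet, so every name here (the function, the key list, the page budget) is hypothetical, and a real design would need proper authentication rather than a bare key comparison.

```php
<?php
/**
 * Decide whether to serve one more page to a robot (hypothetical sketch).
 *
 * @param array       $trustedKeys    keys of robots I explicitly allowed
 * @param string|null $key            key the robot presented, if any
 * @param int         $pagesServed    pages this robot has already taken
 * @param int         $anonymousLimit the "N pages" granted to unknown robots
 * @return bool
 */
function mayCrawl(array $trustedKeys, $key, $pagesServed, $anonymousLimit)
{
    if ($key !== null && in_array($key, $trustedKeys, true)) {
        return true;                        // known robot: no page budget
    }
    return $pagesServed < $anonymousLimit;  // unknown robot: small budget
}
```

The page counter would come from the backend tools mentioned above, which already need to record who visited and how much they took.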

There’s no reason I need to publicly publish my robots.txt file. Every bit of information will be used against you. When machines are talking to machines, the possibility of abuse grows exponentially. And I haven’t seen anyone really talk about it since the day I got involved in the Semantic Web community. I’m no expert, just a student, but I’d have thought someone would have brought it up!

I’m also not an expert on authentication, so a robust method will have to be decided upon. But we do need to realize that for all the promise of the Semantic Web, there are forces who will misuse it badly, and we need to prepare ahead of time, unlike what we did for technologies like email.

Categories: Site News

Life Lesson: Pay Attention!

November 1, 2009

Do you ever wonder if there is a subtext to the life around you? That there is a hidden world that you just aren’t seeing. Well if you paid more attention, you might see a little.

We’re legally bound by fine print but how often do we take the time to read any of it? What made me think about it was one day I decided to try and read the credits after Two and a Half Men.

I froze the screen on the DVR and started reading. It was HILARIOUS! Turns out Chuck Lorre does this for all his shows. Have you ever watched Dharma and Greg, The Big Bang Theory or one of his other shows? Next time, catch the vanity cards.

Want another example of something hidden in plain sight? Try Brett Tabke’s robots.txt file. That’s where he keeps his personal blog!

And that’s all I have to say about that.

Note: please ignore this disclaimer except in court:
By reading this post you are legally bound to be my personal slave. Any disputes arising from this agreement will be decided by an arbitrator in Los Angeles. You certify that you agree to these terms by surfing away from this page or closing your browser. If you do not agree, then stay on this webpage. But be aware that aliens might abduct you if you loiter too long on any one webpage. This is true, I heard about it on Art Bell’s show.

Categories: Site News

Faux Image Generator

November 1, 2009

Faux Image generator with PHP (inspired by Faux Columns)

Problem:

Faux Columns is a workaround by Dan Cederholm for the fact that CSS “elements only stretch vertically as far as they need to.” This means we can’t get a background color taking up a whole column without using an image.

If like me, you prefer vi or notepad to Photoshop, creating these background images is a pain, but not so bad that you can’t live with it here and there.

However, in my copious “spare time”, I’m creating a CMS in Django with a Zend Framework front-end. The CMS will allow administrators to create their own style guide. I’d like to allow them to choose a background color, optionally with a border and have it show up as a background image for the full blown Faux Columns effect.

Note that initially I called this post Faux Image Generator FOR Faux Columns, but then I realized that as soon as the older browser(s) die off in usage, Faux Columns won’t be used as much. Yet this class does have other uses…

As they say in the Open Source Community, projects usually begin when a developer has an itch so let’s start scratching…

Solution:

Use PHP with the GD Library to generate these background images on the fly and apache trickery to fully leverage caching.

I’ve created a class to do just that:

Faux Image Generator

Note it might be hard to read the rest of this article without seeing the code. My apologies but I don’t write for a living, I code. I will try to come back to this and walk you through the steps a bit more if people are finding it hard to understand, so look at this post as a first draft…

Issues:

Great. So now we can generate a background image on the fly. We can give it a border. But we aren’t done. If we use dynamic scripting to generate layout images, we better make sure there’s some form of caching. There are many types of caching.

You’ve got caching we don’t care about here, like a caching proxy server, which your ISP may run to speed up page loads, or database caches, which keep expensive queries from being run over and over when the result set hasn’t changed.

We can also apply a cache to the image using Zend_Cache. But while this may help our server deal with the repeated hits to the same image, we get the best benefit if the user grabs the image from the browser cache.

To achieve this reliably you need to fool the browsers into thinking this is an image and not a program. There are several techniques out there, but for my money, it would be most elegant to make the image follow a naming convention and be called as if it were an actual image.

This is something Apache can handle. *As a bonus, perhaps we’ll even modify the class to actually create an image with the same name! Then if Apache finds the image, it will serve it. If not, it will serve the PHP file the first time only.

So first, let’s try to maximize the benefits of image caching by the browser. Per Google’s suggestions, let’s cache the file for a year but not more.

You can read the gory details on how to set the caching for images or I found Jeremy Zawodny’s caching instructions easier to read. You may have individual needs, as do I, so you may want to set it in your .htaccess file or your httpd.conf file for sitewide deployment.

Essentially you need to put these directives in:

ExpiresActive On
# A31536000 = one year (365 days) after the file is accessed
ExpiresByType image/gif A31536000
ExpiresByType image/png A31536000
ExpiresByType image/jpg A31536000
ExpiresByType image/jpeg A31536000
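The number after the “A” in those directives is a lifetime in seconds, counted from the moment the file is accessed. If you don’t feel like doing the multiplication in your head, a throwaway helper like this one works it out (the function name is just something I made up):

```php
<?php
// Build the "A<seconds>" argument for ExpiresByType.
// "A" means the lifetime is counted from the time of access.
function expiresAfterAccess($days)
{
    return 'A' . ($days * 24 * 60 * 60);
}

echo expiresAfterAccess(30), "\n";   // A2592000  (30 days)
echo expiresAfterAccess(365), "\n";  // A31536000 (one year)
```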

Now that we have that set up, let’s go about telling Apache to redirect the file to our php code in case it doesn’t find an image. So before we do anything else, we need to decide on a file naming convention. So what variables do we need?

$imgType, $bgColor, $bgWidth, $bgHeight, $bdLoc, $bdColor, $bdSize.

Apache’s rewrite engine, mod_rewrite, needs to know how to parse our filename. So here’s the naming convention I’ve gone with in the .htaccess as well as the file creation routine in the class:

$bgColor$bgWidthx$bgHeight$bdLoc$bdColor$bdSize.$imgType

so that would match:

/url/path/dedede5x10topf7f7ff5.png

It isn’t as easy to read as I’d like, but this is the first cut, maybe with some suggestions it can be improved.

The .htaccess looks like this:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

#let's only support 6 char colors even though script handles 3 chars
RewriteRule ^([a-f0-9]{6})([0-9]{1,4})x([\d]{1,5})(top|right|bottom|left)([a-f0-9]{6})([\d]{1,2})\.(png|gif|jpg)$  path/to/fauximage.php?bgColor=$1&bgWidth=$2&bgHeight=$3&bdLoc=$4&bdColor=$5&bdSize=$6&imgType=$7 [NC,L]

This .htaccess will display the image file if it exists, which it will the second time it’s called. On the first call it will send the variables to our class to save the file and then display the image.
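If you want to sanity-check the naming convention without going through Apache, the same pattern can be run through preg_match. This is only a sketch: parseFauxImageName() is a hypothetical helper and not part of the class, and the /i modifier stands in for the [NC] flag on the RewriteRule.

```php
<?php
/**
 * Split a faux image filename into the variables the rewrite rule passes on.
 * Returns null when the name does not follow the convention.
 */
function parseFauxImageName($filename)
{
    $pattern = '/^([a-f0-9]{6})([0-9]{1,4})x(\d{1,5})(top|right|bottom|left)'
             . '([a-f0-9]{6})(\d{1,2})\.(png|gif|jpg)$/i';

    if (!preg_match($pattern, $filename, $m)) {
        return null;
    }

    return array(
        'bgColor'  => $m[1],
        'bgWidth'  => (int) $m[2],
        'bgHeight' => (int) $m[3],
        'bdLoc'    => $m[4],
        'bdColor'  => $m[5],
        'bdSize'   => (int) $m[6],
        'imgType'  => $m[7],
    );
}

print_r(parseFauxImageName('dedede5x10topf7f7ff5.png'));
```

For the example filename this gives a dedede background, 5 wide by 10 high, with a top border of color f7f7ff and size 5, as a png. A three-character color is rejected, as the comment in the .htaccess says it should be.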

You should watch out for one thing. You don’t want people to use your personal php file to fix their background image issues. You may need to watch your logs and take action if someone figures out you’re using this class.

I’d love feedback, especially if it can make this class better, so please comment.

Also, if you want to be updated on changes to the code, or just hear more about the Los Angeles Dev community, MySQL, Zend Framework, Django, PHP or the Semantic Web, consider following me on Twitter @joedevon.

*Since I wrote the first draft of this post, I’ve updated the class to create the image on first load.

Categories: Site News

Setting up a Zend Framework project on Mac OS X Leopard

August 27, 2009

I’m about to do my first start-to-finish project on Mac OS X Leopard. Although I’ve previously moved a project over from an XP machine, it’s never quite as nice as working on a fresh project with a brand new laptop, so I figured I’d document the procedure as I go through it.

Unfortunately, I didn’t have time to document the process of getting PHP/MySQL/Zend Framework/PHPUnit/Eclipse (as well as Python, Django and mod_wsgi for the CMS) et al. on the Mac. It wasn’t all that easy, but I will say that in the end I opted for using Zend Server CE, and that part of it was really quite smooth.

It wasn’t my first choice because if there are bugs in a stack, you wind up debugging the stack rather than the components. The experience is not quite as useful. And the more layers between you and the tools you’re using, the less you tend to understand. But ZCE solved so many of the problems I had that it was well worth using as far as the Mac is concerned.

The scope of this post is to get the browser to display the index view without error and set up a PHPUnit test that looks for the client’s logo in the header, so you may want to get your logo ready.

NOTE: I use {braces} throughout wherever you need to substitute your own values for mine.

OK let’s get started… First you need a project home. I like to use this:

/Users/{yourname}/Workspace/{clientsname}/

This can be created manually and “used” by Eclipse when creating a new PHP Project, or you can create the directory by just starting a new PHP project from Eclipse and naming it.

Next you need the location of your media files. Essentially this is your public_html (otherwise known as www) folder. I like to use this location:

/Users/{yourname}/Sites/{clientsname}/

You will need to add this location to your PHP include path in Eclipse. (Right click on “PHP Include Path”, go to “Include Path->Configure Include Path”). Add it to the “libraries” tab, click on “Add external source folder”.

Good. Now we’ll use a symbolic link to map your public folder to your media (i.e. public_html) folder. Type the following, replacing your username and client name where appropriate of course:

ln -s /Users/{yourname}/Sites/{clientsname}  /Users/{yourname}/Workspace/{clientsname}/public

Next, I like to keep my own code in the library folder and have the controllers extend from my base controller. Even if I don’t put anything in my base controller at first, this saves a lot of time later: if I find that something gets called over and over again from many controllers, I can just add it to my base and not worry that the controllers won’t have it available. So create a directory in library with your name. I use Joed, you pick something else…

/Users/{yourname}/Workspace/{clientsname}/library/{Joed}

You need to add the library directory to your php include path. I did this through php.ini at first. But multiple projects clashed of course and I had to modify this later. If you plan to have more than one project on this machine, skip the next step.

~php.ini step~
If you are also set up with Zend Server CE, the location is /usr/local/Zend/etc/php.ini. Look for the line that says “include_path” and append a colon followed by your library location to the existing path. E.g.

.:/some/existing/path:/Users/{yourname}/Workspace/{clientsname}/library/

~END php.ini step~

The Zend Framework code will also live in the Library. The best way to handle upgrading and downgrading Zend Framework is to pick a directory, put all versions of ZF there without overwriting each other (e.g. /path/to/ZF/zf1.50, /path/to/ZF/zf1.75 etc…) and then symlink the version you want to run, to your library.
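If you switch versions often, the re-linking can be scripted. Here is a minimal sketch using PHP’s own symlink functions; the function name is hypothetical and the directory names are whatever layout you picked for your ZF versions. It refuses to touch anything that isn’t already a symlink, so it can’t wipe out a real Zend directory by accident.

```php
<?php
/**
 * Point $link at one of several side-by-side framework versions.
 *
 * @param string $versionDir directory of the version you want to run
 * @param string $link       where the symlink should live (e.g. library/Zend)
 * @return bool true on success
 */
function switchFrameworkVersion($versionDir, $link)
{
    if (!is_dir($versionDir)) {
        throw new InvalidArgumentException("No such version directory: $versionDir");
    }

    // Remove a previous link, but never a real file or directory.
    if (is_link($link)) {
        unlink($link);
    } elseif (file_exists($link)) {
        throw new RuntimeException("$link exists and is not a symlink");
    }

    return symlink($versionDir, $link);
}
```

Downgrading is then the same call with the older version’s directory.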

Since I’m using Zend Server CE, I’ll just go with whatever version is already installed. So we need to find the Zend directory, which is located here: “/usr/local/Zend/share/ZendFramework/library/Zend” and create a symlink as follows:

ln -s /usr/local/Zend/share/ZendFramework/library/Zend  /Users/{yourname}/Workspace/{clientsname}/library/Zend

Note that on my installation, ZCE already included the Zend Framework library in the php.ini. So your dev box won’t have the most performant php.ini setup. But no matter, you just need to avoid extra includes where possible on the server.

Now to set up your directory structure, you can save a couple steps by using the new Zend Tool. Although we have a couple of the directories already set up, I just tried creating a project and it installed the files correctly in the directories I wanted…so without further ado, type the following, making sure the path to your installation is correct:

/usr/local/Zend/share/ZendFramework/bin/zf.sh create project {clientname}

Great! Now let’s start to configure things a bit to our liking. In the “/application/Bootstrap.php”, I don’t really like how the modules work by default, so I don’t use it. I just put these lines inside the Zend_Application_Bootstrap_Bootstrap class:

protected function _initAutoload()
{
	$autoloader = new Zend_Application_Module_Autoloader(array(
		'namespace' => '',
		'basePath'  => dirname(__FILE__),
	));
	return $autoloader;
}
protected function _initDoctype()
{
	$this->bootstrap('view');
	$view = $this->getResource('view');
	$view->doctype('XHTML1_STRICT');
}
protected function _initActionHelpers()
{
	Zend_Controller_Action_HelperBroker::addPath(
		APPLICATION_PATH."/controllers/helpers"
	);
}

This way I can autoload the models without having to add prefixes to them. There are other approaches, do what works for you. I will link to a couple other ideas for you to consider:

Zend Framework Modules

Another #ZF Module config

Now open “config/application.ini”. IF AND ONLY IF you’ve added the library to the php.ini, you can remove the includes line from here:

;;includePaths.library = APPLICATION_PATH "/../library"

Now let’s register our custom code directory “/Users/{yourname}/Workspace/{clientsname}/library/{Joed}” into the namespace with the following:

;; custom library
autoloadernamespaces.{joed} = "{Joed}_" ;remove braces and use your custom library directory on this line

Add the layout and view:

resources.layout.layout = "layout"
resources.layout.layoutPath = APPLICATION_PATH "/layouts/scripts"
resources.view[] =

And now the database info. Setting isDefaultTableAdapter is handy if you only have one adapter; it will save you a bit of code when calling your dbAdapter:

;; dB configs
resources.db.adapter = {PDO_MYSQL}
resources.db.params.host = {localhost}
resources.db.params.username = {username}
resources.db.params.password = {password}
resources.db.params.dbname = {clientdb}
resources.db.params.driver_options.PDO::MYSQL_ATTR_USE_BUFFERED_QUERY = true
resources.db.isDefaultTableAdapter = true
resources.db.params.profiler = true

Read my previous post on why I have MYSQL_ATTR_USE_BUFFERED_QUERY on. And only turn on the profiler if you’re using it.

The rest of the settings in “application.ini” will do for now…

Open your “HOSTS” file, and point 127.0.0.1 to whatever hostname you want to use (e.g. {yourname.clientname.com} ).

Open your “httpd.conf” and add directory specific info if needed…e.g.:

<Directory "/Users/{yourname}/Sites/{clientsname}">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.2/mod/core.html#options
    # for more information.
    #
    Options Indexes FollowSymLinks Includes ExecCGI
    #Options Indexes FollowSymLinks MultiViews

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   Options FileInfo AuthConfig Limit
    #
    AllowOverride All

    #
    # Controls who can get stuff from this server.
    #
    Order allow,deny
    Allow from all

</Directory>

Then in the extras folder, open httpd-vhosts.conf and add your VirtualHost info, which should look something like this:

<VirtualHost *:80>
    ServerAdmin {yourname}@{yrmailserver}.com
    ServerName {nameYouChoseInHostsFile}
    DocumentRoot "/Users/{yourname}/Sites/{clientsname}"
    ErrorLog "/usr/local/Zend/apache2/{client}-error_log"
    CustomLog "/usr/local/Zend/apache2/{client}-access_log" common
</VirtualHost>

Restart apache.

BTW, as a tip, in Eclipse, .htaccess files are hidden by default. Assuming you have a PHP project, you should have your directories and files in an explorer-like tree under a tab called “PHP Project”. There is an arrow in that box, select it and on the dropdown click on “Filters”. Then uncheck *.resources. This will enable viewing .htaccess files, although you may see a few Eclipse related system files…which I’d advise ignoring.

It’s up to you where you want to set the APPLICATION_ENV constant. I like to put it in “public/index.php”, so I comment out the APPLICATION_ENV line in “public/.htaccess”. Then, because “public” is really a symlink into the Sites folder, I change the following APPLICATION_PATH line in “public/index.php” to match the aforementioned setup on the Mac:

|| define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));

becomes:

|| define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../../Workspace/{clientsname}/application'));

If you’ve done everything right and point your browser at your client’s hostname, you will get an error about a missing layout file. So let’s create a layout now. Underneath your “application” directory, create a “layouts/scripts” directory (matching the resources.layout.layoutPath setting above) and then a “layout.phtml” file in it with the following contents:

<?php echo $this->doctype() ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Hello World!</title>
  <?php echo $this->headLink()->appendStylesheet('/css/global.css') ?>
</head>
<body>
<div id="header" style="background-color: #EEEEEE; height: 30px;">
    <div id="header-logo" style="float: left">
        <b>Yay it works!</b>
    </div>
    <div id="header-navigation" style="float: right">
		howdy.
    </div>
</div>

<?php echo $this->layout()->content ?>

</body>
</html>

and the error should be gone.

Now let’s create a PHPUnit Test. This post does not cover installing PHPUnit. So before you do the next step, make sure it’s installed by typing “phpunit” from the terminal. If that fails, you need to get it working…

Now inside your tests directory (created by Zend Tool earlier), there should be an empty file called “phpunit.xml”. Open it. Put this there with your changes for {clientname} of course:

<phpunit bootstrap="./application/bootstrap.php" colors="true">
	<testsuite name="{clientname}">
		<directory>./</directory>
	</testsuite>

	<filter>
		<whitelist>
			<directory suffix=".php">../application/</directory>
			<exclude>
				<directory suffix=".phtml">../application/</directory>
				<file>../application/Bootstrap.php</file>
				<file>../application/controllers/ErrorController.php</file>
			</exclude>
		</whitelist>
	</filter>

	<logging>
		<log type="coverage-html" target="./log/report" charset="UTF-8"
		yui="true" highlight="true" lowUpperBound="50" highLowerBound="80" />
		<log type="testdox" target="./log/textdox.html" />
	</logging>
</phpunit>

Now we need to create a bootstrap for PHPUnit to use. It should exist already under ‘tests/application/bootstrap.php’, but empty. Add this:

<?php

// Define path to application directory
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../../application'));

// Testing env has the mock database
define('APPLICATION_ENV', 'testing');

/** Zend_Application */
require_once 'Zend/Application.php';
require_once 'ControllerTestCase.php';

Note that the APPLICATION_ENV = ‘testing’. This lets you send different configuration directives when testing, so be sure and keep that in mind for the “application.ini” as you build your app.

Now create ‘ControllerTestCase.php’ in the ‘tests/application’ directory and put this in it, changing the clientname of course:

<?php
require_once 'Zend/Test/PHPUnit/ControllerTestCase.php';
abstract class ControllerTestCase
	extends Zend_Test_PHPUnit_ControllerTestCase
{
	/**
	 * @var Zend_Application
	 */
	protected $application;

	public function setUp()
	{
		$this->bootstrap = array($this, 'appBootstrap');
		// setup the host var
		$_SERVER['HTTP_HOST'] = '{clientname}';//this can probably be done on phpunit command line
		parent::setUp();
	}
	public function appBootstrap()
	{
		$this->application = new Zend_Application(APPLICATION_ENV,
											  APPLICATION_PATH . '/configs/application.ini');
		$this->application->bootstrap();
	}
}

I’ve set the $_SERVER['HTTP_HOST'] here because in my previous project it helped me test certain things that were expecting the HTTP_HOST that I would get when visiting the page through the browser but wasn’t appearing through PHPUnit. I later found out that there’s a switch in the command line to provide this functionality, so if you feel like digging through the manual and using that instead, you can remove that line.

Now under ‘/tests/application’ create a directory called ‘controllers’ and create a file in there called ‘IndexControllerTest.php’ with these contents:

<?php
/**
 * @group controllers
 * @group indexcontroller
 * @author {Joe Devon}
 *
 */
class IndexControllerTest extends ControllerTestCase
{

	public function setUp()
	{
		/* Setup Routine */
		parent::setUp();
	}

	public function tearDown()
	{
		/* Tear Down Routine */
		parent::tearDown();
	}

	public function testCanDoUnitTestInIndexController()
	{
		$this->dispatch('/');
		$this->assertTrue(true);
	}

	/**
	 * @group index
	 * @return void
	 */
	public function testIndexAction()
	{
		$this->resetRequest()
			 ->resetResponse();
		$this->dispatch('/');
		$this->assertNotRedirect();
		$this->assertController('index');
		$this->assertAction('index');
		$this->assertResponseCode(200);
	}

	public function testErrorURL()
	{
		$this->dispatch('/index/foo');
		$this->assertController('error');
		$this->assertAction('error');
		$this->assertResponseCode(404);
	}

	public function testIndexLoadsDefaultHeaderSuffix()
	{
		$this->dispatch('/index');
		$this->assertQueryContentContains('div#navigation', 'logo.jpg');
	}
}

Now from a terminal window, run:

$ phpunit --verbose

You should see 4 tests, 9 assertions and 1 failure. The failure is caused by the lack of a logo in the view. If you aren’t clear on why, make sure to read the code above. We will need to create a view for index and a div id=”navigation” inside the view, and put ‘logo.jpg’ inside the div.

If you don’t know why we created a test before the code to make the test pass, it’s because we’re doing what’s called Test Driven Development. TDD is beyond the scope of this post, but look it up if you aren’t familiar with it. You will definitely learn something interesting…

We’re almost done. We just need to clean up a couple of things and we’ll get this last test to pass.

Remember we created a “/Users/{yourname}/Workspace/{clientsname}/library/{Joed}” directory? In there, create a file called ‘Controller.php’ with these contents:

<?php
/*
 * Class: {Joed}_Controller
 *
 * Base controller
 */
class {Joed}_Controller extends Zend_Controller_Action
{

	/**
     * Pre-dispatch routines
     *
     * Called before action method. If using class with
     * {@link Zend_Controller_Front}, it may modify the
     * {@link $_request Request object} and reset its dispatched flag in order
     * to skip processing the current action.
     *
     * @return void
     */
	public function preDispatch()
	{
		parent::preDispatch();
	}

    /**
     * Post-dispatch routines
     *
     * Called after action method execution. If using class with
     * {@link Zend_Controller_Front}, it may modify the
     * {@link $_request Request object} and reset its dispatched flag in order
     * to process an additional action.
     *
     * Common usages for postDispatch() include rendering content in a sitewide
     * template, link url correction, setting headers, etc.
     *
     * @return void
     */
	public function postDispatch()
	{
		parent::postDispatch();
	}
} // end class

I’ve got a couple of methods I used for my last project in here that I should publish, but I’ve got to make them more generalized and reusable first. Needless to say, you need your own name in the code, and this file is where you put the custom code that you want all your controllers to inherit.

Now open /Users/{yourname}/Workspace/{clientsname}/application/controllers/IndexController.php and change ‘class IndexController extends Zend_Controller_Action’ to read ‘class IndexController extends {Joed}_Controller’.

Finally, open /Users/{yourname}/Workspace/{clientsname}/application/views/scripts/index/index.phtml (the view script for the index action of the index controller) and point to your logo with a minimum of this sprinkled somewhere in your view:

<div id="navigation">
<img src="logo.jpg">
</div>

Now you should get no failures on your tests and you are ready to begin!

If you enjoyed this post and want to follow my posts in the world of Zend Framework, MySQL & the web, consider “following” me on twitter @joedevon.

Categories: Site News

Important to note that Zend Framework _redirect defaults to 302

July 12, 2009

From the Zend Framework Manual:

_redirect($url, array $options = array()): redirect to another location. This method takes a URL and an optional set of options. By default, it performs an HTTP 302 redirect.

Now you can argue about what the 302 header code SHOULD do. It’s spelled out in the HTTP/1.1 spec (RFC 2616) hosted by the W3C:

10.3.3 302 Found

The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.

The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).

If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

Note: RFC 1945 and RFC 2068 specify that the client is not allowed to change the method on the redirected request. However, most existing user agent implementations treat 302 as if it were a 303 response, performing a GET on the Location field-value regardless of the original request method. The status codes 303 and 307 have been added for servers that wish to make unambiguously clear which kind of reaction is expected of the client.

But in the past 302’s have been royally borked by search engines.

For some history, I was peripherally involved in documenting and tracking down the initial cause of these problems when a popular directory app (more or less correctly) used 302 redirects to link to resources. The creator of the program felt that he wasn’t doing anything wrong, but it turned into massive threads in the Search Engine forums.

Due to a glitch, a 302 on these directory sites handed users of the program the search rankings (SERPs) of the sites they linked to, along with the ire of indignant site owners everywhere. Honest people were called spammers, fights broke out. It was ugly.

It also took a long time for the search engines to fix. Really long. And just when you thought it was safe to go back in the water, it broke again.

Eventually it seems the major problems got fixed for good. There have been countless discussions about this topic. Although I haven’t read the latest, in my opinion, 302s hurt your ranking. Just last week a friend of mine asked me to check out her potential client’s low ranked corporate website (they are a household name), and lo and behold, their homepage has a 302 redirect!

The purpose of this post is mostly to alert Zend Framework developers in case they assumed, as did I, that a redirect would send a 301 header code. Used in the wrong place, the results can be costly!

Therefore, make it a habit to pass in the code you desire each time you use _redirect().
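In a Zend Framework 1 action controller that means passing the 'code' option, as shown in the comment at the end of the block below. To show what “require a valid code” could look like outside the framework, here is a small stand-alone sketch; buildRedirect() is a hypothetical helper of mine, not part of Zend Framework, and the list of codes it accepts is my own choice.

```php
<?php
/**
 * Build the status line and Location header for a redirect.
 * There is deliberately no default for $code: the caller has to choose.
 */
function buildRedirect($url, $code)
{
    $reasons = array(
        301 => 'Moved Permanently',
        302 => 'Found',
        303 => 'See Other',
        307 => 'Temporary Redirect',
    );

    if (!isset($reasons[$code])) {
        throw new InvalidArgumentException("Unsupported redirect code: $code");
    }

    return array(
        "HTTP/1.1 $code " . $reasons[$code],
        'Location: ' . $url,
    );
}

// The same habit inside a Zend Framework 1 action controller:
//     $this->_redirect('/new-location', array('code' => 301));
```

Forgetting the code now fails loudly instead of quietly sending a 302.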

In my opinion, the redirect should either default to 301, or, since the ramifications of an incorrect redirect can be harmful to search engine rankings, a valid code should be required to be passed in. This will engender good habits.

I’d love to hear comments on this issue. In particular, to answer these questions:

  1. What do you think the default in Zend Framework should be and why?
  2. Does W3C’s definition of the 302 work in the real world?
  3. Should the W3C make any changes to redirect codes and if so, what should they change?
  4. What code should you send to redirect pages in various situations? After logging in… Accessing a page when not logged in (it could be a valid page for a logged in user)… etc..

For those who are curious, the default is set in /path/to/ZendFramework1.84pl1/library/Zend/Controller/Action/Helper/Redirector.php, right after the class statement:
protected $_code = 302;

Categories: Zend