How to Determine Which Nodes Are Using Pathauto Paths

Submitted by Barrett on Sat, 11/17/2012 - 19:39

Recently at work, one of the site managers asked for a listing of which nodes on the site were using the auto-generated Pathauto paths and which were not. Should be easy, right? Just figure out where Pathauto stores whatever variable it uses to indicate that a node has the "Automatic alias" parameter set and dump the list. Turns out it's not that simple. Pathauto doesn't store such a variable. Instead, it runs a set of logic checks each time a node is edited to determine whether the "Automatic alias" flag should be set.

What you have to do, then, is load each node in question and compare the path it's using against the path Pathauto would generate for it. The pathauto_create_alias() function will tell you the latter, so long as you pass 'insert' as the second parameter instead of 'update'. (Otherwise, the function returns an empty value if it determines a new alias is not needed.)

The full script I put together to satisfy the site manager's request was:

// make pathauto functions accessible
module_load_include('inc', 'pathauto');

// get published nodes from the database
$query = "select nid from {node} where status = 1";
$resultSet = db_query($query);

// load each node, compare path to what pathauto would generate and output
while ($nid = db_result($resultSet)) {

  $node = node_load($nid, NULL, TRUE);

  $placeholders = pathauto_get_placeholders('node', $node);
  $alias = pathauto_create_alias('node', 'insert', $placeholders, "node/$node->nid", $node->type, $node->language);

  // $node->path may be unset when the node has no alias, so fall back to an empty string
  $currentPath = isset($node->path) ? $node->path : '';

  // a simple boolean to indicate if the paths match, to make sorting/filtering of the results easy
  $aliasMatch = ($currentPath == $alias ? 1 : 0);

  // strip line breaks, tabs, and pipes from the titles because they make a mess when we try to
  // open the output in Excel, then convert multiple spaces into a single space
  // (preg_replace is used because ereg_replace is deprecated as of PHP 5.3)
  $cleanTitle = str_replace(array("\r\n", "\r", "\n", "\t", "|"), " ", $node->title);
  $cleanTitle = preg_replace('/ {2,}/', ' ', $cleanTitle);

  echo "$node->nid|$node->type|$cleanTitle|$currentPath|$alias|$aliasMatch" . PHP_EOL;
}
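
The title cleanup is the one piece of the script that can be tried without a Drupal bootstrap. Here is a minimal sketch of that step on its own. It is an illustration rather than part of the original script: the function name clean_title_for_export() and the sample title are made up, and it uses preg_replace() in place of ereg_replace(), which was deprecated in PHP 5.3 and removed in PHP 7.

```php
<?php
// Hypothetical helper; the name is illustrative and not part of Drupal or Pathauto.
// Replaces line breaks, tabs, and pipes with spaces, then collapses runs of
// spaces so each title stays in one field of the pipe-delimited output.
function clean_title_for_export($title) {
  $title = str_replace(array("\r\n", "\r", "\n", "\t", "|"), ' ', $title);
  return preg_replace('/ {2,}/', ' ', $title);
}

// Made-up sample title containing a pipe, a tab, a line break, and extra spaces.
echo clean_title_for_export("News |\tUpdate\r\nfor   2012") . PHP_EOL;
// prints "News Update for 2012"
```

Pulling the cleanup into a function like this makes it easy to check against odd titles before running the full report over every published node.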
