Syndicating your shared Google Reader items

I’m an avid user of Google Reader. I follow a couple dozen different websites, check it daily (and sometimes quasi-daily), and have a very elaborate system of tags.

One thing that I did a year or two ago was to take my public broadcast feed of my shared items and integrate it into my main website directly. The goal was to have the homepage of my website always have fresh content by linking it to my shared Reader items (I share anywhere from 3-8 items per day, on average).

Originally, my solution was a bit of a hack: I used the Magpie RSS reader to consume the feeds, then did some serious PHP acrobatics to parse out the Google Reader feed. What I failed to understand at the time was the fundamental differences between RSS and Atom feeds (subtle, yet significant). All in all, I think it took somewhere between 100 and 150 lines of code to snatch the feed and parse it out, and it wasn’t perfect, either: frequently the title and source would get mashed together. But for that particular time, it was fine.

The other night, though, I decided to try and fix it up proper. I discovered that the Zend Framework has some native Atom Feed classes and thought this would be a good place to start. (Details, including my PHP source code, after the jump)

Setting Up

My solution requires the Zend Framework; it’s actually just an extension of one of the ZF classes. To implement it, you’ll need to install the ZF, which is fortunately free. So start off by grabbing the Zend Framework Full package (current version is 1.10.7), and unzip / untar it to your server. The “Zend” folder (underneath the “library” folder) must be on your PHP include path somehow. If you do not have the ability to change the include path, you can always just place the “Zend” folder at the top level of your website.
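If you can edit your own scripts but not php.ini, one option is to extend the include path at runtime. This is only a minimal sketch: the directory /var/www/ZendFramework/library is a made-up location, so substitute wherever you actually unpacked the archive.

```php
<?php
// Hypothetical location of the unpacked archive; adjust to your server.
$zf_library = '/var/www/ZendFramework/library';

// Prepend the library directory so that require_once 'Zend/Feed/Atom.php'
// is resolved relative to it.
set_include_path($zf_library . PATH_SEPARATOR . get_include_path());
```

Put this near the top of the page, before any require_once line that pulls in a Zend class.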

Oh yeah, and in case it’s not obvious, you need PHP for this to work, specifically PHP version 5 (the 1.10 series of the Zend Framework asks for PHP 5.2.4 or later, I believe).

Basic Instantiation

Consuming and rendering an Atom feed is super easy:

require_once 'Zend/Feed/Atom.php';

$url = 'http://www.yourwebsite.com/atomfeed/';

$feed = new Zend_Feed_Atom($url);
echo 'The feed contains ' . $feed->count() . ' entries.' . "\n\n";
foreach ($feed as $entry) {
    echo 'Title: ' . $entry->title() . "\n";
    echo 'Summary: ' . $entry->summary() . "\n\n";
}

The key parts of that are the line that creates the Zend_Feed_Atom object and the foreach loop that iterates over it. The Zend developers cleverly implemented the Iterator interface, so the objects are compatible with the foreach operation.

The inside of the foreach loop contains what appear to be method calls to both title() and summary(). However, these are actually pseudo-methods, products of the __call() magic method. Basically, any terminal node may be treated as a method; the class will automatically interpret the call and traverse the DOM structure of the Atom feed accordingly. (So if your Atom feed included a node called “CheezWhiz”, you could access it via $entry->CheezWhiz().)
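To see the idea in isolation, here is a toy sketch of a __call()-based wrapper. It is not Zend’s code: the Toy_Entry class and the sample XML are made up for illustration, and the lookup simply returns the text of the first element with the requested name.

```php
<?php
// Toy illustration only: NOT the Zend Framework implementation.
class Toy_Entry
{
    private $element;

    public function __construct(DOMElement $element)
    {
        $this->element = $element;
    }

    // PHP calls this whenever an undefined method is invoked on the object.
    // $name is the method name exactly as the caller wrote it.
    public function __call($name, $arguments)
    {
        $nodes = $this->element->getElementsByTagName($name);
        return ($nodes->length > 0) ? $nodes->item(0)->nodeValue : null;
    }
}

$doc = new DOMDocument();
$doc->loadXML('<entry><title>Hello</title><CheezWhiz>yes, really</CheezWhiz></entry>');
$entry = new Toy_Entry($doc->documentElement);

echo $entry->title() . "\n";     // Hello
echo $entry->CheezWhiz() . "\n"; // yes, really
```

Neither title() nor CheezWhiz() is declared anywhere in the class; both calls fall through to __call(), which turns the method name into an element lookup.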

Google Reader Requirements

I could just stop there, since the native Atom feed class works fine at consuming the Google feed, but what I was really looking for was convenience. There are a few bits of pre-processing I’d like to do first, such as converting the RFC 3339 formatted dates into Unix timestamps that PHP’s date functions can use, snagging the source, and trimming down the data. I really don’t need the summary itself; I’m only interested in the link data.

Here is my derived class, first, followed by discussion:

require_once 'Zend/Feed/Atom.php';

class Google_Reader_Feed extends Zend_Feed_Atom {
    // Maximum number of entries to hand back
    private $max_items;

    public function __construct($url, $max_items = 10) {
        $this->max_items = $max_items;
        parent::__construct($url);
    }

    // Package the first $max_items entries as an array (or as a JSON string)
    public function serialize($json = FALSE) {
        $output = array();
        $count = 0;
        foreach ($this as $entry) {
            if ($count++ >= $this->max_items) break;
            array_push($output, array(
                "title" => $entry->title(),
                "link" => $entry->link(TRUE),
                "published" => strtotime($entry->published()),
                "source" => $entry->source->title(),
                "source_key" => strtolower(str_replace(" ", "_", $entry->source->title()))
            ));
        }
        return ($json) ? json_encode($output) : $output;
    }

    // Map source_key => source title for the same entries serialize() returns
    public function getSourceList($json = FALSE) {
        $list = array();
        $count = 0;
        foreach ($this as $entry) {
            // Same cutoff as serialize(), so the legend matches the list
            if ($count++ >= $this->max_items) break;
            $list[strtolower(str_replace(" ", "_", $entry->source->title()))] = $entry->source->title();
        }
        return ($json) ? json_encode($list) : $list;
    }
}

That’s it. Pretty simple!

The first thing I did was add a $max_items argument to the constructor — this was merely to put some constraints on how many items are actually rendered; by default I think Google’s feed restricts it to 20 items, but I wanted the ability to modify that on the fly.

The serialize() method simply takes the contents of the Atom feed and packages it together in a nice and neat little unit. It’s pretty straightforward with the exception of one call: $entry->link() must be passed SOMETHING as an argument (TRUE works fine in this case) in order to tell the parent class to use the “alternate” link. For some Atom feeds, this may not be necessary, but Google’s Atom feed stores the link to the original article inside an alternate link node.

The date is converted to a Unix timestamp via the ever-handy strtotime() function.
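As a quick illustration of that conversion, here is a small sketch. The date string is a made-up sample in the RFC 3339 style that Atom feeds use, and the timezone is pinned to UTC only so that the printed result is predictable.

```php
<?php
// Pin the timezone so the formatted output below does not depend on the server.
date_default_timezone_set('UTC');

// Made-up sample in the style of an Atom <published> value.
$published = '2010-08-15T14:30:00Z';

// strtotime() understands this format and returns a Unix timestamp (an integer).
$timestamp = strtotime($published);

// Once it is a timestamp, date() can render it any way you like.
echo date('M d, g:i a', $timestamp) . "\n"; // Aug 15, 2:30 pm
```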

The getSourceList() method is used for a neat feature I thought of — details later, but the requirement was just that I needed an easy way to get a distinct list of all the sources used in this block of feeds.

Implementation

The implementation is quite similar to the implementation of the parent class:

<?php
$url = "http://www.google.com/reader/public/atom/user%2F06502952424319927954%2Fstate%2Fcom.google%2Fbroadcast";
$google_shared = new Google_Reader_Feed($url);
?>

<div id="google_feed">
<ul class="feedlist">

<?php
foreach ($google_shared->serialize() as $entry) {
    // If the current entry is from today, show the time instead of the date.
    // Comparing the full date keeps the same day in an earlier year from matching.
    $date = (date('Y-m-d', $entry['published']) == date('Y-m-d'))
        ? date('g:i a', $entry['published'])
        : date('M d', $entry['published']);

    // Truncate; you may need a different line length here.
    $title = substr($entry['title'], 0, 97);

    printf('<li class="%s"><em>%s</em><a href="%s">%s</a></li>',
        $entry['source_key'],
        $date,
        $entry['link'],
        $title);
}
?>

</ul></div>

That block will show the list of shared Google items, prefixing each with either the time of publication (if it was from today) or the date of publication (if from sometime before today). The “source_key” property is a lowercase version of the “source” property with spaces replaced by underscores, so it forms a single string. More on that in a minute.
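If you would rather keep that time-or-date rule out of the template, it can be pulled into a small function. This is just a sketch: feed_date_label() is a hypothetical name, and the optional $now argument exists only so the rule can be checked against a fixed “today”.

```php
<?php
date_default_timezone_set('UTC');

// Hypothetical helper: time of day for items published today, otherwise the date.
function feed_date_label($published, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    // Compare full calendar dates so the same day in another year does not match.
    return (date('Y-m-d', $published) === date('Y-m-d', $now))
        ? date('g:i a', $published)
        : date('M d', $published);
}

// Fixed "now" for the demonstration.
$now = strtotime('2010-08-15T18:00:00Z');

echo feed_date_label(strtotime('2010-08-15T09:05:00Z'), $now) . "\n"; // 9:05 am
echo feed_date_label(strtotime('2010-08-14T23:50:00Z'), $now) . "\n"; // Aug 14
```

In the loop you would then call feed_date_label($entry['published']) and let $now default to the current time.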

At this point, the list is more or less functional. However, upon viewing the output, it becomes clear that there is no easy way to identify where any of the links are from — on my feed, for example, I frequently share items from Slashdot, BoingBoing, and other places — how would a viewer know which were which? One way would be to simply add additional text — I did it that way at first, but it gets ugly. So I found a better way:

Creating a Legend

The getSourceList() method from the derived class gets an index of feed sources, but only those that are actually represented in the current block of shared items.

What’s really cool about this is that we now effectively have a “legend” that we can use at the footer of the list. Within the list itself, we can set the class of each list item to the source_key (since it’s a single string, it will work), and then define CSS classes for the feeds we share most often to display icons next to those items.
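The reason the legend only lists each source once is that the list is an array keyed by source_key, and assigning to an existing key simply overwrites it. A tiny sketch with made-up titles standing in for the feed’s source titles:

```php
<?php
// Made-up titles standing in for the $entry->source->title() values.
$titles = array('Slashdot', 'Boing Boing', 'Slashdot', 'Boing Boing', 'Slashdot');

$list = array();
foreach ($titles as $title) {
    $key = strtolower(str_replace(' ', '_', $title));
    $list[$key] = $title; // a repeated key overwrites the earlier entry
}

print_r($list);
```

Five shared items from two sources come out as a two-entry array, slashdot and boing_boing, in the order they were first seen.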

Displaying this list is done in a similar fashion to the original list:

<ol class="legend">

<?php
foreach ($google_shared->getSourceList() as $image => $source) {
    printf('<li class="%s" style="font-size: 12px;"> = %s</li>',
        $image,
        $source);
}
?>

</ol>

In advance, you’ll need to prepare some image icons to represent each of your frequently shared sources; I’ve found that 16 x 16 pixels works best. The CSS can simply be styled like this:

ol.legend { margin: 0; padding: 0; }
ol.legend li { padding-left: 25px; }

ul.feedlist li.slashdot,
ol.legend li.slashdot
{ background: url('/images/slashdot-icon.jpg') no-repeat; }

ul.feedlist li.boing_boing,
ol.legend li.boing_boing
{ background: url('/images/bb-icon.jpg') no-repeat; }

Be sure that your CSS classes match what will become the source_key; for some blogs, if the name is particularly long, this may need to be determined experimentally. Also, my original code above doesn’t take into account a blog title that contains non-alphanumeric characters or non-Latin characters, so you may need to apply a regular expression to filter it into a consistent format.
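One possible filter is sketched below. The function name make_source_key() is hypothetical, and the rule it applies (lowercase, then squash anything that is not a-z or 0-9 into a single underscore) is just one reasonable choice, not what the class above currently does.

```php
<?php
// Hypothetical helper: reduce a source title to lowercase ASCII letters,
// digits and underscores so it can be used as a CSS class name.
function make_source_key($title)
{
    $key = strtolower($title);
    // Collapse every run of characters outside a-z and 0-9 into one underscore.
    $key = preg_replace('/[^a-z0-9]+/', '_', $key);
    // Drop underscores left over at either end.
    return trim($key, '_');
}

echo make_source_key('Boing Boing') . "\n";            // boing_boing
echo make_source_key('Ars Technica: Gadgets!') . "\n"; // ars_technica_gadgets
```

For plain titles like “Boing Boing” this gives the same key as the str_replace() version, so existing CSS classes keep working. Two caveats: a title made up entirely of non-Latin characters comes out as an empty string, so you would want a fallback key for that case, and a key that starts with a digit needs a prefix before it is usable as a CSS class selector.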

One thing that I’d like to do with this eventually is to have a set of 10 different anonymous icons to be applied for blogs that appear out of happenstance — occasionally I’ll share an item that was shared with me by one of the people I follow, so I would have no icon normally; but I’d still like to have a legend entry for those.

I’ve got an idea on how to do it, but that’ll have to come later. :)

Cheers, and good luck! Feel free to comment with your thoughts, questions, or help with getting it working.