Monday, April 27, 2009

Tutorial: Separating Notes from Google Reader Shared Items Posts in SweetCron


I recently set up a LifeStream using SweetCron (http://www.michaelakerman.com; it's ugly as sin currently, but coming along). My initial CSS-customizing fervor was dampened when I ran into a problem with my Google Reader feed items.

Here's the deal: when you share an item on Google Reader with a note, it enters your RSS feed looking like this:


Shared by Michael

Here's a note on this item!

And here's the item!



The note and the line "Shared by Michael" (or whatever your name happens to be. Use your imagination) are contained within a <blockquote> tag.

The problem is, every item's content is fed through a function that removes HTML formatting, other than line breaks, so all that is left is:

Shared by Michael

Here's a note on this item!And here's the item!



Since the <blockquote> provided the line break between the note and the actual content, there's no reliable way in the processed item to find the end of the note. The "Shared by" line can still be stripped fairly easily using string matching, but that leaves the note run straight into the start of the content with nothing to separate them.
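To make the problem concrete, here's a minimal sketch using plain PHP's strip_tags(). The sample markup is my reconstruction of the example above, not a real feed dump, and I'm assuming the cleaning step behaves roughly like strip_tags() with <br> allowed.

```php
<?php
// Hypothetical feed content, modeled on the example above.
$raw = "<blockquote>Shared by Michael<br>Here's a note on this item!</blockquote>And here's the item!";

// Strip every tag except <br> (assumed to approximate SweetCron's cleaning).
$cleaned = strip_tags($raw, '<br>');

echo $cleaned, "\n";
// Shared by Michael<br>Here's a note on this item!And here's the item!

// The closing </blockquote> was the only boundary marker, and now it's gone.
var_dump(strpos($cleaned, '</blockquote>')); // bool(false)
```

Once the tags are gone, nothing in the string says where the note stops and the item starts.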

The solution, then, is to get at it before the HTML formatting is stripped off, which will take more than just a plugin. Here's how I did it:



Solution

(As a disclaimer, I don't know PHP formally. A lot of the language I use to describe the code is borrowed from Java or made up)

The first thing we must do is get the note split off before the HTML tags are removed. The cleaning happens in sweetcron.php, which is located, by default, in /system/application/libraries. At about line 70 is a column of function calls to define various array elements and instance variables. It looks like this:


$new->item_data = array();
$new->item_data['title'] = $item->get_title();
$new->item_data['permalink'] = $item->get_permalink();
$new->item_data['content'] = $item->get_content();
$new->item_data['enclosures'] = $item->get_enclosures();
$new->item_data['categories'] = $item->get_categories();
$new->item_data['tags'] = $this->get_tags($new->item_data);
$new->item_data['image'] = $this->get_image($item->get_content());


We're going to add the note to the array by calling a new function (we haven't written it yet). Add the following line at the end of the above list:

$new->item_data['note'] = $this->CI->input->xss_clean(trim(strip_tags($this->get_note($item->get_content()))));


This line defines a new array element 'note' in the item_data array attached to the object $new (which is later built into an $item object). The actual value given to the 'note' element is the result of running $item->get_content() through:

  • get_note()
  • strip_tags()
  • trim() and
  • xss_clean()
These might look confusing now, but they're pretty logical. The get_content() call gets the content of the item (see?). get_note() parses out the note from the content, which then has the HTML tags stripped off (strip_tags()) and the whitespace trimmed off the ends (trim()). The result is then sanitized with xss_clean(), which is a function designed to prevent various scripting attacks.
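Here's a rough sketch of the two inner steps in plain PHP, run on a made-up fragment of the kind get_note() should hand back. I've left out xss_clean() because it belongs to CodeIgniter's input class and isn't available in a standalone script.

```php
<?php
// Hypothetical fragment: everything before the closing </blockquote>.
$fragment = "<blockquote>Shared by Michael<br>\nHere's a note on this item! ";

// Innermost call first: remove the HTML tags.
$noTags = strip_tags($fragment);

// Then trim whitespace off both ends.
$note = trim($noTags);

// In SweetCron the result would then go through $this->CI->input->xss_clean().
var_dump($note);
```

Reading the nested calls from the inside out gives the same order as the list above.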

Anyway, onward! Everything we need is already built into SweetCron except for get_note(). We can put this anywhere in sweetcron.php that is not inside another function. I put mine right after function get_image($html) (line 185, now) if you want to emulate me!

The get_note function should look something like this:

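function get_note($html) {
    // Only items shared with a note contain this blockquote.
    if (stripos($html, '<blockquote>Shared by') !== false) {
      // Split once, at the first closing tag: [note, rest of content].
      $pieces = explode("</blockquote>", $html, 2);
      unset($html);
      if(is_array($pieces) && !empty($pieces)) {
        // Still includes the opening tag and the "Shared by" line.
        return $pieces[0];
      } else {
        return false;
      }
    } else {
      // No note on this item.
      return false;
    }
}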


If you copy-paste this, check the less-than and greater-than signs: I wrote them as HTML entities (&lt; and &gt;) so they would display on the page, and you may need to replace them with the real characters.

The function gives the name $html to the item content. Then, stripos() is used to test $html for the string "<blockquote>Shared by". It's case-insensitive, and returns the position of the match if it finds the string or false if it does not. That position can be 0, which is why the test is !== false rather than a plain truth check. If it does find the string, we enter the if block.

The variable $pieces is created, and the results of explode() are poured into it. This will split $html into at most two pieces at the first </blockquote>. $html is cleared so we don't forget to get rid of it. If $pieces is a non-empty array, the function returns the first piece, which is the note. Yay!
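If you want to try the function outside SweetCron, here's a small standalone sketch. The function body is a condensed copy of the one above, and both sample strings are invented for the test.

```php
<?php
// Condensed copy of get_note() so this script runs on its own.
function get_note($html) {
    if (stripos($html, '<blockquote>Shared by') === false) {
        return false; // no note on this item
    }
    // Limit of 2: split only at the first closing tag.
    $pieces = explode('</blockquote>', $html, 2);
    return $pieces[0];
}

// Hypothetical item contents.
$withNote    = "<blockquote>Shared by Michael<br>Here's a note!</blockquote>And here's the item!";
$withoutNote = "<p>Just an item, shared without a note.</p>";

var_dump(get_note($withNote));    // the text before the first </blockquote>
var_dump(get_note($withoutNote)); // bool(false)
```

Note that the returned piece still carries the opening <blockquote> tag and the "Shared by" line; those get dealt with later.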

Save sweetcron.php and put it back where it belongs. We're done with it.

The next issue is to get the note out of the remaining content. We can do that in a plugin. Navigate to /system/application/plugins and open google_com.php. If that doesn't exist, go ahead and open your favorite text editor and save the blank page as google_com.php. We'll work from there.

Here is the complete code of my plugin:

<?php if (!defined('BASEPATH')) exit('No direct script access allowed');

class Google_com {

    function pre_db($item, $original)
    {
      $original_publisher = $original->get_permalink();
      if(!empty($item->item_data['note'])){
        $lookin = $item->item_content;
        $find = $item->item_data['note'];
        $pos = strpos($lookin, $find);

        if ($pos !== false){
          $item->item_content = str_replace($find, '', $lookin);
        }
        //Set length of "Shared by Name" thing + 1 as the final value in the next line.
        $item->item_data['note'] = substr($item->item_data['note'],18);
      }
      return $item;
    }

    function pre_display($item)
    {
      return $item;
    }


}
?>



Again, watch the less-than and greater-than signs.

I'm going to skip the beginning, because it's boilerplate plugin stuff.

If the item_data['note'] value is not empty, the content is searched for the note. If the note is found, the note in the content is removed by str_replace().
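The search-and-remove step can be sketched on its own like this. The two strings are hypothetical stand-ins for $item->item_content and $item->item_data['note'], and I'm assuming the note appears in the content character for character.

```php
<?php
// Hypothetical stand-ins for the item's content and its extracted note.
$content = "Shared by Michael Here's a note on this item!And here's the item!";
$note    = "Shared by Michael Here's a note on this item!";

// Only touch the content if the note is really in there.
if (strpos($content, $note) !== false) {
    $content = str_replace($note, '', $content);
}

echo $content, "\n"; // And here's the item!
```

One thing to keep in mind: str_replace() removes every occurrence of the note, not just the first, which only matters if the item happens to repeat the note text.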

The "Shared by" line is then removed from the note. This is something you'll have to edit.



Important!

Write or type out the phrase Shared by Name, where Name is what Google displays when you share something, e.g. Shared by Michael for me. Count the number of characters, including spaces (17 for my phrase). Add one to this value.

Replace the number 18 in the substr() arguments with the value you just calculated. This tells substr() how many characters to skip at the start of the note. For some reason there are two phantom characters in the "Shared by" line that I can't identify. Adding 1 to the number of characters will take care of that.
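If you'd rather not count by hand, here's a sketch that lets PHP do the counting. The variable names and the sample note (with a single newline after the name) are my own assumptions for illustration.

```php
<?php
$prefix = 'Shared by Michael';   // replace with your own "Shared by Name" phrase
$offset = strlen($prefix) + 1;   // 17 + 1 = 18, the value used in the plugin

// Hypothetical note with one separator character after the prefix.
$note = "Shared by Michael\nHere's a note on this item!";

echo $offset, "\n";                // 18
echo substr($note, $offset), "\n"; // Here's a note on this item!
```

strlen() and substr() both count bytes, so the two stay consistent even if your name contains non-ASCII characters.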


Save the file, and put it in the /system/application/plugins directory. The note is removed and separated!

Manipulating the note can be done in the _activity_feed.php file of your theme in the same way as item_content or item_title. Just use <?php echo $item->item_data['note']?> to print the note.
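As a rough sketch, the theme code might only print the note when there is one. The stdClass object below is a stand-in for the $item SweetCron passes to the theme, and the "note" CSS class is a name I made up.

```php
<?php
// Stand-in for the $item object the theme receives (hypothetical data).
$item = new stdClass();
$item->item_data = array('note' => "Here's a note on this item!");

// Build the markup only when a note exists; get_note() leaves it empty otherwise.
$markup = '';
if (!empty($item->item_data['note'])) {
    $markup = '<p class="note">' . $item->item_data['note'] . '</p>';
}

echo $markup, "\n";
```

Items without a note then produce no empty paragraph in the feed.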



Step-by-step Recap
  1. Add the note element to the list of array elements in /system/application/libraries/sweetcron.php.
  2. Add the get_note() function to sweetcron.php.
  3. Save sweetcron.php.
  4. In /system/application/plugins, create or edit google_com.php to include the plugin code above.
  5. Edit the substr() offset in google_com.php as per the Important! note above.
  6. Save google_com.php.
  7. Call your note in the _activity_feed.php file for your theme using <?php echo $item->item_data['note']?> in google.com items.


If you're having trouble, you can try simply downloading the modified files from my LifeStream. You'll still have to edit google_com.php to reflect the length of your name.

By my hand,
~Michael Akerman
