Recently my friend Dan has been playing with some design ideas for his new site. He had a list of words that could be dragged around like refrigerator magnets. He wasn’t happy that the words were static and that updating them was kind of a pain.
Together we came up with the idea of pulling in the RSS feed from his blog and using the words from the post descriptions. For his design, we wanted 48 words. We didn’t want any punctuation, numbers or HTML entities floating around either.
So I wrote the following script for him.
<?php
/*
  Include MagpieRSS.
  - Same library WordPress uses for feeds.
  - We don't use WordPress's own copy because of the extra code WordPress adds.
*/
require_once('magpierss/rss_fetch.inc');

$url = 'http://danielmiller.wordpress.com/feed'; // URL of the RSS feed we want to process
$rss = fetch_rss($url); // grab the RSS feed
$items = $rss ? $rss->items : array(); // fetch_rss() returns false on failure

$current = 0; // current item in the feed
$words = array(); // word array

// while we have fewer than 48 words and there are still items in the feed, keep processing
while ((count($words) < 48) && ($current < count($items))) {
    $description = $items[$current]['description']; // grab the description
    // remove HTML entities before splitting, otherwise "&amp;" leaves "amp" behind
    $description = preg_replace('/&.{2,5};/', ' ', $description);
    // split on runs of non-word characters (spaces, punctuation, newlines)
    $grab = preg_split('/\W+/', $description);

    // for each word from the current item in the feed, do the following
    foreach ($grab as $word) {
        $word = preg_replace('/[0-9]+/', '', $word); // remove numbers
        $word = strtolower(trim($word)); // lowercase first so "The" and "the" count as one word

        // only add the word if it is 3-10 characters long and is not
        // already in the word list
        if ((strlen($word) > 2) && (strlen($word) < 11) && !in_array($word, $words)) {
            $words[] = $word; // word added
        }
    }
    $current++; // move on to the next item in the feed
}

shuffle($words); // randomize our word list

// the loop above always finishes the item it is working on, so we could have
// more than 48 words (or fewer, if the feed is short)
$row = 1; $col = 0; // used to establish our position during output
$total = min(48, count($words)); // never read past the end of the list
for ($i = 0; $i < $total; $i++) { // get the first 48 words
    // after 6 items, move to the next row and reset the column count
    if ($col > 5) { $col = 0; $row++; }
    // output HTML
    echo '<li class="row'.$row.'">'.$words[$i].'</li>';
    $col++; // increment column count
}
?>
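If you want to see what the filtering does without fetching a feed, here is a minimal sketch that applies the same rules to a hard-coded string, with entities stripped before the split so they don't leave fragments behind. The function name `extract_words` and the sample text are made up for illustration; neither is part of Dan's site or MagpieRSS.

```php
<?php
// Hypothetical helper: collects unique, lowercase, 3-10 character words from a string.
function extract_words($text, array $words = array()) {
    $text = preg_replace('/&.{2,5};/', ' ', $text); // drop HTML entities first
    foreach (preg_split('/\W+/', $text) as $word) { // split on non-word characters
        $word = strtolower(preg_replace('/[0-9]+/', '', $word)); // strip digits, lowercase
        $len = strlen($word);
        if ($len > 2 && $len < 11 && !in_array($word, $words)) {
            $words[] = $word;
        }
    }
    return $words;
}

// Made-up sample description, not taken from the real feed.
$sample = 'Magnets &amp; words: the 48 Magnets don&#8217;t stay put!';
print_r(extract_words($sample));
```

For this sample the list comes out as magnets, words, the, don, stay, put. Note that "don&#8217;t" loses its apostrophe and ends up as "don", which is the price of the simple entity rule.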
December 13th, 2008 at 8:52 pm
[...] Nathan’s post on the website [...]
December 13th, 2008 at 8:54 pm
It’s so cool, thanks for your help!
December 15th, 2008 at 1:01 pm
I’m doing something similar on my site, http://www.iancollins.me, which is all built dynamically from RSS/JSON feeds. The difference with mine is that I’m only pulling out words that start with a capital letter. It basically ends up as a cheap-and-dirty summarization, since it (mostly) pulls out key terms.
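A rough sketch of that capital-letter variant could look like the following. This is only a guess at the idea, not the code running on that site; the function name `capitalized_words` and the sample sentence are hypothetical.

```php
<?php
// Hypothetical helper: keeps only the words that start with a capital letter.
function capitalized_words($text) {
    // \b[A-Z]\w* matches a word boundary, one capital letter, then the rest of the word
    preg_match_all('/\b[A-Z]\w*/', $text, $matches);
    // drop repeats and reindex so the result is a plain list
    return array_values(array_unique($matches[0]));
}

// Made-up sample text for illustration.
$sample = 'Yesterday Dan rebuilt the site with PHP and MagpieRSS feeds. Dan likes it.';
print_r(capitalized_words($sample));
```

For this sample it keeps Yesterday, Dan, PHP and MagpieRSS. The first word of a sentence gets picked up too, which is one reason this kind of filter only mostly returns key terms.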