PHP: Atom Feed Reader: Source Code

Based on the RSS Feed Reader, this is a similar class designed to parse Atom feeds and display them on a webpage as HTML. It's not been tested on all versions of the Atom format, but should be easy enough to customise.

Embedding an Atom Feed as HTML

Again, it's a simple class with a constructor and two public functions: getOutput returns an HTML-formatted version of the Atom feed, while getRawOutput returns all the attributes in a single multi-level array.

<?PHP
include "class.myatomparser.php";
# where is the feed located?
$url = "http://www.example.net/atom.xml";
# create object to hold data and display output
$atom_parser = new myAtomParser($url);
$output = $atom_parser->getOutput(); # returns string containing HTML
echo $output;
?>
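The same example can be adapted to show only the first few entries, or to match a page that isn't served as UTF-8, because getOutput() is declared in the class as getOutput($limit=false, $output_encoding='UTF-8'). The sketch below wraps that call in a small helper. The helper name render_atom_feed and its default values are made up for illustration and are not part of the class.

```php
<?php
# Hypothetical helper (not part of class.myatomparser.php): parse the feed
# at $url and return at most $limit entries as HTML encoded as $encoding.
function render_atom_feed($url, $limit = 5, $encoding = "ISO-8859-1")
{
  require_once "class.myatomparser.php";
  $atom_parser = new myAtomParser($url);
  # first argument limits the entries, second sets the output encoding
  return $atom_parser->getOutput($limit, $encoding);
}
```

A call such as echo render_atom_feed("http://www.example.net/atom.xml"); would then print the first five entries. Note that display_feed() only applies the limit when it is an integer (it checks is_int()), so pass 5 rather than "5".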
You can limit the number of entries displayed by passing an integer as the first argument to getOutput(). If the encoding of your webpage doesn't match that of the feed you're subscribing to, you can pass the desired encoding (e.g. ISO-8859-1) as a second argument. By default the output of this class will be UTF-8.

Source code of class.myatomparser.php

This class is by no means the be-all and end-all of Atom parsing. It's designed to be simple, functional and easily customisable. Any feedback would be welcome.

File: class.myatomparser.php

<?PHP
# Original PHP code by Chirp Internet: www.chirp.com.au
# Please acknowledge use of this code by including this header.
class myAtomParser
{
  # keeps track of current and preceding elements
  var $tags = array();

  # array containing all feed data
  var $output = array();

  # return value for display functions
  var $retval = "";

  var $encoding = array();
  # constructor for new object
  # (named __construct() because PHP 8 no longer treats a method named
  # after the class as the constructor)
  function __construct($file)
  {
    # instantiate xml-parser and assign event handlers
    $xml_parser = xml_parser_create("");
    xml_set_object($xml_parser, $this);
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "parseData");

    # open file for reading and send data to xml-parser
    $fp = @fopen($file, "r") or die("<b>myAtomParser Error:</b> Could not open URL $file for input");
    while($data = fread($fp, 4096)) {
      xml_parse($xml_parser, $data, feof($fp)) or die(
        sprintf("myAtomParser: Error <b>%s</b> at line <b>%d</b><br>",
          xml_error_string(xml_get_error_code($xml_parser)),
          xml_get_current_line_number($xml_parser))
      );
    }
    fclose($fp);

    # dismiss xml parser
    xml_parser_free($xml_parser);
  }
  function startElement($parser, $tagname, $attrs)
  {
    if($this->encoding) {
      # content is encoded - so keep elements intact
      $tmpdata = "<$tagname";
      if($attrs) foreach($attrs as $key => $val) $tmpdata .= " $key=\"$val\"";
      $tmpdata .= ">";
      $this->parseData($parser, $tmpdata);
    } else {
      # !empty() avoids warnings when an attribute is not present
      if(!empty($attrs['HREF']) && !empty($attrs['REL']) && $attrs['REL'] == 'alternate') {
        $this->startElement($parser, 'LINK', array());
        $this->parseData($parser, $attrs['HREF']);
        $this->endElement($parser, 'LINK');
      }
      if(!empty($attrs['TYPE'])) $this->encoding[$tagname] = $attrs['TYPE'];

      # check if this element can contain others - list may be edited
      if(preg_match("/^(FEED|ENTRY)$/", $tagname)) {
        if($this->tags) {
          $depth = count($this->tags);
          # each() was removed in PHP 8, so read the parent's key directly
          $last = end($this->tags);
          $parent = is_array($last) ? key($last) : null;
          if($parent) {
            if(!isset($this->tags[$depth-1][$parent][$tagname])) $this->tags[$depth-1][$parent][$tagname] = 0;
            $this->tags[$depth-1][$parent][$tagname]++;
          }
        }
        array_push($this->tags, array($tagname => array()));
      } else {
        # add tag to tags array
        array_push($this->tags, $tagname);
      }
    }
  }
  function endElement($parser, $tagname)
  {
    # remove tag from tags array
    if($this->encoding) {
      if(isset($this->encoding[$tagname])) {
        unset($this->encoding[$tagname]);
        array_pop($this->tags);
      } else {
        # anchored so that only BR and IMG are skipped, not tags such as ABBR
        if(!preg_match("/^(BR|IMG)$/", $tagname)) $this->parseData($parser, "</$tagname>");
      }
    } else {
      array_pop($this->tags);
    }
  }
  function parseData($parser, $data)
  {
    # return if data contains no text
    if(!trim($data)) return;

    # build the array path for the current element, e.g. $this->output["FEED"]["ENTRY"][0]["TITLE"]
    $evalcode = "\$this->output";
    foreach($this->tags as $tag) {
      if(is_array($tag)) {
        # each() was removed in PHP 8; take the single key/value pair directly
        $indexes = reset($tag);
        $tagname = key($tag);
        $evalcode .= "[\"$tagname\"]";
        if(!empty(${$tagname})) $evalcode .= "[" . (${$tagname} - 1) . "]";
        if($indexes) extract($indexes);
      } else {
        if(preg_match("/^([A-Z]+):([A-Z]+)$/", $tag, $matches)) {
          $evalcode .= "[\"$matches[1]\"][\"$matches[2]\"]";
        } else {
          $evalcode .= "[\"$tag\"]";
        }
      }
    }

    if(isset($this->encoding['CONTENT']) && $this->encoding['CONTENT'] == "text/plain") {
      $data = "<pre>$data</pre>";
    }

    # append the text to the element addressed by the path
    eval("$evalcode .= '" . addslashes($data) . "';");
  }
  # display a single feed as HTML
  function display_feed($data, $limit)
  {
    # default the optional fields so that missing ones don't trigger warnings
    $TITLE = $LINK = $TAGLINE = $ENTRY = null;
    extract($data);
    if($TITLE) {
      # display feed information
      $this->retval .= "<h1>";
      if($LINK) $this->retval .= "<a href=\"$LINK\" target=\"_blank\">";
      $this->retval .= stripslashes($TITLE);
      if($LINK) $this->retval .= "</a>";
      $this->retval .= "</h1>\n";
      if($TAGLINE) $this->retval .= "<P>" . stripslashes($TAGLINE) . "</P>\n\n";
      $this->retval .= "<div class=\"divider\"><!-- --></div>\n\n";
    }
    if($ENTRY) {
      # display feed entry(s)
      foreach($ENTRY as $item) {
        $this->display_entry($item, "FEED");
        if(is_int($limit) && --$limit <= 0) break;
      }
    }
  }
  # display a single entry as HTML
  function display_entry($data, $parent)
  {
    # default the optional fields so that missing ones don't trigger warnings
    $TITLE = $LINK = $ISSUED = $AUTHOR = $CONTENT = $SUMMARY = null;
    extract($data);
    if(!$TITLE) return;

    $this->retval .= "<p><b>";
    if($LINK) $this->retval .= "<a href=\"$LINK\" target=\"_blank\">";
    $this->retval .= stripslashes($TITLE);
    if($LINK) $this->retval .= "</a>";
    $this->retval .= "</b>";
    if($ISSUED) $this->retval .= " <small>($ISSUED)</small>";
    $this->retval .= "</p>\n";

    if($AUTHOR) {
      $this->retval .= "<P><b>Author:</b> " . stripslashes($AUTHOR['NAME']) . "</P>\n\n";
    }
    if($CONTENT) {
      $this->retval .= "<P>" . stripslashes($CONTENT) . "</P>\n\n";
    } elseif($SUMMARY) {
      $this->retval .= "<P>" . stripslashes($SUMMARY) . "</P>\n\n";
    }
  }
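  function fixEncoding($input, $output_encoding)
  {
    if(!function_exists('mb_detect_encoding')) return $input;

    # getRawOutput() passes an array, so convert each element in turn
    if(is_array($input)) {
      foreach($input as $key => $val) $input[$key] = $this->fixEncoding($val, $output_encoding);
      return $input;
    }

    $encoding = mb_detect_encoding($input);
    switch($encoding) {
      case 'ASCII':
      case $output_encoding:
        return $input;
      case '':
        return mb_convert_encoding($input, $output_encoding);
      default:
        return mb_convert_encoding($input, $output_encoding, $encoding);
    }
  }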
  # display entire feed as HTML
  function getOutput($limit=false, $output_encoding='UTF-8')
  {
    $this->retval = "";
    $start_tag = key($this->output);

    switch($start_tag) {
      case "FEED":
        foreach($this->output as $feed) $this->display_feed($feed, $limit);
        break;
      default:
        die("Error: unrecognized start tag '$start_tag' in getOutput()");
    }

    return $this->fixEncoding($this->retval, $output_encoding);
  }
  # return raw data as array
  function getRawOutput($output_encoding='UTF-8')
  {
    return $this->fixEncoding($this->output, $output_encoding);
  }
}
?>
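The parseData() method above assembles an array path as a string and runs it through eval(). As a sketch only, and not the original author's approach, the same "append text at a nested path" step can be written with references instead. The function name append_at_path and the sample path are made up for illustration.

```php
<?php
# Hypothetical helper: append $data to the string stored at $path inside
# $target, creating intermediate arrays along the way.
function append_at_path(array &$target, array $path, $data)
{
  $leaf = array_pop($path);           # last key holds the text itself
  $node = &$target;
  foreach($path as $key) {
    if(!isset($node[$key]) || !is_array($node[$key])) $node[$key] = array();
    $node = &$node[$key];             # step one level deeper
  }
  if(isset($node[$leaf]) && is_array($node[$leaf])) return;  # ignore stray text around child elements
  if(!isset($node[$leaf])) $node[$leaf] = "";
  $node[$leaf] .= $data;
}

# Two chunks of character data for the same element are concatenated
$output = array();
append_at_path($output, array("FEED", "ENTRY", 0, "TITLE"), "Hello ");
append_at_path($output, array("FEED", "ENTRY", 0, "TITLE"), "world");
# $output["FEED"]["ENTRY"][0]["TITLE"] is now "Hello world"
```

With this approach the data is never passed through addslashes(), so the stripslashes() calls in the display functions should no longer be needed.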
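Fields Supported by Default

This script supports the following attributes (fields) by default but can easily be extended. See the Feed Reader Demonstration for examples of parsed Atom (and RSS) feeds.

Channel (FEED): TITLE, LINK, TAGLINE (the fields read by display_feed())
Item (ENTRY): TITLE, LINK, ISSUED, AUTHOR (NAME), CONTENT, SUMMARY (the fields read by display_entry())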
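To see how fields like these sit in the array returned by getRawOutput(), the sketch below walks a hand-written sample in the layout that parseData() appears to build: a FEED array holding the feed-level fields plus a numerically indexed ENTRY list. The sample values and the helper name entry_titles are made up for illustration. Calling print_r() on the real return value shows the actual keys for a given feed.

```php
<?php
# Sample data only: stands in for $atom_parser->getRawOutput()
$raw = array(
  "FEED" => array(
    "TITLE" => "Example Feed",
    "LINK" => "http://www.example.net/",
    "ENTRY" => array(
      array("TITLE" => "First post", "AUTHOR" => array("NAME" => "A. Writer")),
      array("TITLE" => "Second post"),
    ),
  ),
);

# Hypothetical helper: collect the TITLE of every entry in the feed
function entry_titles($raw)
{
  $titles = array();
  if(empty($raw["FEED"]["ENTRY"])) return $titles;
  foreach($raw["FEED"]["ENTRY"] as $entry) {
    if(isset($entry["TITLE"])) $titles[] = $entry["TITLE"];
  }
  return $titles;
}

print_r(entry_titles($raw));
```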
If you think it's worth adding support for other Atom attributes, please let us know using the Feedback link below.
Feedback and Questions

21 September 2007: Botond Zalai says:
This script is great, but in order to use a blogger atom feed in my page I had to do an encoding conversion.
Thanks for your feedback Botond. You can see that I've now added some code to handle character encoding differences between the feed source and the page where it's displayed.

9 October 2007: trevor says:
This code totally rocks!
That sounds a bit complicated to me, but as long as it works

20 December 2007: Hank says:
Is there a way to limit the output to say 5 or 6 entries? I am not very good at PHP but I know there has got to be a way to do it...
Just replace "getOutput()" with "getOutput(5)" in the example code to limit the output to the first 5 entries.

6 January 2008: Ken says:
You say the function getRawOutput() returns an array. What are the keys in the array?
The best way to find that out would be to use the print_r function to display all the contents of the array. There's also an example on this page.

25 June 2008: Pascal says:
I've tested this PHP class and therefore I'd like to use it in other PHP projects. Therefore I have to know under which license this PHP class is published. Can you please add this information? Thanks!
Hi Pascal, this class is available under an open source licence.

17 July 2008: Kevin Creighton says:
One quick question: Is it possible for me to remove the "Author" field from the output, and if so, how is that accomplished?
Just remove the code block from display_entry() that starts with if($AUTHOR) as that's where the Author is displayed.

18 September 2008: Evan Robinson says:
I couldn't have added the twitter feed nearly as easily without this code. I'm afraid that I butchered the output unmercifully in order to get it into mySQL instead of on the page, but the underlying code worked great and was easy to understand and modify. Thanks for a great resource!

28 February 2009: dave says:
i am happy to have found this. question though. not sure what i am doing wrong. i get back tons of these:
Are you sure your feed contains valid XML? It looks like it might have some HTML that is not properly encoded. You should try validating your feed as a first step.
© Copyright 2009 Chirp Internet
- Page Last Modified: 11 June 2009