A question came up today on the Professional PHP Developer’s Google group about using SimpleXML to read feeds. From time to time developers need to consume web service feeds, like RSS feeds, and use them in some capacity. Whether the goal is display or aggregation, the basic concept is the same: A) get the feed source; B) parse it; and C) do what you need to with it.
The SimpleXML PHP extension provides a very easy-to-use way to do all of this.
First things first
The first thing we are going to need is a feed source. In today’s question, that source is the BBC News. The feed can be found at http://newsapi.bbc.co.uk/feeds/search/news+sport/test and a quick visit in your web browser should show you the contents of the feed we will be parsing.
But what fun would it be to just look at another application’s rendering of our feed? We should have a look using something we build, right?
Let’s get an idea of how the feed is structured first.
QUICK NOTE HERE: If you are not yet familiar with Object Oriented Programming, this may seem a bit odd to you. Since this is a beginner web site, I am not going to go into the concepts of OOP here. Just know that some of what we are going to be doing is OOP related. Don’t get scared, I’ll be right by your side.
Feed me
Ok, let’s get our hands dirty on some code, shall we? Just to get a feel for what we are going to be working with, let’s have a look at the feed source as a SimpleXML object by taking the following code and running it:
<?php
/**
 * Set the content type header to be plain jane text so we don't have to riddle
 * our output with "pre" tags
 */
header('Content-Type: text/plain');

/**
 * Load up our source feed as a SimpleXML object
 */
$xml = simplexml_load_file('http://newsapi.bbc.co.uk/feeds/search/news+sport/test');

/**
 * Let's look at it now, so we can see the structure of the feed
 */
print_r($xml);
I have clipped it for length, but the basic output as I ran it looks like this (feeds are time based, so your output will more than likely be different from mine):
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [version] => 2.0
        )
    [channel] => SimpleXMLElement Object
        (
            [ttl] => 15
            [title] => BBC News and Sport Search: test
            [link] => http://newsapi.bbc.co.uk/feeds/search/news+sport/test
            [description] => BBC News and Sport Search: test
            [language] => en-gb
            [lastBuildDate] => Sun, 10 Jan 2010 17:26:19 GMT
            [item] => Array
                (
                    [0] => SimpleXMLElement Object
                        (
                            [title] => Battle to beat freeze continues
                            [link] => http://news.bbc.co.uk/go/rss/news/int/search/news%2Bsport/test/-/2/hi/uk_news/8450690.stm
                            [guid] => http://news.bbc.co.uk/2/hi/uk_news/8450690.stm
                            [pubDate] => Sun, 10 Jan 2010 17:18:24 GMT
                            [description] => The government pledges to do all it can to keep roads and schools open as the severe wintry weather eases its grip on the UK.
                        )
                    [1] => SimpleXMLElement Object
                        (
                            [title] => Gers sweat on key duo's fitness
                            [link] => http://news.bbc.co.uk/go/rss/news/int/search/news%2Bsport/test/-/sport2/hi/football/teams/r/rangers/8450831.stm
                            [guid] => http://news.bbc.co.uk/sport2/hi/football/teams/r/rangers/8450831.stm
                            [pubDate] => Sun, 10 Jan 2010 16:47:04 GMT
                            [description] => Rangers assistant manager Ally McCoist is hoping Steven Davis and Kris Boyd will be able to shrug off injuries sustained in the 3-3 draw at Hamilton.
                        )
                )
        )
)
Ok, settle down. Let’s not freak out over all that data. Looking at the structure we can see that the root element has a property called “channel” that the feed data lives in. The “channel” property has a collection of properties but the ones we are going to be most interested in are “title”, “lastBuildDate” and “item”.
Looking at these properties we can guess fairly well what they do. Title is the feed title, lastBuildDate is the last time the feed was built and item is going to be the collection of feed entries.
The item collection likewise has a collection of properties in each item, and the properties we are going to be most interested in are “title”, “link”, “pubDate” and “description”. Again, looking at these properties you can intuitively guess what each one does: title is the item title, link is the URL of the item, pubDate is the date the item was published and description is the text of the item.
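If you want to poke at those properties without hitting the live feed, here is a minimal sketch. It assumes a tiny hand-written XML string of my own, shaped like the feed above (the values are copied from the clipped output, with the item links shortened to their guid URLs), and loads it with simplexml_load_string() instead of simplexml_load_file(). The variable names are just for illustration.

```php
<?php
// Hand-written sample shaped like the BBC feed (an assumption, not the live feed)
$source = <<<XML
<rss version="2.0">
  <channel>
    <title>BBC News and Sport Search: test</title>
    <lastBuildDate>Sun, 10 Jan 2010 17:26:19 GMT</lastBuildDate>
    <item>
      <title>Battle to beat freeze continues</title>
      <link>http://news.bbc.co.uk/2/hi/uk_news/8450690.stm</link>
      <pubDate>Sun, 10 Jan 2010 17:18:24 GMT</pubDate>
    </item>
    <item>
      <title>Gers sweat on key duo's fitness</title>
      <link>http://news.bbc.co.uk/sport2/hi/football/teams/r/rangers/8450831.stm</link>
      <pubDate>Sun, 10 Jan 2010 16:47:04 GMT</pubDate>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($source);

// Every property is itself a SimpleXMLElement, so cast to string to get plain text
$feedTitle  = (string) $xml->channel->title;
$firstTitle = (string) $xml->channel->item[0]->title;

// Repeated elements such as "item" can be counted and indexed like a list
$itemCount = count($xml->channel->item);

// Attributes (the "@attributes" entry in print_r) are read with array syntax
$version = (string) $xml['version'];

echo "Feed:       $feedTitle\n";
echo "RSS:        $version\n";
echo "Items:      $itemCount\n";
echo "First item: $firstTitle\n";
```

The string casts matter mostly when you compare values or hand them to other functions; echo does the conversion for you.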
Putting it all together
Knowing all of this we can now put together a very simple page that will consume this feed and render it. To do that, run this script in your browser:
<?php
/**
 * Get the data source feed as a SimpleXML object
 */
$data = simplexml_load_file('http://newsapi.bbc.co.uk/feeds/search/news+sport/test');
?>
<html>
<head>
<title>BBC News Sports Feed</title>
<style type="text/css">
.feed-item {
    border: solid 1px #008;
    margin: 1em 0;
}
.row {
    background-color: #fff;
}
.row-on {
    background-color: #dedede;
}
</style>
</head>
<body>
<?php /* Make a title for this page taken from the feed title */ ?>
<h1><?php echo htmlspecialchars($data->channel->title) ?></h1>
<?php /* For user experience, let them know the last time this feed updated */ ?>
<p><small>Last updated <?php echo htmlspecialchars($data->channel->lastBuildDate) ?></small></p>
<?php
/**
 * Now loop through the channel items to get the item information. Because
 * SimpleXMLElement can be traversed with foreach (and counted with count())
 * we can loop over the child elements of just about any SimpleXML element.
 *
 * Feed content is passed through htmlspecialchars() so stray markup or
 * ampersands in the feed cannot break our page.
 *
 * For styling, we are going to alternate row color
 */
// Set the row coloring flag
$color = true;

// Loop the feed items
foreach ($data->channel->item as $item): ?>
<?php /* Handle row color switching */ ?>
<div class="feed-item row<?php if ($color): ?>-on<?php endif; ?>">
<?php /* Make a title for this item, linking back to its original source */ ?>
<h2><a href="<?php echo htmlspecialchars($item->link) ?>"><?php echo htmlspecialchars($item->title) ?></a></h2>
<?php /* Show the body of the item with the publication date */ ?>
<p><?php echo htmlspecialchars($item->description) ?></p>
<p><strong><?php echo htmlspecialchars($item->pubDate) ?></strong></p>
</div>
<?php /* Toggle the row colorizer for the next item */ ?>
<?php
$color = !$color;
endforeach;
?>
</body>
</html>
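The dates in the feed come through as plain strings like "Sun, 10 Jan 2010 17:26:19 GMT". If you would rather show them in a friendlier format, one option is PHP's built-in DateTime class. This is only a sketch: the date string is hard-coded from the output earlier in this post, and the display format is my own choice, not anything the feed asks for.

```php
<?php
// Date string in the format the feed uses (copied from the clipped output above)
$lastBuildDate = 'Sun, 10 Jan 2010 17:26:19 GMT';

// DateTime can parse this style of date without any extra hints
$date = new DateTime($lastBuildDate);

// Pick whatever display format suits your page
$formatted = $date->format('j F Y, H:i');

echo "Last updated " . $formatted . "\n";
```

In the page script you would pass in the feed value instead, cast to a string first, for example new DateTime((string) $data->channel->lastBuildDate).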
And there you have it, a nice little parsed output of an RSS feed using SimpleXML. See, that wasn’t so bad, was it?
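One caveat before you go: both scripts assume the feed loads cleanly. If the feed is unreachable or the XML is malformed, simplexml_load_file() returns false instead of an object, and the rest of the page falls over. Here is a minimal sketch of one way to guard against that. To keep it runnable without a network connection, it assumes a deliberately broken XML string (a made-up sample, not the BBC feed) and uses simplexml_load_string(), which fails the same way.

```php
<?php
// Collect libxml parse errors ourselves instead of letting PHP emit warnings
libxml_use_internal_errors(true);

// Deliberately broken XML (hypothetical sample): the title tag is never closed
$source = '<rss version="2.0"><channel><title>Broken feed</channel></rss>';

$xml = simplexml_load_string($source);

if ($xml === false) {
    // Gather whatever libxml reported so we can show or log it
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();

    echo "Sorry, the feed could not be read:\n";
    echo implode("\n", $messages) . "\n";
} else {
    // Safe to use $xml->channel and friends from here on
    echo "Loaded feed: " . $xml->channel->title . "\n";
}
```

The same check works with simplexml_load_file(); just test its return value against false before touching any properties.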