So, I recently wrote a quick PHP script that parses the iTunes store RSS and turns it into some webpages that are easily renderable on a mobile device (see last blog post).
I've written quite a number of RSS parsers in PHP before. It's fairly simple: you just make use of the SimpleXML features built into PHP.
However, this time, whenever I opened the feed it returned a whole load of rubbish characters. The feed read fine in Firefox, Safari, etc., but when I opened it with my XML editor (Oxygen) or PHP it was just trash! The root of the problem was in this function:
file_get_contents($rssurl);
(or fopen, if that was used instead),
combined with the fact that the feed was being served with gzip compression (found by looking at the HTTP headers in Firebug). Neither of these functions decompresses the response for you, so the solution is to replace them with:
gzopen($rssurl,'r');
and use the gzread function to get the data!
Then you can carry on with the parsing as you wish...
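For anyone who wants to see the gzopen/gzread combination spelled out, here is a rough sketch of how it might look. The helper name read_gzipped and the 8192-byte chunk size are my own choices for illustration, not anything required by PHP, and reading from a URL this way assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper (the name is an assumption): read a gzip-compressed
// resource in full and return the uncompressed data as a string.
// $source can be a local path or, with allow_url_fopen on, a URL.
function read_gzipped($source)
{
    $handle = gzopen($source, 'r');
    if ($handle === false) {
        return false; // could not open the resource
    }

    $data = '';
    // gzread() returns at most the requested number of uncompressed bytes,
    // so keep reading until the end of the stream is reached.
    while (!gzeof($handle)) {
        $chunk = gzread($handle, 8192);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left to read
        }
        $data .= $chunk;
    }
    gzclose($handle);

    return $data;
}
```

The string returned by read_gzipped($rssurl) can then be handed to simplexml_load_string() and parsed as usual. A handy side effect is that gzopen also reads files that are not gzip-compressed, so the same helper should keep working if the feed is ever served uncompressed.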