How to parse an RSS feed with XmlPullParser?

Rox picture Rox · Jul 2, 2013 · Viewed 7.9k times · Source

I would like to parse a RSS feed. My question is how I can parse all tags between the <item>and </item> tags.

Given this very simple XML:

<?xml version="1.0" ?>
<rss version="2.0">
<channel>
  <title>MyRSSPage</title>
  <link>http://www.example.com</link>
  <item>
  <link>www.example.com/example1</link>
  <title>Example title 1</title>
  </item>
  <item>
  <link>www.example.com/example2</link>
  <title>Example title 2</title>
  </item>
</channel>
</rss>

I would like to parse just the stuff between the <item>...</item> tags.

            List<RssMessage> messages = new ArrayList<RssMessage>();

            // parser is a XmlPullParser instance
            while(parser.next() != XmlPullParser.END_DOCUMENT) {
                if (parser.getEventType() != XmlPullParser.START_TAG) {
                    continue;
                }
            String name = parser.getName();
            // START OF HEADER
            if(name.equals("title")) {
                title = parser.nextText();
            }
            else if(name.equals("link")) {
                link = parser.nextText();
            }
            else if(name.equals("description")) {
                description = parser.nextText();
            }
            else if(name.equals("language")) {
                language = parser.nextText();
            }
            else if(name.equals("copyright")) {
                copyright = parser.nextText();
            }
            else if(name.equals("pubDate")) {
                pubdate = parser.nextText();
            }
            // END OF HEADER

            else if(name.equals("item")) {
                RssMessage rssMessage = processItem(parser);
                messages.add(rssMessage);
            }
        }

In the below method I would like to just parse the tags within the <item>...</item>tags. How do I construct a loop that just goes through the item between <item> and </item>?

EDIT
This is almost working. But sometimes not all elements are initiated even if the corresponding element in the RSS xml DO exist! Is something wrong with the below code?

private RssMessage processItem(XmlPullParser parser) throws IOException, XmlPullParserException {
        RssMessage rssMessage = new RssMessage();
    parser.require(XmlPullParser.START_TAG, ns, "item");
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            continue;
        }
        String name = parser.getName();
        if(name.equals("link")) {
            rssMessage.setLink(parser.nextText());
        }
        else if(name.equals("guid")) {
            rssMessage.setGuid(parser.nextText());
        }
        else if(name.equals("category")) {
            rssMessage.setCategory(parser.nextText());
        }
        else if(name.equals("title")) {
            rssMessage.setTitle(parser.nextText());
        }
        else if(name.equals("pubDate")) {
            rssMessage.setPubDate(parser.nextText());
        }
    }
    return rssMessage;
    }

Answer

Raghunandan picture Raghunandan · Jul 2, 2013

Try the below.

try {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(false);
    XmlPullParser xpp = factory.newPullParser();
    xpp.setInput(url.openConnection().getInputStream(), "UTF_8"); 
    //xpp.setInput(getInputStream(url), "UTF-8");

    boolean insideItem = false;

    // Returns the type of current event: START_TAG, END_TAG, etc..
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.START_TAG) {

            if (xpp.getName().equalsIgnoreCase("item")) {
                insideItem = true;
            } 
            else if(xpp.getName().equalsIgnoreCase("title")) 
            {

            }
        }
        eventType = xpp.next(); //move to next element
    }

} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (XmlPullParserException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

Edit:

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(false);
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(open,null);
// xpp.setInput(getInputStream(url), "UTF-8");

boolean insideItem = false;

// Returns the type of current event: START_TAG, END_TAG, etc..
int eventType = xpp.getEventType();
while (eventType != XmlPullParser.END_DOCUMENT) {
    if (eventType == XmlPullParser.START_TAG) {

        if (xpp.getName().equalsIgnoreCase("item")) {
            insideItem = true;
        } else if (xpp.getName().equalsIgnoreCase("title")) {
            if (insideItem)
                Log.i("....",xpp.nextText()); // extract the headline
        } else if (xpp.getName().equalsIgnoreCase("link")) {
            if (insideItem)
                Log.i("....",xpp.nextText());  // extract the link of article
        }
    } else if (eventType == XmlPullParser.END_TAG && xpp.getName().equalsIgnoreCase("item")) {
        insideItem = false;
    }

    eventType = xpp.next(); // move to next element
}

Output

www.example.com/example1
Example title 1
www.example.com/example2
Example title 2