Parsing HTML meta tags with the jsoup library

ivange · Jun 2, 2016 · Viewed 10.7k times

I just started exploring the jsoup library, as I will use it for one of my projects. I tried googling but could not find an exact answer. Here is my problem: I have an HTML file with meta tags like the ones below.

<meta content="this is the title value" name="d.title">
<meta content="this is the description value" name="d.description">
<meta content="language is french" name="d.language">

And a Java POJO like so:

public class Example {
    private String title;
    private String description;
    private String language;

    public Example() {}

    // setters and getters go here
} 

Now I want to parse the HTML file, extract the "content" value of the d.title tag and store it in Example.title, extract the "content" value of the d.description tag and store it in Example.description, and so on.

What I have done so far, from reading the jsoup cookbook, is something like:

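// Parse the file from disk (assuming it is UTF-8 encoded and java.io.File is imported);
// Jsoup.parse("test.html") would parse the literal string "test.html" as HTML.
Document doc = Jsoup.parse(new File("test.html"), "UTF-8");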
Elements metaTags = doc.getElementsByTag("meta");

for (Element metaTag : metaTags) {
    String content = metaTag.attr("content");
    String name = metaTag.attr("name");
}

That walks through all the meta tags and gets the values of their "content" and "name" attributes. What I want, though, is the value of the "content" attribute of the tag whose "name" attribute is, say, "d.title", so that I can store it in Example.title.

Update: @P.J.Meisch's answer below actually solves the problem, but it is too much code for my liking (I was trying to avoid doing exactly that). I was thinking it might be possible to do something like

String title = metaTags.getContent("d.title")

where d.title is the value of the "name" attribute. That would reduce the lines of code. I have not found such a method, but maybe that is because I am still new to jsoup, which is why I asked. If no such method exists (it would be nice if it did, since it makes life easier), I will just go with what P.J.Meisch said.

Answer

P.J.Meisch · Jun 2, 2016

OK, here is all the code:

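import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The statements below belong inside a method that handles or declares IOException.
// Parse the file from disk (assuming UTF-8); Jsoup.parse("test.html") would parse
// the literal string "test.html" as HTML instead of reading the file.
Document doc = Jsoup.parse(new File("test.html"), "UTF-8");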
Elements metaTags = doc.getElementsByTag("meta");

Example ex = new Example();

// Copy the "content" of each known meta tag into the matching Example field.
for (Element metaTag : metaTags) {
  String content = metaTag.attr("content");
  String name = metaTag.attr("name");

  if ("d.title".equals(name)) {
    ex.setTitle(content);
  } else if ("d.description".equals(name)) {
    ex.setDescription(content);
  } else if ("d.language".equals(name)) {
    ex.setLanguage(content);
  }
}
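
On the shorter form asked about in the update: as far as I know, Elements has no getContent method, but jsoup's CSS selector support gets close to a one-liner. Below is a minimal sketch, assuming the meta tags look like the ones in the question. The class name MetaLookup and the helper metaContent are hypothetical names made up for this illustration, not part of jsoup, and the HTML is inlined as a string only to keep the example self-contained.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class MetaLookup {

    // Hypothetical helper (not a jsoup method): returns the "content" attribute of the
    // first <meta> tag whose "name" attribute matches, or "" if there is no such tag.
    static String metaContent(Document doc, String name) {
        // meta[name="..."] is a CSS attribute selector; Elements.attr reads the
        // attribute from the first matched element that has it.
        return doc.select("meta[name=\"" + name + "\"]").attr("content");
    }

    public static void main(String[] args) {
        // Inline HTML for the sketch; for a real file, Jsoup.parse(File, String) applies.
        String html = "<html><head>"
                + "<meta content=\"this is the title value\" name=\"d.title\">"
                + "<meta content=\"this is the description value\" name=\"d.description\">"
                + "<meta content=\"language is french\" name=\"d.language\">"
                + "</head><body></body></html>";
        Document doc = Jsoup.parse(html);

        String title = metaContent(doc, "d.title");
        String description = metaContent(doc, "d.description");
        String language = metaContent(doc, "d.language");

        System.out.println(title);        // this is the title value
        System.out.println(description);  // this is the description value
        System.out.println(language);     // language is french
    }
}
```

With a helper like this, each field of Example can be filled in one line, e.g. ex.setTitle(metaContent(doc, "d.title")). The trade-off is one selector query per field instead of a single pass over all the meta tags, which should not matter for a handful of tags.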