I am using the Jsoup library to read a URL. This URL has text within a few <script> tags. Is it possible for me to obtain the text within each <script> tag? Please note that I am not asking to parse a JavaScript file, as I am already aware that Jsoup does not allow that. The actual source code of the URL has text within a script tag, and that is what I need.
doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
Element div = doc.select("script").first();
for (Element element : div.children()) {
System.out.println(element.toString());
}
This is what one of the script tags looks like in the source code:
<script type="text/javascript">
(function() {
...
})();
</script>
Alternatively, you could use the Element#html() method, which returns the inner HTML of an element.
Since Jsoup 1.11.1: Use the efficient Element#selectFirst() method to find the script element.
Document doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
Element scriptElement = doc.selectFirst("script");
// Don't forget to check scriptElement is not null...
String jsCode = scriptElement.html();
Up to Jsoup 1.10.3: Combine the Element#select() and Elements#first() calls to find the script element.
Document doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
Element scriptElement = doc.select("script").first();
// Don't forget to check scriptElement is not null...
String jsCode = scriptElement.html();
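Since the question asks for the text of each <script> tag rather than only the first one, the same idea can be extended to a loop. The following is only a minimal sketch: the class name ScriptExtractor, the helper extractScripts, and the sample HTML string are made up for illustration, and it parses an in-memory string with Jsoup.parse() instead of fetching a URL so that it runs without network access. It assumes Jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ScriptExtractor {

    // Hypothetical helper: collects the inline source of every <script>
    // element in the given HTML, skipping scripts that have no inline code
    // (for example, ones that only reference an external file via src).
    static List<String> extractScripts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> scripts = new ArrayList<>();
        for (Element script : doc.select("script")) {
            String code = script.html();
            if (!code.isEmpty()) {
                scripts.add(code);
            }
        }
        return scripts;
    }

    public static void main(String[] args) {
        // Made-up sample page with two inline scripts and one external script.
        String html = "<html><head>"
                + "<script src=\"lib.js\"></script>"
                + "<script>var a = 1;</script>"
                + "</head><body>"
                + "<script type=\"text/javascript\">(function() { var b = 2; })();</script>"
                + "</body></html>";

        for (String code : extractScripts(html)) {
            System.out.println(code);
        }
    }
}
```

To use this against a live page, the Document returned by Jsoup.connect(...).get() could be passed through the same loop in place of the Jsoup.parse() call.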