I need help reading the contents of a webpage. Currently I am using the following code to read them:
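// Accumulate the page in a StringBuilder and close the reader when done.
StringBuilder content = new StringBuilder();
try (BufferedReader in = new BufferedReader(new InputStreamReader(page.openStream()))) {
    String inputLine;
    while ((inputLine = in.readLine()) != null) {
        content.append(inputLine);
    }
}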
However, this method has a problem. Some JSP pages contain Ajax code that dynamically updates the contents of an element with a given CSS class. The JavaScript looks roughly like this, just to give an idea:
if (request.readyState === 4 && request.status === 200)
{
var type = request.getResponseHeader("Content-Type");
$('.update').empty();
$('.update').append(request.responseText); // insert the response into the elements with class "update"
}
As a result, when I read this page with the Java code above, I just get
<div class="update"></div>
although on screen the element has content. However, if I save the page first (using Save As in Firefox), the values that jQuery appended to the element are present in the saved file. Is there a way to read or obtain those values the way Firefox does when it saves the page? I want to read the contents of the entire webpage into a string, with the Ajax-inserted values present.
I have read that this is difficult because the JavaScript is rendered and executed by the browser, so I wanted to know: does Firefox have any APIs that might help? Any suggestions would be appreciated.
You may find the following project useful:
Here is also a very informative blog post from Data Big Bang.
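If running a real browser engine is not an option, one possible workaround is to request the Ajax URL yourself and splice the response into the static HTML. The sketch below is only an illustration under several assumptions: the class name AjaxPageReader and its methods are made up for this example, the Ajax endpoint can be found by reading the page's script, both responses are UTF-8, and the placeholder is exactly the empty div shown in the question. It does not execute any JavaScript, so it only helps when the request is simple enough to reproduce by hand.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class AjaxPageReader {

    // Placeholder from the question: the element the script fills in.
    static final String EMPTY_UPDATE_DIV = "<div class=\"update\"></div>";

    // Reads the raw response body of a URL, keeping line breaks.
    // Assumes the server sends UTF-8.
    static String readAll(URL url) throws IOException {
        StringBuilder content = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    // Puts the Ajax response text inside the empty div,
    // mimicking what the jQuery append() call does in the browser.
    static String fillUpdateDiv(String pageHtml, String ajaxResponse) {
        String filled = "<div class=\"update\">" + ajaxResponse + "</div>";
        return pageHtml.replace(EMPTY_UPDATE_DIV, filled);
    }

    // Fetches the page and the Ajax endpoint separately, then merges them.
    static String readPageWithAjax(URL pageUrl, URL ajaxUrl) throws IOException {
        return fillUpdateDiv(readAll(pageUrl), readAll(ajaxUrl));
    }
}
```

You would call AjaxPageReader.readPageWithAjax with the page URL and whatever URL the script's request object is opened against. If the page has no exact match for the empty div, the HTML is returned unchanged.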