In Java and HtmlUnit, how to wait for a resulting page to finish loading and download it as HTML?

MLQ · Jul 5, 2012 · Viewed 23.4k times

HtmlUnit is an awesome Java library that lets you programmatically fill out and submit web forms. I'm currently maintaining a fairly old system written in ASP, and instead of filling out one of its web forms by hand every month as I'm required to, I'm trying to automate the whole task because I keep forgetting about it. The form retrieves the data gathered within a given month. Here's what I've coded so far:

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.*;

// Load the search page
WebClient client = new WebClient();
HtmlPage page = client.getPage("http://urlOfTheWebsite.com/search.aspx");

// Look up the form and its controls by name
HtmlForm form = page.getFormByName("aspnetForm");
HtmlSelect frMonth = form.getSelectByName("ctl00$cphContent$ddlStartMonth");
HtmlSelect frDay = form.getSelectByName("ctl00$cphContent$ddlStartDay");
HtmlSelect frYear = form.getSelectByName("ctl00$cphContent$ddlStartYear");
HtmlSelect toMonth = form.getSelectByName("ctl00$cphContent$ddlEndMonth");
HtmlSelect toDay = form.getSelectByName("ctl00$cphContent$ddlEndDay");
HtmlSelect toYear = form.getSelectByName("ctl00$cphContent$ddlEndYear");
HtmlCheckBoxInput games = form.getInputByName("ctl00$cphContent$chkListLottoGame$0");
HtmlSubmitInput submit = form.getInputByName("ctl00$cphContent$btnSearch");

// Date range: start 1/1/2012, end 1/31/2012
frMonth.setSelectedAttribute("1", true);
frDay.setSelectedAttribute("1", true);
frYear.setSelectedAttribute("2012", true);
toMonth.setSelectedAttribute("1", true);
toDay.setSelectedAttribute("31", true);
toYear.setSelectedAttribute("2012", true);

// Tick the game checkbox and submit the search
games.setChecked(true);
submit.click();

After the click(), I'm supposed to wait for the very same web page to finish reloading because somewhere there is a table that displays the results of my search. Then, when the page is done loading, I need to download it as an HTML file (very much like "Save Page As..." in your favorite browser) because I will scrape out the data to compute their totals, and I've already done that using the Jsoup library.

My questions are:

1. How do I programmatically wait for the web page to finish loading in HtmlUnit?
2. How do I programmatically download the resulting web page as an HTML file?

I've looked into the HtmlUnit docs already and couldn't find a class that'll do what I need.

Answer

UVM · Jul 5, 2012

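Try calling one of these methods after the click:

webClient.waitForBackgroundJavaScript(timeoutMillis) or

webClient.waitForBackgroundJavaScriptStartingBefore(delayMillis)

Both take a time in milliseconds and return the number of background JavaScript jobs that are still pending when the call returns.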
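
If the results table can still be missing when these calls return, a generic fallback is to poll for a condition with a timeout. The sketch below is plain Java and not part of HtmlUnit; `WaitUtil`, `waitUntil`, and its parameters are hypothetical names chosen for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not an HtmlUnit class.
final class WaitUtil {

    private WaitUtil() {
    }

    /**
     * Re-checks the condition every pollMillis until it holds or
     * timeoutMillis has elapsed.
     *
     * @return true if the condition held in time, false on timeout
     *         or if the waiting thread was interrupted
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Subtraction keeps the comparison safe if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

With HtmlUnit, the condition might be something like `() -> page.asXml().contains("<table")`, where the marker string is only an assumption about what the results page contains.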

I think you need to specify the browser version as well. By default it emulates IE. You will find more information in this question: "HTMLUnit doesn't wait for Javascript".
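
For the second question (saving the result as an HTML file), here is a minimal sketch. It assumes you keep the page returned by `click()` and serialize it with `asXml()`; `PageSaver`, `saveHtml`, and the file name `results.html` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not an HtmlUnit class.
final class PageSaver {

    private PageSaver() {
    }

    /**
     * Writes the given markup to the target file as UTF-8, replacing any
     * existing file, and returns the path that was written.
     *
     * Assumed usage with HtmlUnit:
     *   HtmlPage result = submit.click();   // click() returns the resulting page
     *   PageSaver.saveHtml(result.asXml(), Paths.get("results.html"));
     */
    static Path saveHtml(String html, Path target) {
        try {
            return Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException("Could not write " + target, e);
        }
    }
}
```

Note that `asXml()` serializes the current DOM rather than the bytes the server sent. If you need the response body as received, `result.getWebResponse().getContentAsString()` is the usual alternative. Either output can then be handed to Jsoup.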