I have seen several threads on StackOverflow concerning this topic, but none of them seems to provide an answer.
I have a button that, when clicked, opens up an invisible web page, navigates to a URL, enters information into a box, presses a button, and then scrapes the screen for information.
The bones of my code in the click handler are basically:
WebBrowser wb = new WebBrowser();
wb.Visibility = System.Windows.Visibility.Hidden;
wb.Navigate("http://somepage.com");
And this is where it gets tricky.
I am looking for a way to ensure that the page is loaded before trying to enter data or scrape the screen. I have seen several threads that talk about Navigated, IsLoaded, LoadCompleted, as well as BackgroundWorker stuff, but I cannot get any of these to work.
Which is the best option to use to determine that the page has fully loaded? How would you get the chosen method to work?
I also cannot get the data from the screen, as WPF does not use the same GetElementByID.
Edit:
Per the comment below, here are the errors I run into:
IsLoaded never returns true:
private void GetData_Click(object sender, RoutedEventArgs e)
{
    int x=0;
    HTMLDocument doc;
    wb = new WebBrowser();
    wb.Visibility = System.Windows.Visibility.Visible;
    wb.Navigate("somesite.com");
    doc = wb.Document as mshtml.HTMLDocument;
    while(!wb.IsLoaded)
    {
        //Wait
    }
    doc.getElementById("txt_One").innerText = "It Worked";
}
This puts it in an infinite loop, as wb does not ever seem to load.
The event 'System.Windows.Controls.WebBrowser.LoadCompleted' can only appear on the left hand side of += or -=
private void GetData_Click(object sender, RoutedEventArgs e)
{
    int x=0;
    HTMLDocument doc;
    wb = new WebBrowser();
    wb.Visibility = System.Windows.Visibility.Visible;
    wb.Navigate("somesite.com");
    doc = wb.Document as mshtml.HTMLDocument;
    wb.LoadCompleted += wb_LoadCompleted;
    doc.getElementById("txt_One").innerText = "It Worked";
}
void wb_LoadCompleted(object sender, NavigationEventArgs e)
{
}
This produces the error:
An unhandled exception of type 'System.NullReferenceException' occurred in {filename}
Additional information: Object reference not set to an instance of an object.
The WebBrowser control has a load event, which you already found: LoadCompleted. It fires when the DOM is fully loaded.
Subscribe to the event and get the document inside the event handler, instead of right after calling Navigate.
//root is a grid element identified in the XAML
public WebBrowser webb;
public MainWindow()
{
    InitializeComponent();
    webb = new WebBrowser();
    webb.Visibility = System.Windows.Visibility.Hidden;
    root.Children.Add(webb);
    webb.LoadCompleted += webb_LoadCompleted;
    webb.Navigate("http://www.google.com");
}
void webb_LoadCompleted(object sender, NavigationEventArgs e)
{
    MessageBox.Show("Completed loading the page");
    mshtml.HTMLDocument doc = webb.Document as mshtml.HTMLDocument;
    mshtml.HTMLInputElement obj = doc.getElementById("gs_taif0") as mshtml.HTMLInputElement;
    mshtml.HTMLFormElement form = doc.forms.item(Type.Missing, 0) as mshtml.HTMLFormElement;
    webb.LoadCompleted -= webb_LoadCompleted; //REMOVE THE OLD EVENT METHOD BINDING
    webb.LoadCompleted += webb_LoadCompleted2; //BIND TO A NEW METHOD FOR THE EVENT
    obj.value = "test search";
    form.submit(); //PERFORM THE POST ON THE FORM OR SEARCH
}
//SECOND EVENT TO FIRE AFTER YOU POST INFORMATION
void webb_LoadCompleted2(object sender, NavigationEventArgs e)
{
    MessageBox.Show("Completed loading the page second time after post");
}
You need to do doc = wb.Document as mshtml.HTMLDocument; inside the LoadCompleted event handler, because the document is not available until the load is complete.
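Applied to the click handler from the question, that could look roughly like the sketch below. It is untested against the real page and makes a few assumptions that are not in the original code: the window's XAML defines a Grid named root (as in the example above) and a button wired to GetData_Click, the project references the mshtml interop assembly, and txt_One is a text input (so it is cast to HTMLInputElement and its value is set, mirroring the pattern above, rather than setting innerText).

```csharp
using System.Windows;
using System.Windows.Controls;
using System.Windows.Navigation;

public partial class MainWindow : Window
{
    private WebBrowser wb;

    public MainWindow()
    {
        InitializeComponent();
    }

    private void GetData_Click(object sender, RoutedEventArgs e)
    {
        // Create the browser once and reuse it on later clicks.
        if (wb == null)
        {
            wb = new WebBrowser();
            wb.Visibility = Visibility.Hidden;
            root.Children.Add(wb); // assumption: "root" is a Grid declared in the XAML
        }

        // Subscribe before navigating so the event cannot be missed.
        wb.LoadCompleted -= wb_LoadCompleted;
        wb.LoadCompleted += wb_LoadCompleted;
        wb.Navigate("http://somepage.com");

        // Return immediately: no waiting loop, and no access to wb.Document here.
    }

    private void wb_LoadCompleted(object sender, NavigationEventArgs e)
    {
        wb.LoadCompleted -= wb_LoadCompleted;

        // The document is only available once the load has completed.
        mshtml.HTMLDocument doc = wb.Document as mshtml.HTMLDocument;
        if (doc == null)
        {
            return;
        }

        // Assumption: txt_One is an <input> element on the target page.
        mshtml.HTMLInputElement box = doc.getElementById("txt_One") as mshtml.HTMLInputElement;
        if (box != null)
        {
            box.value = "It Worked";
        }
    }
}
```

This also explains the two errors in the question. IsLoaded comes from FrameworkElement and reports whether the control has been loaded into the WPF visual tree, not whether the web page has finished loading, and the while loop blocks the UI thread, so nothing can progress while it spins. The NullReferenceException comes from reading wb.Document immediately after Navigate, when there is no document yet.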