Logging Into A Website Using C# Programmatically

DGarrett01 picture DGarrett01 · Sep 10, 2014 · Viewed 43.4k times · Source

So, I've been scouring the web trying to learn more about how to log into websites programmatically using C#. I don't want to use a web client. I think I want to use something like HttpWebRequest and HttpWebResponse, but I have no idea how these classes work.

I guess I'm looking for someone to explain how they work and the steps required to successfully log in to, say, WordPress, an email account, or any site that requires that you fill in a form with a username and password.

Here's one of my attempts:

// Declare variables
        string url = textBoxGetSource.Text;
        string username = textBoxUsername.Text;
        string password = PasswordBoxPassword.Password;

        // Values for site login fields - username and password html ID's
        string loginUsernameID = textBoxUsernameID.Text;
        string loginPasswordID = textBoxPasswordID.Text;
        string loginSubmitID = textBoxSubmitID.Text;

        // Connection parameters
        string method = "POST";
        string contentType = @"application/x-www-form-urlencoded";
        string loginString = loginUsernameID + "=" + username + "&" + loginPasswordID + "=" + password + "&" + loginSubmitID;
        CookieContainer cookieJar = new CookieContainer();
        HttpWebRequest request;

        request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookieJar;
        request.Method = method;
        request.ContentType = contentType;
        request.KeepAlive = true;
        using (Stream requestStream = request.GetRequestStream())
        using (StreamWriter writer = new StreamWriter(requestStream))
        {
            writer.Write(loginString, username, password);
        }

        using (var responseStream = request.GetResponse().GetResponseStream())
        using (var reader = new StreamReader(responseStream))
        {
            var result = reader.ReadToEnd();
            Console.WriteLine(result);
            richTextBoxSource.AppendText(result);
        }

        MessageBox.Show("Successfully logged in.");

I don't know if I'm on the right track or not. I end up being returned back to the login screen of whatever site I try. I've downloaded Fiddler and was able to glean a little bit of information about what information is sent to the server, but I feel completely lost. If anyone could shed some light here, I would greatly appreciate it.

Answer

xavier picture xavier · Sep 10, 2014

Logging into websites programatically is difficult and tightly coupled with how the site implements its login procedure. The reason your code isn't working is because you aren't dealing with any of this in your requests/responses.

Let's take fif.com for example. When you type in a username and password, the following post request gets sent:

POST https://fif.com/login?task=user.login HTTP/1.1
Host: fif.com
Connection: keep-alive
Content-Length: 114
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: https://fif.com
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Referer: https://fif.com/login?return=...==
Accept-Encoding: gzip,deflate
Accept-Language: en-US,en;q=0.8
Cookie: 34f8f7f621b2b411508c0fd39b2adbb2=gnsbq7hcm3c02aa4sb11h5c87f171mh3; __utma=175527093.69718440.1410315941.1410315941.1410315941.1; __utmb=175527093.12.10.1410315941; __utmc=175527093; __utmz=175527093.1410315941.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=175527093.|1=RegisteredUsers=Yes=1

username=...&password=...&return=aHR0cHM6Ly9maWYuY29tLw%3D%3D&9a9bd5b68a7a9e5c3b06ccd9b946ebf9=1

Notice the cookies (especially the first, your session token). Notice the cryptic url-encoded return value being sent. If the server notices these are missing, it won't let you login.

HTTP/1.1 400 Bad Request

Or worse, a 200 response of a login page with an error message buried somewhere inside.

But let's just pretend you were able to collect all of those magic values and pass them in an HttpWebRequest object. The site wouldn't know the difference. And it might respond with something like this.

HTTP/1.1 303 See other
Server: nginx
Date: Wed, 10 Sep 2014 02:29:09 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Location: https://fif.com/

Hope you were expecting that. But if you've made it this far, you can now programatically fire off requests to the server with your now validated session token and get the expected HTML back.

GET https://fif.com/ HTTP/1.1
Host: fif.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.103 Safari/537.36
Referer: https://fif.com/login?return=aHR0cHM6Ly9maWYuY29tLw==
Accept-Encoding: gzip,deflate
Accept-Language: en-US,en;q=0.8
Cookie: 34f8f7f621b2b411508c0fd39b2adbb2=gnsbq7hcm3c02aa4sb11h5c87f171mh3; __utma=175527093.69718440.1410315941.1410315941.1410315941.1; __utmb=175527093.12.10.1410315941; __utmc=175527093; __utmz=175527093.1410315941.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=175527093.|1=RegisteredUsers=Yes=1

And this is all for fif.com - this juggling of cookies and tokens and redirects will be completely different for another site. In my experience (with that site in particular), you have three options to get through the login wall.

  1. Write an incredibly complicated and fragile script to dance around the site's procedures
  2. Manually log into the site with your browser, grab the magic values, and plug them into your request objects or
  3. Create a script to automate selenium to do this for you.

Selenium can handle all the juggling, and at the end you can pull the cookies out and fire off your requests normally. Here's an example for fif:

//Run selenium
ChromeDriver cd = new ChromeDriver(@"chromedriver_win32");
cd.Url = @"https://fif.com/login";
cd.Navigate();
IWebElement e = cd.FindElementById("username");
e.SendKeys("...");
e = cd.FindElementById("password");
e.SendKeys("...");
e = cd.FindElementByXPath(@"//*[@id=""main""]/div/div/div[2]/table/tbody/tr/td[1]/div/form/fieldset/table/tbody/tr[6]/td/button");
e.Click();

CookieContainer cc = new CookieContainer();

//Get the cookies
foreach(OpenQA.Selenium.Cookie c in cd.Manage().Cookies.AllCookies)
{
    string name = c.Name;
    string value = c.Value;
    cc.Add(new System.Net.Cookie(name,value,c.Path,c.Domain));
}

//Fire off the request
HttpWebRequest hwr = (HttpWebRequest) HttpWebRequest.Create("https://fif.com/components/com_fif/tools/capacity/values/");
hwr.CookieContainer = cc;
hwr.Method = "POST";
hwr.ContentType = "application/x-www-form-urlencoded";
StreamWriter swr = new StreamWriter(hwr.GetRequestStream());
swr.Write("feeds=35");
swr.Close();

WebResponse wr = hwr.GetResponse();
string s = new System.IO.StreamReader(wr.GetResponseStream()).ReadToEnd();