Python HTTPS Proxy Tunnelling

thetree · Jun 14, 2014 · Viewed 8.1k times

I'm trying to write an HTTP proxy in Python. So far everything except HTTPS works, so the next step is to implement the CONNECT method.

I'm slightly confused about the chain of events that needs to occur during HTTPS tunnelling. As I understand it, connecting to Google should look like this:

Browser -> Proxy

CONNECT www.google.co.uk:443 HTTP/1.1\r\n\r\n

Then the proxy should establish a secure connection to google.co.uk, and confirm it by sending:

Proxy -> Browser

HTTP/1.1 200 Connection established\r\n\r\n
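As a rough illustration of the first step in this exchange, the sketch below pulls the target host and port out of a CONNECT request line. The function name parse_connect_request is hypothetical and not part of the proxy code in this post; it assumes the request line has already been read in full and ignores any headers that follow it.

```python
def parse_connect_request(raw: bytes):
    """Parse the first line of a CONNECT request into (host, port).

    Hypothetical helper: assumes `raw` starts with a complete request line
    such as b"CONNECT www.google.co.uk:443 HTTP/1.1\r\n".
    """
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = request_line.split(" ")
    if method != "CONNECT":
        raise ValueError("not a CONNECT request: %r" % method)
    # Split on the last colon so the port is separated from the host part
    host, _, port = target.rpartition(":")
    return host, int(port)


host, port = parse_connect_request(b"CONNECT www.google.co.uk:443 HTTP/1.1\r\n\r\n")
```

The resulting pair is what would be passed to connect() on the proxy's outgoing socket.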

At this point I'd expect the browser to go ahead with whatever it was going to do in the first place. Instead, I either get nothing or get a string of bytes that I can't decode(). I've been reading everything I can find about SSL tunnelling, and I think I'm supposed to forward all bytes from the browser to the server and vice versa. However, when I do this, I get:

HTTP/1.0 400 Bad Request\r\n...\r\n

Once I've sent the 200 code, what should I be doing next?

My code snippet for the connect method:

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

if headers["Method"] == "CONNECT":
    client = ssl.wrap_socket(client)

    try:
        client.connect(( headers["Host"], headers["Port"] ))
        reply = "HTTP/1.0 200 Connection established\r\n"
        reply += "Proxy-agent: Pyx\r\n"
        reply += "\r\n"
        browser.sendall( reply.encode() )
    except socket.error as err:
        print(err)
        break

    while True:
        # Not sure what should happen from here
        pass

Help is much appreciated!

Answer

thetree · Jun 15, 2014

After finding this answer to a related question: HTTPS Proxy Implementation (SSLStream)

I realised that the proxy's initial connection to port 443 of the target server (in this case google.co.uk) should NOT be encrypted. I therefore removed the

client = ssl.wrap_socket(client)

line to continue with a plain-text tunnel rather than SSL. Once the

HTTP/1.1 200 Connection established\r\n\r\n

message is sent, the browser and the end server negotiate their own SSL connection through the proxy, so the proxy doesn't need to do anything related to the actual HTTPS connection. It only relays bytes, which it cannot (and should not try to) decode.
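This also explains the bytes that would not decode() in the question: after the 200 reply, the first thing the browser sends is a TLS handshake record, which is binary data rather than text. A minimal sketch of how one might recognise it, assuming the standard TLS record layout (content type 0x16 for handshake, then a version byte 0x03); the helper name and the sample bytes are made up for illustration:

```python
def looks_like_tls_handshake(data: bytes) -> bool:
    """Return True if data begins like a TLS handshake record (e.g. a ClientHello)."""
    return len(data) >= 3 and data[0] == 0x16 and data[1] == 0x03


# Made-up sample: a TLS record header followed by a few arbitrary binary bytes
first_bytes = bytes([0x16, 0x03, 0x01, 0x02, 0x00, 0x01, 0x00, 0x01, 0xFC])

try:
    decoded = first_bytes.decode()
except UnicodeDecodeError:
    # Expected for binary TLS data: the proxy should forward it untouched
    decoded = None
```

A check like this is only useful for debugging; the tunnel itself never needs to inspect the payload.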

The modified code (includes byte forwarding):

# If we receive a CONNECT request
if headers["Method"] == "CONNECT":
    # Connect to port 443
    try:
        # If successful, send 200 code response
        client.connect(( headers["Host"], headers["Port"] ))
        reply = "HTTP/1.0 200 Connection established\r\n"
        reply += "Proxy-agent: Pyx\r\n"
        reply += "\r\n"
        browser.sendall( reply.encode() )
    except socket.error as err:
        # If the connection could not be established, exit
        # Should properly handle the exit with http error code here
        print(err)
        break

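    # Indiscriminately forward bytes in both directions.
    # select() blocks until a socket is readable, which avoids busy-waiting
    # (requires "import select" at the top of the module).
    peers = {browser: client, client: browser}
    tunnel_open = True
    while tunnel_open:
        readable, _, _ = select.select([browser, client], [], [])
        for sock in readable:
            try:
                data = sock.recv(4096)
                if data:
                    peers[sock].sendall(data)
            except socket.error:
                data = b""
            if not data:
                # An empty read (or an error) means one side closed the tunnel
                tunnel_open = False
                break
    client.close()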
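The comment in the code above notes that a failed connection should be reported with an HTTP error code. One possible way to do that is sketched below. The function name open_tunnel, the timeout value, and the choice of 502 Bad Gateway are assumptions made for this example, not part of the original proxy.

```python
import socket


def open_tunnel(browser, host, port, timeout=10):
    """Connect to (host, port) over plain TCP and report the result to the browser.

    Returns the connected upstream socket, or None if the connection failed.
    Hypothetical helper: `browser` is the already-accepted client socket.
    """
    try:
        upstream = socket.create_connection((host, port), timeout=timeout)
    except OSError as err:
        # Tell the browser the tunnel could not be set up
        body = str(err).encode()
        header = (
            "HTTP/1.1 502 Bad Gateway\r\n"
            "Content-Length: %d\r\n"
            "Connection: close\r\n"
            "\r\n" % len(body)
        ).encode()
        browser.sendall(header + body)
        return None
    # The timeout only applies to connecting; relay with a blocking socket
    upstream.settimeout(None)
    browser.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
    return upstream
```

If the function returns None, the proxy can simply close the browser socket; otherwise the returned socket takes the place of client in the forwarding loop.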

References:

HTTPS Proxy Implementation (SSLStream)

http://tools.ietf.org/html/draft-luotonen-ssl-tunneling-03

http://www.ietf.org/rfc/rfc2817.txt