I am trying to learn about socket programming as well as the WebSocket protocol. I know that Python WebSocket clients already exist, but I am hoping to build a toy version for my own learning. To do this I have created an extremely simple Tornado WebSocket server that I am running on localhost:8888. All it does is print a message when a client connects.
This is the entire server, and it works (I have tested it with a small JavaScript script in my browser):
import tornado.httpserver
import tornado.websocket
import tornado.ioloop
import tornado.web

class WSHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        print('new connection')
        self.write_message("Hello World")

    def on_message(self, message):
        print('message received %s' % message)

    def on_close(self):
        print('connection closed')

application = tornado.web.Application([
    (r'/ws', WSHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
Once I start up the server, I try to run the following script:
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((socket.gethostbyname('localhost'), 8888))
msg = '''GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13'''.encode('ascii')
print(len(msg))
sent_count = sock.send(msg)
print('sent this many bytes:', sent_count)
recv_value = sock.recv(1)
print('received:', recv_value)
What I am hoping is that the server will send back the response header as specified in the RFC. Instead, sock.recv hangs, which leads me to believe the server isn't acknowledging the initial WebSocket handshake. The handshake itself is taken from the RFC as well. I know that the WebSocket key should be random, but I don't think that would cause the server to ignore the handshake (the key is valid). I think I can figure the rest out once I can initiate the handshake, so I am hoping there is just some misunderstanding on my part about either how WebSockets work or how to send the initial handshake.
1) When you send a message over a socket, you have no idea how many chunks it will be divided into. It may all get sent at once; or the first 3 letters may be sent, then the rest of the message; or the message may be split into 10 pieces.
2) Given 1), how is the server supposed to know when it has received all the chunks sent by the client? For instance, suppose the server receives 1 chunk of the client's message. How does the server know whether that was the whole message or whether there are 9 more chunks coming?
3) I suggest you read this:
http://docs.python.org/2/howto/sockets.html
(Plus the links in the comments)
4) Now, why aren't you using python to create an HTTP server?
python3:
import http.server
import socketserver
PORT = 8000
handler = http.server.SimpleHTTPRequestHandler
httpd = socketserver.TCPServer(("", PORT), handler)
print("serving at port", PORT)
httpd.serve_forever()
python2:
import SimpleHTTPServer
import SocketServer
PORT = 8000
handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), handler)
print "serving at port", PORT
httpd.serve_forever()
SimpleHTTPRequestHandler serves files from the directory the server program is run from and its subdirectories, mapping the request URL onto that directory structure. If you request '/', the server serves the index.html file in that directory if one exists (otherwise it returns a directory listing). Here is an example of a client socket for Python 3 (Python 2 example below):
import socket
import sys

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print('Failed to create socket')
    sys.exit()
print('Socket Created')

# To allow you to immediately reuse the same port after
# killing your server:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

host = 'localhost'
port = 8000
s.connect((host, port))
print('Socket Connected to ' + host + ' on port', port)

# Send some data to server
message = "GET / HTTP/1.1\r\n\r\n"
try:
    # Send the whole string (sendall() handles the looping for you)
    s.sendall(message.encode('utf8'))
except socket.error:
    print('Send failed')
    sys.exit()
print('Message sent successfully')

# Now receive data
data = []
while True:
    chunk = s.recv(4096)  # blocks while waiting for data
    if chunk:
        data.append(chunk)
    else:
        # If recv() returns an empty bytes object, then the other side
        # closed the socket, and no more data will be sent:
        break
# Decode once at the end so a multi-byte character split across
# two chunks cannot break the decoding:
print(b"".join(data).decode("utf8"))
--output:--
Socket Created
Socket Connected to localhost on port 8000
Message sent successfully
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.2.3
Date: Sat, 08 Jun 2013 09:15:18 GMT
Content-type: text/html
Content-Length: 23
Last-Modified: Sat, 08 Jun 2013 08:29:01 GMT

<div>hello world</div>
In Python 3, you have to use byte strings with sockets; otherwise you will get the dreaded:
TypeError: 'str' does not support the buffer interface
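As a small illustration of that point, here is a sketch that uses socket.socketpair() so it runs without any server. The variable names (left, right, error_name) are only for this example, and the exact wording of the TypeError message differs between Python 3 versions.

```python
import socket

# Two already-connected local sockets, so no server is required.
left, right = socket.socketpair()

try:
    # Passing a str is rejected in Python 3.
    left.sendall("GET / HTTP/1.1\r\n\r\n")
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Encoding the str to bytes first works.
left.sendall("GET / HTTP/1.1\r\n\r\n".encode("ascii"))
received = right.recv(4096)      # what comes back is bytes as well
text = received.decode("ascii")  # decode to get a str again

left.close()
right.close()
print(error_name, repr(text))
```

The same applies in the other direction: recv() hands back bytes, so decode them before treating the result as text.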
Here it is in python 2.x:
import socket
import sys

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print 'Failed to create socket'
    sys.exit()
print 'Socket Created'

# To allow you to immediately reuse the same port after
# killing your server:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

host = 'localhost'
port = 8000
s.connect((host, port))
# Use the print statement here: print(a, b) would print a tuple in Python 2
print 'Socket Connected to ' + host + ' on port', port

# Send some data to server
message = "GET / HTTP/1.1\r\n\r\n"
try:
    # Send the whole string (sendall() handles the looping for you)
    s.sendall(message)
except socket.error:
    print 'Send failed'
    sys.exit()
print 'Message sent successfully'

# Now receive data
data = []
while True:
    chunk = s.recv(4096)  # blocks while waiting for data
    if chunk:
        data.append(chunk)
    else:
        # If recv() returns an empty string, then the other side
        # closed the socket, and no more data will be sent:
        break
print "".join(data)
--output:--
Message sent successfully
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.3
Date: Sat, 08 Jun 2013 10:06:04 GMT
Content-type: text/html
Content-Length: 23
Last-Modified: Sat, 08 Jun 2013 08:29:01 GMT

<div>hello world</div>
Note that the request line of the GET request tells the server that HTTP 1.1 will be the protocol, i.e. the rules governing the conversation. As the RFC for HTTP 1.1 describes, the request line and every header line must end with '\r\n', and the headers must be followed by an empty line, so the request head ends with two consecutive '\r\n' sequences. The server keeps reading until it sees that final '\r\n\r\n'. If you delete one of the '\r\n' sequences from the request, the client will hang on recv() because the server is still waiting for the rest of the request. That is what happens with the handshake in the question: the triple-quoted string separates its lines with a bare '\n' and never sends the terminating blank line.
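To make the "keep reading until the terminator arrives" idea concrete, here is a minimal sketch of how a receiver can cope with data arriving in arbitrary chunks. The helper name recv_until and the sample request are made up for this illustration, and socket.socketpair() stands in for a real client and server.

```python
import socket

def recv_until(sock, delimiter=b"\r\n\r\n", bufsize=4096):
    """Keep calling recv() until delimiter has been seen or the peer closes.

    Note: anything the peer sent after the delimiter is returned as well.
    """
    buffer = b""
    while delimiter not in buffer:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed before the delimiter arrived
            break
        buffer += chunk
    return buffer

# Two already-connected local sockets, so no server is required.
left, right = socket.socketpair()

# Send one request in several pieces, as the network might deliver it.
for piece in (b"GET / HT", b"TP/1.1\r\nHost: loc", b"alhost\r\n", b"\r\n"):
    left.sendall(piece)

request_head = recv_until(right)
left.close()
right.close()
print(request_head)
```

If the final b"\r\n" piece were never sent and the sender kept the connection open, recv_until() would block indefinitely, which is the same situation the server is in when the request is missing its terminating blank line.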
Also note that you will be sending the data as bytes (in Python 3), so there will not be any automatic '\n' conversions, and the server will be expecting the sequence '\r\n'.
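Applied to the handshake in the question, a sketch could look like the following. The function names build_handshake and expected_accept are invented for this example, the path '/ws' is assumed because that is the route the Tornado server in the question registers, and the optional Origin and Sec-WebSocket-Protocol headers are left out for brevity. The GUID is the fixed value that RFC 6455 defines for deriving Sec-WebSocket-Accept from the key.

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the key before hashing.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def build_handshake(host, port, path, key):
    """Build the opening handshake with explicit CRLF line endings."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}:{}".format(host, port),
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Key: {}".format(key),
        "Sec-WebSocket-Version: 13",
    ]
    # Every line ends with \r\n, and a blank line terminates the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def expected_accept(key):
    """Value the server should return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# The sample key from the RFC, as used in the question.
key = "dGhlIHNhbXBsZSBub25jZQ=="
request = build_handshake("localhost", 8888, "/ws", key)
print(request.decode("ascii"))
print(expected_accept(key))
```

Sending request with sock.sendall(request) and then reading up to the blank line should produce a "101 Switching Protocols" response if the server accepts the upgrade; comparing its Sec-WebSocket-Accept header with expected_accept(key) is a quick sanity check on the toy client.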