Let's say I want to read a line from a socket, using the standard socket
module:
def read_line(s):
    ret = b''
    while True:
        c = s.recv(1)  # returns bytes; b'' means the peer closed the connection
        if c == b'\n' or c == b'':
            break
        else:
            ret += c
    return ret
What exactly happens in s.recv(1)? Will it issue a system call each time? I guess I should add some buffering anyway:
For best match with hardware and network realities, the value of bufsize should be a relatively small power of 2, for example, 4096.
http://docs.python.org/library/socket.html#socket.socket.recv
But it doesn't seem easy to write efficient and thread-safe buffering. What if I use file.readline()?
# does this work well, is it efficiently buffered?
s.makefile().readline()
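For anyone who wants to try the makefile() route, here is a minimal sketch, assuming Python 3, where makefile("rb") returns a buffered binary reader, so reading a line should not cost one system call per byte. The helper names read_lines_via_makefile and demo are made up for illustration, and socket.socketpair() stands in for a real connection. One caveat I'm fairly sure of: the reader may pull more than one line into its buffer, so mixing it with direct s.recv() calls on the same socket can lose data.

```python
import socket


def read_lines_via_makefile(sock):
    """Yield lines (bytes, newline included) through the socket's file wrapper."""
    # "rb" asks for a binary, buffered reader; closing it does not close sock.
    with sock.makefile("rb") as reader:
        for line in reader:
            yield line


def demo():
    # socketpair() gives two connected sockets, handy for a local experiment.
    a, b = socket.socketpair()
    try:
        a.sendall(b"first\nsecond\nlast")
        a.shutdown(socket.SHUT_WR)  # signal end-of-stream to the reader
        return list(read_lines_via_makefile(b))
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The last line comes back without a trailing newline because the sender closed the stream before writing one.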
If you are concerned with performance and control the socket completely (for example, you are not passing it into a library), then try implementing your own buffering in Python. Python's string.find, string.split and the like can be amazingly fast.
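def linesplit(sock):
    # Generator: yields one line at a time (newline included) from the socket.
    buffer = sock.recv(4096)
    buffering = True
    while buffering:
        if b"\n" in buffer:
            # Peel one complete line off the front; keep the rest buffered.
            (line, buffer) = buffer.split(b"\n", 1)
            yield line + b"\n"
        else:
            more = sock.recv(4096)
            if not more:
                # Empty read: the peer closed the connection.
                buffering = False
            else:
                buffer += more
    if buffer:
        # Trailing data without a final newline.
        yield buffer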
If you expect the payload to consist of lines that are not too huge, that should run pretty fast and avoid jumping through too many layers of function calls unnecessarily. I'd be interested in knowing how this compares to file.readline() or using socket.recv(1).
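One way to run that comparison is a small timing harness like the sketch below. It assumes Python 3 and uses socket.socketpair() as a stand-in for a real connection, so the numbers will only be a rough guide. The names byte_at_a_time, via_makefile and time_reader are made up for this example, and linesplit here is a slightly restructured variant of the generator above, not the exact same code.

```python
import socket
import threading
import time


def linesplit(sock, bufsize=4096):
    """Manual buffering: read chunks, split complete lines off the front."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line + b"\n"
    if buffer:  # trailing data without a final newline
        yield buffer


def byte_at_a_time(sock):
    """The recv(1) approach: one recv() call per byte."""
    line = b""
    while True:
        c = sock.recv(1)
        if not c:
            break
        line += c
        if c == b"\n":
            yield line
            line = b""
    if line:
        yield line


def via_makefile(sock):
    """Let the buffered file wrapper do the line splitting."""
    with sock.makefile("rb") as reader:
        for line in reader:
            yield line


def time_reader(reader, payload):
    """Send payload through a socket pair; return (line count, seconds)."""
    a, b = socket.socketpair()

    def feed():
        a.sendall(payload)
        a.close()  # closing the sending side signals end-of-stream

    sender = threading.Thread(target=feed)
    sender.start()
    start = time.perf_counter()
    count = sum(1 for _ in reader(b))
    elapsed = time.perf_counter() - start
    sender.join()
    b.close()
    return count, elapsed


if __name__ == "__main__":
    payload = b"hello world\n" * 5000
    for reader in (linesplit, via_makefile, byte_at_a_time):
        count, elapsed = time_reader(reader, payload)
        print(f"{reader.__name__}: {count} lines in {elapsed:.3f}s")
```

The sender runs in its own thread so that a payload larger than the kernel's socket buffer cannot deadlock the test. Results over a real network connection may differ, since latency and packet sizes change how much each recv() returns.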