Tuesday, April 12, 2016

Mysterious exceptions when making many concurrent requests from urllib.request to HTTPServer


I am trying to do this Matasano crypto challenge that involves doing a timing attack against a server with an artificially slowed-down string comparison function. It says to use "the web framework of your choosing", but I didn't feel like installing a web framework, so I decided to use the HTTPServer class built into the http.server module.

I came up with something that worked, but it was very slow, so I tried to speed it up using the (poorly documented) thread pool built into multiprocessing.dummy. It was much faster, but I noticed something strange: if I make 8 or fewer requests concurrently, it works fine. If I make more than that, it works for a while and then gives me errors at seemingly random times. The errors are inconsistent and not always the same, but they usually mention a refused connection, an invalid argument, or a broken pipe: OSError: [Errno 22] Invalid argument, urllib.error.URLError: <urlopen error [Errno 22] Invalid argument>, BrokenPipeError: [Errno 32] Broken pipe, or urllib.error.URLError: <urlopen error [Errno 61] Connection refused>.

Is there some limit to the number of connections the server can handle? I don't think the number of threads per se is the problem: I wrote a simple function that did the slowed-down string comparison without running the web server, called it from 500 simultaneous threads, and it worked fine (see the sketch below). I also don't think that simply making requests from that many threads is the problem, because I have written crawlers that used over 100 threads (all making simultaneous requests to the same website) and they worked fine. It looks like maybe HTTPServer is not meant to reliably host production websites that get large amounts of traffic, but I am surprised that it is this easy to make it crash.
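Here's roughly what that standalone test looked like (a sketch from memory; the delay value and function name are illustrative, modeled on the challenge's insecure_compare):

from multiprocessing.dummy import Pool as ThreadPool
from time import sleep

def insecure_compare(a, b, delay=0.005):
    # Compare byte by byte, sleeping after each matching byte,
    # like the challenge's artificially slowed-down comparison.
    for x, y in zip(a, b):
        if x != y:
            return False
        sleep(delay)
    return len(a) == len(b)

# 500 simultaneous threads, no web server involved: this works fine.
with ThreadPool(500) as pool:
    results = pool.map(lambda i: insecure_compare(b"secret", b"secret"), range(500))
print(all(results))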

I tried gradually removing code that looked unrelated to the problem, as I usually do when diagnosing mysterious bugs like this, but that wasn't very helpful here. As I removed seemingly unrelated code, the number of connections the server could handle gradually increased, but there was no clear cause of the crashes.

Does anyone know how to increase the number of requests I can make at once, or at least why this is happening?

My code is complicated, but I came up with this simple program that demonstrates the problem:

#!/usr/bin/env python3

import os
import random

from http.server import BaseHTTPRequestHandler, HTTPServer
from multiprocessing.dummy import Pool as ThreadPool
from socketserver import ForkingMixIn, ThreadingMixIn
from threading import Thread
from time import sleep
from urllib.error import HTTPError
from urllib.request import urlopen


class FancyHTTPServer(ThreadingMixIn, HTTPServer):
    pass


class MyRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        sleep(random.uniform(0, 2))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"foo")

    def log_request(self, code=None, size=None):
        pass


def request_is_ok(number):
    try:
        urlopen("http://localhost:31415/test" + str(number))
    except HTTPError:
        return False
    else:
        return True


server = FancyHTTPServer(("localhost", 31415), MyRequestHandler)
try:
    Thread(target=server.serve_forever).start()
    with ThreadPool(200) as pool:
        for i in range(10):
            numbers = [random.randint(0, 99999) for j in range(20000)]
            for j, result in enumerate(pool.imap(request_is_ok, numbers)):
                if j % 20 == 0:
                    print(i, j)
finally:
    server.shutdown()
    server.server_close()
    print("done testing server")

For some reason, the program above works fine unless it has over 100 threads or so, but my real code for the challenge can only handle 8 threads. If I run it with 9, I usually get connection errors, and with 10, I always get connection errors. I tried using concurrent.futures.ThreadPoolExecutor, concurrent.futures.ProcessPoolExecutor, and multiprocessing.Pool instead of multiprocessing.dummy.Pool, and none of those seemed to help. I tried using a plain HTTPServer object (without the ThreadingMixIn), and that just made things run very slowly without fixing the problem. I tried using ForkingMixIn, and that didn't fix it either.

What am I supposed to do about this? I am running Python 3.5.1 on a late-2013 MacBook Pro running OS X 10.11.3.

EDIT: I tried a few more things, including running the server in a process instead of a thread, as a simple HTTPServer, with the ForkingMixIn, and with the ThreadingMixIn. None of those helped.

EDIT: This problem is stranger than I thought. I tried making one script with the server and another with lots of threads making requests, and running them in different tabs in my terminal. The process with the server ran fine, but the one making requests crashed. The exceptions were a mix of ConnectionResetError: [Errno 54] Connection reset by peer, urllib.error.URLError: <urlopen error [Errno 54] Connection reset by peer>, OSError: [Errno 41] Protocol wrong type for socket, urllib.error.URLError: <urlopen error [Errno 41] Protocol wrong type for socket>, and urllib.error.URLError: <urlopen error [Errno 22] Invalid argument>.

I tried it with a dummy server like the one above, and if I limited the number of concurrent requests to 5 or fewer, it worked fine, but with 6 requests the client process crashed. There were some errors from the server, but it kept going. The client crashed regardless of whether I was using threads or processes to make the requests. I then tried putting the slowed-down function in the server, and it was able to handle 60 concurrent requests but crashed with 70. This seems to contradict the earlier evidence that the problem is with the server.

EDIT: I tried most of the things I described using requests instead of urllib.request and ran into similar problems.

EDIT: I am now running OS X 10.11.4 and running into the same problems.

3 Answers

Answer 1

You're using the default listen() backlog value, which is probably the cause of many of those errors. The backlog is not the number of simultaneous clients with a connection already established, but the number of clients waiting in the listen queue before the connection is established. Change your server class to:

class FancyHTTPServer(ThreadingMixIn, HTTPServer):
    def server_activate(self):
        self.socket.listen(128)

128 is a reasonable limit. You might want to check socket.SOMAXCONN or your OS's somaxconn value if you want to increase it further. If you still get random errors under heavy load, you should check your ulimit settings and increase them if needed.
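For example (a sketch; note that socket.SOMAXCONN is the constant Python was compiled with, and the kernel may silently cap whatever you pass to listen()):

import socket
from http.server import HTTPServer
from socketserver import ThreadingMixIn

class FancyHTTPServer(ThreadingMixIn, HTTPServer):
    def server_activate(self):
        # Ask for the largest backlog the platform advertises; the kernel
        # caps the effective value at its somaxconn setting anyway.
        self.socket.listen(socket.SOMAXCONN)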

I did that with your example and I got over 1000 threads running fine, so I think that should solve your problem.


Update

If it improved but it's still crashing with 200 simultaneous clients, then I'm pretty sure your main problem was the backlog size. Be aware that your problem is not the number of concurrent clients, but the number of concurrent connection requests. Here's a brief explanation of what that means, without going too deep into TCP internals:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(BACKLOG)
while running:
    conn, addr = s.accept()
    do_something(conn, addr)

In this example, the socket is accepting connections on the given port, and the s.accept() call blocks until a client connects. You can have many clients trying to connect simultaneously, and depending on your application you might not be able to call s.accept() and dispatch the client connection as fast as the clients are trying to connect. Pending clients are queued, and the maximum size of that queue is determined by the BACKLOG value. If the queue is full, clients will fail with a Connection refused error.
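You can see this for yourself with a few lines of code (a sketch; exact behaviour varies by OS, so once the queue fills, later connects may time out rather than be refused):

import socket

# Listen with a tiny backlog and never call accept(), then try to
# connect repeatedly. Once the queue is full, connects start failing.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("localhost", 0))
srv.listen(1)
port = srv.getsockname()[1]

clients = []
for i in range(16):
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    c.settimeout(0.5)
    try:
        c.connect(("localhost", port))
        print(i, "connected")
    except OSError as exc:
        print(i, "failed:", exc)
    clients.append(c)  # keep sockets alive so the queue stays full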

Threading doesn't help here, because what the ThreadingMixIn class does is execute the do_something(conn, addr) call in a separate thread, so the server can return to the main loop and the s.accept() call.
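Simplified, ThreadingMixIn does roughly this (paraphrased from the socketserver source):

import threading

class ThreadingMixIn:
    daemon_threads = False

    def process_request(self, request, client_address):
        # Hand the connection off to a new thread and immediately
        # return to serve_forever(), which loops back to accept().
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        t.start()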

You can try increasing the backlog further, but there will be a point where that won't help, because if the queue grows too large some clients will time out before the server performs the s.accept() call.

So, as I said above, your problem is the number of simultaneous connection attempts, not the number of simultaneous clients. Maybe 128 is enough for your real application, but you're getting errors in your test because you're trying to connect with all 200 threads at once and flooding the queue.
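On the client side, you can also paper over transient failures by retrying with a backoff instead of letting one refused connection kill a worker. A minimal sketch (the helper name and retry parameters are mine):

from time import sleep
from urllib.error import URLError
from urllib.request import urlopen

def urlopen_with_retry(url, attempts=5, delay=0.1):
    # Retry transient connection errors (refused/reset) with
    # exponential backoff instead of failing the whole worker.
    for attempt in range(attempts):
        try:
            return urlopen(url)
        except (URLError, ConnectionResetError):
            if attempt == attempts - 1:
                raise
            sleep(delay * 2 ** attempt)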

Don't worry about ulimit unless you get a Too many open files error, but if you want to increase the backlog beyond 128, do some research on socket.SOMAXCONN. This is a good start: https://utcc.utoronto.ca/~cks/space/blog/python/AvoidSOMAXCONN
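To see what your system actually allows, you can compare the two values like this (a sketch; kern.ipc.somaxconn is the macOS/BSD sysctl name, Linux uses net.core.somaxconn):

import socket
import subprocess

# The constant Python was compiled with -- not necessarily the live limit.
print("socket.SOMAXCONN:", socket.SOMAXCONN)

# The kernel's current limit on OS X / BSD.
print(subprocess.check_output(["sysctl", "kern.ipc.somaxconn"]).decode().strip())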

Answer 2

I'd say that your issue is related to some IO blocking, since I've run the equivalent code on Node.js without it breaking a sweat. I also noticed that both the server and the client struggle even when run individually.

But it is possible to increase the number of requests with a few modifications:

  • Define the number of concurrent connections for the server and enable address reuse:

    http.server.HTTPServer.request_queue_size = 500
    http.server.HTTPServer.allow_reuse_address = True

  • Run the server in a different process:

    server = multiprocessing.Process(target=RunHTTPServer)
    server.start()

  • Use a connection pool on the client side to execute the requests

  • Use a thread pool on the server side to handle the requests

  • Allow the reuse of the connection on the client side by setting the protocol version to HTTP/1.1 and using the "keep-alive" header

With all these modifications, I managed to run the code with 500 threads without any issue. So if you want to give it a try, here is the complete code:

import random
from time import sleep, clock
from http.server import BaseHTTPRequestHandler, HTTPServer
from multiprocessing import Process
from multiprocessing.pool import ThreadPool
from socketserver import ThreadingMixIn
from concurrent.futures import ThreadPoolExecutor
from urllib3 import HTTPConnectionPool


class HTTPServerThreaded(HTTPServer):
    request_queue_size = 500
    allow_reuse_address = True

    def serve_forever(self):
        executor = ThreadPoolExecutor(max_workers=self.request_queue_size)

        while True:
            try:
                request, client_address = self.get_request()
                executor.submit(ThreadingMixIn.process_request_thread,
                                self, request, client_address)
            except OSError:
                break

        self.server_close()


class MyRequestHandler(BaseHTTPRequestHandler):
    default_request_version = 'HTTP/1.1'

    def do_GET(self):
        sleep(random.uniform(0, 1) / 100.0)

        data = b"abcdef"
        self.send_response(200)
        self.send_header("Content-type", 'text/html')
        self.send_header("Content-length", len(data))
        self.end_headers()
        self.wfile.write(data)

    def log_request(self, code=None, size=None):
        pass


def RunHTTPServer():
    server = HTTPServerThreaded(('127.0.0.1', 5674), MyRequestHandler)
    server.serve_forever()


client_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)',
    'Content-Type': 'text/plain',
    'Connection': 'keep-alive',
}

client_pool = None


def request_is_ok(number):
    response = client_pool.request('GET', "/test" + str(number),
                                   headers=client_headers)
    return response.status == 200 and response.data == b"abcdef"


if __name__ == '__main__':

    # start the server in another process
    server = Process(target=RunHTTPServer)
    server.start()

    # start a connection pool for the clients
    client_pool = HTTPConnectionPool('127.0.0.1', 5674)

    # execute the requests
    with ThreadPool(500) as thread_pool:
        start = clock()

        for i in range(5):
            numbers = [random.randint(0, 99999) for j in range(20000)]
            for j, result in enumerate(thread_pool.imap(request_is_ok, numbers)):
                if j % 1000 == 0:
                    print(i, j, result)

        end = clock()
        print("execution time: %s" % (end - start,))

Update 1:

Increasing the request_queue_size just gives you more space to store requests that can't be handled immediately, so they can be served later. The longer the queue, the greater the spread in response times, which I believe is the opposite of your goal here. As for ThreadingMixIn, it's not ideal, since it creates and destroys a thread for every request, which is expensive. A better way to shorten the waiting queue is to use a pool of reusable threads to handle the requests.
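A smaller variation on the same idea, if you'd rather not replace serve_forever() wholesale as the code above does, is to override process_request() instead (a sketch; the class name and pool size are arbitrary):

from concurrent.futures import ThreadPoolExecutor
from http.server import HTTPServer
from socketserver import ThreadingMixIn

class PooledHTTPServer(HTTPServer):
    # Reuse a fixed pool of worker threads instead of creating and
    # destroying one thread per request as ThreadingMixIn does.
    executor = ThreadPoolExecutor(max_workers=100)

    def process_request(self, request, client_address):
        self.executor.submit(ThreadingMixIn.process_request_thread,
                             self, request, client_address)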

The reason for running the server in another process is to take advantage of another CPU to reduce the execution time.

For the client side, using an HTTPConnectionPool was the only way I found to keep a constant flow of requests, since I observed some weird behaviour with urlopen while analysing the connections.

And yes, I hadn't noticed that allow_reuse_address was already enabled in the HTTPServer class. The goal is to avoid the cost of opening and closing a connection for every request.

Answer 3

The norm is to use only as many threads as you have cores, hence the 8-thread threshold you're hitting (virtual cores included). The threading model is the easiest to get working, but it's really a rubbish way of doing it. A better way to handle multiple connections is to use an asynchronous approach. It's more difficult, though.
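For example, with asyncio (a sketch using Python 3.5 syntax; a real server would parse the request properly, or you'd use a framework such as aiohttp):

import asyncio
import random

async def handle(reader, writer):
    # Consume the request line and headers up to the blank line.
    while (await reader.readline()).strip():
        pass
    # The artificial delay no longer blocks a thread; it just suspends
    # this coroutine while other connections continue to be served.
    await asyncio.sleep(random.uniform(0, 2))
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nfoo")
    await writer.drain()
    writer.close()

loop = asyncio.get_event_loop()
server = loop.run_until_complete(asyncio.start_server(handle, "localhost", 31415))
loop.run_forever()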

With your threading method, you could start by investigating whether the process stays open after you exit the program. If it does, your threads aren't closing, and that will obviously cause issues.

Try this...

class FancyHTTPServer(ThreadingMixIn, HTTPServer):
    daemon_threads = True

That will ensure that your threads close properly. It may well happen automatically in the thread pool but it's probably worth trying anyway.
