Monday, July 18, 2016

Python Threading Issue on Windows

At first I thought a memory leak was causing the problem. I've now narrowed it down, but I'm getting an exception I'm not sure I fully understand.

I'm using a while True loop to keep a thread running and retrieving data. If it runs into a problem, it logs it and keeps running. It works fine at first, at least on the first pass, and then it constantly logs a threading exception.

I narrowed it down to this section:

    while True:
        yada yada yada...
        #Works fine to this part
        pool = ThreadPool(processes=1)
        async_result = pool.apply_async(SpawnPhantomJS, (dcap, service_args))
        Driver = async_result.get(10)
        Driver.set_window_size(1024, 768) # optional
        Driver.set_page_load_timeout(30)

I do this because of an issue with spawning a lot of Selenium webdrivers: eventually a spawn hangs (no exception, it just sits there). Wrapping the spawn in a thread pool gave it a timeout, so if the driver couldn't spawn within 10 seconds, the exception handler would catch it and the loop would try again. It seemed like a great fix, but I think it's causing problems in a loop.
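The timeout trick described above can be sketched in isolation. This is only an illustration, assuming Python 3.3 or later (for the with block): fake_spawn is a hypothetical stand-in for SpawnPhantomJS that sleeps instead of launching a browser, and spawn_or_none is a made-up helper name.

```python
import time
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool


def fake_spawn(delay):
    # Hypothetical stand-in for SpawnPhantomJS: waits, then returns a dummy value.
    time.sleep(delay)
    return "driver"


def spawn_or_none(delay, timeout):
    # Run fake_spawn in a one-thread pool and wait at most `timeout` seconds.
    # The with block shuts the pool down on exit (Python 3.3+).
    with ThreadPool(processes=1) as pool:
        async_result = pool.apply_async(fake_spawn, (delay,))
        try:
            return async_result.get(timeout)
        except TimeoutError:
            # The spawn did not finish in time; the caller can retry.
            return None


fast = spawn_or_none(0.01, 1)   # finishes well inside the timeout
slow = spawn_or_none(1, 0.05)   # times out before fake_spawn returns
```

Note that get(timeout) only stops the wait. The call that timed out keeps running in its worker thread until it returns on its own.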

It works fine to start with but then throws the same exception on every loop.

I don't understand thread pooling well enough; maybe I shouldn't be defining the pool on every iteration. The exception is hard to catch in the act, so testing is a bit of a pain, but I'm thinking something like this might fix it?

    pool = ThreadPool(processes=1)
    async_result = pool.apply_async(SpawnPhantomJS, (dcap, service_args))
    while True:
        Driver = async_result.get(10)

That looks neater to me but I don't understand the problem well enough to say for sure it would fix it.
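One thing worth checking about this variant: apply_async submits the task once, so calling get() again on the same result should hand back the same object rather than spawn a new driver. A small sketch, with fake_spawn as a hypothetical stand-in for SpawnPhantomJS:

```python
from multiprocessing.pool import ThreadPool

calls = []


def fake_spawn():
    # Hypothetical stand-in for SpawnPhantomJS; records each time it runs.
    calls.append(1)
    return object()


pool = ThreadPool(processes=1)
async_result = pool.apply_async(fake_spawn)  # the task is submitted once here

first = async_result.get(10)
second = async_result.get(10)  # no new task is submitted by get()

# Ordinary cleanup once all submitted work is done.
pool.close()
pool.join()

same_driver = first is second
spawn_count = len(calls)
```

So if each pass through the loop needs a fresh driver, apply_async would have to be called on each pass as well.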

I'd really appreciate any suggestions.

Update:

I've tracked the problem to this section of code with 100% certainty: I set a variable bugcounter = 1 before it and bugcounter = 2 after it, and logged its value when the exception occurred.

But when trying to reproduce it with just this code in a loop it runs fine and keeps spawning web drivers. So I've no idea.

Further update:

I can run this locally for hours. Sometimes it'll run on the (Windows) server for hours. But after a while it fails somewhere here and I can't figure out why.

An exception could be thrown because the timeout hits when the browser doesn't spawn in time. This happens rarely, but that's why we loop back to it.

My assumption is that I'm creating too many threads and the OS isn't having it. I've just spotted that thread pools have a .terminate method; maybe I should terminate the pool after using it to spawn a browser?

1 Answer

The idea I came to at the end of the question solved it.

I was using a thread pool to give the browser spawn a timeout, as a workaround for the bug in the library. But I wasn't terminating that thread pool, so after enough loops the OS wouldn't let it create another pool.

Adding a .terminate call once the browser had been spawned and the pool was no longer needed solved the problem.
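A minimal sketch of that fix, not the exact code from my script: fake_spawn is a hypothetical stand-in for SpawnPhantomJS, and spawn_with_timeout is a made-up helper name. The terminate call sits in a finally block so the pool is released whether the spawn succeeds or times out.

```python
from multiprocessing.pool import ThreadPool


def fake_spawn():
    # Hypothetical stand-in for SpawnPhantomJS.
    return "driver"


def spawn_with_timeout(timeout):
    # One short-lived pool per spawn, as in the original loop.
    pool = ThreadPool(processes=1)
    try:
        async_result = pool.apply_async(fake_spawn)
        # Raises multiprocessing.TimeoutError if the spawn takes too long;
        # the caller's loop can log that and try again.
        return async_result.get(timeout)
    finally:
        # Release the pool's threads once it is no longer needed.
        pool.terminate()


# Repeated spawns no longer leave a pool behind on each pass.
drivers = [spawn_with_timeout(10) for _ in range(20)]
```

Creating a single pool before the loop and calling apply_async on it each time would be another option, though a spawn that hangs would then keep that pool's only worker busy.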
