Pre-emptive optimisation

For one of our long-standing clients we have been running vehicle routing optimisations on a daily basis. A file of daily orders is uploaded into our Workbench system, and is split up into several regions, each of which needs to be separately optimised. A planner works through each region, going through a series of data checks (e.g. location geocode checking), before hitting an “Optimise” button.

All of the heavy lifting is done on a server (the planner accesses it through a web app via a browser), so it’s possible that the server could silently start up the required optimisations without the planner’s involvement, and in the (fairly common) case where the region does not require any data corrections, when the planner is up to the optimisation stage, the result could be immediately available (as it has already been run, or is in the process of being run). This idea now been implemented and took only a short amount of Python code.

Furthermore, it runs in parallel as each optimisation is itself split into a separate child process (running a C++ exe) which Linux distributes across the 8 cores of our server machine.
The pre-emptive optimisations are kicked off using Python’s multiprocessing package, as follows:

from multiprocessing import Process

p = Process(target=start_preemptive_optimisations, args=(…))

p.start()

Control is returned to the user at this point while the optimisations run in the background. Results are stored in a folder whose name is stored in a database table; when the planner then comes to press Optimise, the system checks if there have been any data corrections – if so, it runs the optimisation from scratch as usual (the pre-emptive result for that region is thus never referenced); however, if there are no corrections, the system simply uses the stored result from the folder.

The end result for the user is that in many cases the optimisations appear to run almost instantaneously. There are really no downsides as we do not pay for our servers on a CPU cycle basis, so we can easily be wasteful of our server CPU time and run these optimisations even if their results are sometimes not needed.

One “wrinkle” we discovered with this is that we had to make our process checking more robust. There is Javascript in our browser front end that polls for certain events, such as an optimisation finishing, which is indicated by a process ceasing to exist. The Python for this is shown below, where “pid” is a process ID. The function returns True if the given process has finished or not.
def check_pid_and_return_whether_process_has_finished(pid):
if pid and pid > 0:
multiprocessing.active_children()   # reap all zombie children first; this also seems to pick up non-children processes
try:
os.waitpid(pid, os.WNOHANG)       # this reaps zombies that are child processes, as it gives these processes a chance to output their final return value to the OS.
except OSError as e:
if int(e.errno) <> 10:  # the 10 indicates pid is not a child process; in this case we want to do nothing and let os.kill be the function to throw an exception and return True (indicating process is finished).
return True
try:
os.kill(pid, 0)   # doesn’t actually kill the process, but raises an OSError if the process pid does not exist; this indicates the process is finished.  Applies to all processes, not just children.
except OSError as e:
return True
else:
return True
return False

Note the “reaping” of zombie processes here, and the complication that arises if a process is not a direct child of the calling Python process (it might be a “child of a child”). In this case we use a call to the (in this case rather mis-named) function os.kill.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published.