The Future of Research in Operations Research

New technologies, particularly Internet-enabled technologies like cloud computing and social collaboration tools, are changing how mathematics is done.  This post showcases some of these emerging trends and lays out a vision for how the research and practice of Operations Research in particular could be much more effective.

Consider one of the most famous mathematical theorems of all: Fermat’s Last Theorem.  This theorem can be easily stated and readily understood, but has been truly devilish to prove, so much so that it lay unsolved for around 360 years.  Finally, after 7 years of effort, Andrew Wiles announced a proof in 1993 (the corrected proof, published in 1995, runs to over 100 pages).  This is a very “traditional” way of doing mathematics (albeit pure mathematics), where a “lone wolf” makes a large single contribution; but it also hints at the limits of what one brain can achieve.

By contrast, there is an equally old conjecture, from 1611, that has also only recently been proven.  This is Kepler’s Conjecture, which states that the most efficient way to stack spheres has 74.048% density.  An example of this “packing” is hexagonal close packing, which is the way you typically see oranges stacked in a shop.  In 1998 a proof emerged for Kepler’s Conjecture which required the solution of about 100,000 Linear Programming problems by computer.  This shows the emergence of a different model for “doing maths”, that is applicable even for purposes of obtaining mathematical proofs.
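For reference, the 74.048% figure is the density achieved by hexagonal close packing (and by face-centred cubic packing):

\[ \frac{\pi}{3\sqrt{2}} \approx 0.74048 \]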

It is no surprise that computers have become indispensable for mathematics, particularly in applied mathematics in general, and in Operations Research in particular.  What is more interesting is the emerging role of other new technologies on how mathematics might be done.

These trends include:

•    More data (e.g. availability of spatial data via Google Earth; the prevalence of data mining and “big data” analytics), faster machines (the well-documented exponential growth of computing power), and more machines (we are rapidly approaching the age of ubiquitous computing).
•    Internet-enabled technologies such as mashups, Software as a Service, open APIs, cloud computing, web 2.0, and social network integration.  Browsers are prevalent, and using web and mobile apps instead of desktop apps – even for complex tasks like image editing – is now commonplace.  These newer trends all sit on top of the benefits and behaviours we have already become accustomed to from the Internet, such as: increasing market efficiencies (e.g. sectors that approach perfect information, such as hotel and flight bookings, and businesses connecting more directly to customers and suppliers); the commoditising of products (e.g. the so-called “race to the bottom” on pricing); the “long tail” (disproportionately favouring small businesses through the power of search, e.g. niche goods and services); hypercompetition (where it becomes more difficult to monopolise by keeping customers ignorant of alternatives); and the abolition of geography (the so-called “global village”).
•    Competition, crowd-sourcing and collaboration – e.g. log-in style tools which let people anywhere collaborate; there are many examples of competitive crowd-sourcing mentioned below.
•    Not to mention the remarkable explosion of knowledge enabled through Wikipedia, Google, etc.

There have been some astonishing examples of how these trends have been put to use, both in academic and commercial contexts:

•    SETI@Home – a large-scale use of distributed computing over the Internet for research purposes, analysing radio signals (using digital signal processing) to look for signs of extra-terrestrial intelligence.
•    The Folding@Home project – using a statistical simulation method for protein folding.  This used consumer computation (such as Playstation 3 cores and GPUs) to achieve millions of CPU days in short periods of time; it was much faster than the platform powering the SETI projects, and one of the largest scale distributed computing systems ever created.  In Jan 2010 it simulated protein folding in the 1.5ms range – thousands of times longer than previously achieved.
•    Netflix prize – a global competition for collaborative filtering of movie recommendations.  The competition was started in October 2006, and by June 2007 there were 20,000 teams competing.  The Netflix grand prize of $1m was eventually awarded in September 2009. The training data had 100 million ratings over 17,700 movies, and the winner had to beat Netflix’s Cinematch algorithm (RMSE .9514) by at least 10%.  Netflix’s own algorithm was beaten within 6 days of competition start.
•     Other examples of competitive or prize-driven activity: the Millennium Prize problems, 99Designs.com, Kaggle.com – like the Netflix prize, many of these would not be possible at large scale without internet enablement.
•    Human Genome Project – a good example of exponential progress, where the pace far outstripped predictions.
•     Citizen Science, arXiv and various open science initiatives show a trend towards more open and collaborative approaches to science.
•     Polymath project – in a 2009 post on his blog, Timothy Gowers asked “is massively collaborative mathematics possible?”. This post led to the collaborative solution of a hard combinatorics problem through blog comments, in around 7 weeks. The success of the so-called “Polymath1” problem has spawned additional Polymath collaborations.
•    In 1999, Garry Kasparov played and eventually won a game of chess against a “World Team” which decided its moves by the votes of thousands of chessplayers, including many rank amateurs.  Kasparov called this challenging game “the greatest game in the history of chess”, and co-authored a book as a result, constituting the longest analysis of a single game of chess.

These examples of massive collaboration and computation are not isolated curiosities, but the leading edge of a new approach to solving problems, that shows how knowledge and creativity can be harnessed by new technologies.  They show that there is an emerging new way of working which can be surprisingly effective.

Consider now the typical Operations Research researcher/academic (and, to a lesser extent, many OR practitioners):

•    They care about the HOW: algorithms, parameters, results
•    They want access to powerful computation and solvers
•    They want to collaborate with colleagues
•    They want to replicate and build upon the work of others
•    They’d like their contributions to be recognised – in both senses of the word: both visible, and valued.

The usual current way of working, through journal paper submissions, does not necessarily cater well for these needs.  Why not?  Consider a typical OR paper, which might contain an Abstract, a Literature Review, a Statement of a Problem, an Approach, some Results and experiments, and a Conclusion and References.  Such papers have some good properties: they are generally self-contained (you can find out a lot about the given problem once you’ve found the paper); they’re peer-reviewed; and – tellingly, perhaps – they’re easily replicated or extended by the same author(s) (for their next paper!).  However, they are very often:

x    Difficult to search for and find – papers are often held behind pay-walls or restricted access (e.g. in University collections).
x    Hard to reproduce the results – in Operations Research, implementation details (parameters, source code, data sets) are important, but these aspects are typically not peer-reviewed – so how can the computations be verified, or replicated by others?  They fail Karl Popper’s basic test of falsifiability in science.  Even when source code is made available, it has likely been run in a completely different environment, making proper comparison of results or run-times difficult.
x    Long gestation times – the publication cycle from submission to publication is often extensive, and includes opaque periods when papers are reviewed.  This tends to encourage a culture of “the scoop” – a pressure to publish a new result before someone else.  Moreover, arguably the entire publication environment is self-serving and institutionally locked in (by universities and journal publishers).  Neither is it very democratic; when students and researchers leave academia and become OR practitioners, it can be difficult to keep close ties.
x      “Soft” links – the references in papers are usually simply textual, not hypertext links, even when published electronically.

The OR community prides itself on the Science of Better, of finding optimal or near-optimal ways to improve the world, but its own internal mechanism for improvement is itself inefficient (sometimes highly so).  In particular the process leaves little room for micro-contributions; the slow journal-focused approach inhibits rapid incremental development (which is somewhat ironic, considering that many papers offer slight improvements upon previously published results).  Nor does it make it easy to discover the state of the art for a particular problem – e.g. what are the best algorithms, the best test instances, and the best known results on those instances.  This must often be laboriously uncovered: even when the papers embodying the best known results are found, the associated data sets are non-standardised and often held on someone’s personal home page; there is no standard code (and very often, no code at all) for reading in the data sets, and sometimes interpretation of the data is needed because documentation is lacking.  Certainly there are isolated but laudable aspirations towards “Reproducible Research” in science, but these have not been generally adopted by the OR community.

The question therefore becomes: How can we harness new technologies for Internet-based collaboration and crowd-sourcing, to do Operations Research more effectively?  What if we could complement (or even avoid!) slow journal-focused progress?  How can we improve upon silo’ed academic results and source code?

We can remove IT burdens by using the power of cloud computing – delivering powerful computational and collaborative capability through the browser, being flexible about how the underlying computational power is employed, and storing the data in the cloud.  Software as a service, or open-source software, frees us from capital investment (which also allows us to switch more easily and not get too “invested” in one way of working); instead of the pursuit of mathematics research getting buried under software installation, environment issues and management (the tail wagging the dog), we can again make computation serve us.  Using servers in the cloud does not usually mean time-sharing (as in the bad old days) – in fact computation power is generally cheap enough that we can now even be wasteful of server resources.  We can “spin up” compute instances and pay per hour if we like.  A good example is Gurobi, the leading commercial LP/MIP solver, which now offers prebuilt Amazon EC2 instances that can essentially be “rented” by time (in small or large amounts), so that no installation is necessary.
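As a rough illustration of what “spinning up” a compute instance looks like in practice, here is a minimal sketch using the AWS boto3 library.  The AMI id and instance type are placeholders (not a real Gurobi image), and this is just one of several ways to do it:

import boto3

# Launch an on-demand compute instance, run a job, then terminate it.
# The AMI id below is a placeholder; in practice it would be an image with a solver preinstalled.
ec2 = boto3.resource("ec2", region_name="us-east-1")
instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder image id
    InstanceType="c5.xlarge",          # billed only while the instance is running
    MinCount=1,
    MaxCount=1,
)
instance = instances[0]
instance.wait_until_running()
instance.reload()                      # refresh attributes such as the public IP address
print(instance.id, instance.public_ip_address)

# ... upload the model and data, run the optimisation, collect the results ...

instance.terminate()                   # stop paying as soon as the job is done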

We can share models and data by using versioned wiki-style tools that store models and their results in the cloud.  I lay out a detailed vision for how this might work later.

We can aim to build an online, global OR community that collaborates frictionlessly, using forums and social network integration on a common platform.  Log-ins would facilitate a user and user group system; websites (such as StackOverflow and its ilk) also show that user reputations can work.  Particularly, a common platform can also build bridges between academics and practitioners.

If these aims seem ambitious, it’s worth pointing out that many of the pieces of the puzzle we need are already (conceptually at least) in place:

•    www.or-exchange.com / math.stackexchange.com
•    OR Library / www.exampleproblems.com
•    www.wikipedia.org / www.proofwiki.org
•    Gurobi via AWS + Python
•    MathML / MathJax
•    Tons of modelling languages (Mosel, AMPL, etc)
•    Online IDEs – e.g. CodePad, Eclipse Orion, etc.
•    Collaborative referencing tools e.g. www.mendeley.com
•    Cloud storage and cloud platforms (examples are Google App Engine, Heroku, etc)

Some of these examples show skeletal websites whose communities have not really “taken off”.  Part of the reason is that they are not open enough – for example, OR Library, probably the biggest resource for OR data sets on the Internet, currently resides on someone’s home page – it is not editable (you cannot add a data set for a new problem, for example), nor is it integrated (even with basic links) with the associated results or source codes; nor does it allow for comments or discussion.  Other tools lack adoption simply for not being powerful, fully-featured or wide-ranging enough to attract a broad user base.

So we have tools which form the pieces of the puzzle; they just need to be put together well to serve a more ambitious purpose.  What is this ultimate purpose?  I claim that our goal should be global crowd-sourced Operations Research in real time.

Here’s one way that it might work.  In the accompanying mock-ups I’ve cheekily called this rough sketch of such a platform “OR 4 ALL”.

The different types of pages are:

A Problem page describes a mathematical problem, for example, “The Capacitated Vehicle Routing Problem”.  These pages will look similar to the Wikipedia page, except that they will use a math-aware markup language to facilitate easy authorship (Wikipedia pages use images for equations, which provides a barrier to making micro-changes).  These pages will provide links to the associated data set pages, and the associated algorithm pages (see below).  These links will likely be ranked in some way (for example, the algorithm that has the most community votes might rank first, and there might also be an automatic ranking of the algorithms that have produced the best known solutions to the widest range of data sets).

Algorithm Approaches pages describe an approach to solving a problem.  Here there might also be a discussion of the parameters and their effects, and perhaps some proofs that are specific to this algorithm.

Data Set pages list the data set instances for a given problem, and include links to associated results.

Implementation pages hold modelling language source code.  Python would probably work best for this.  The page would contain an online IDE (essentially, allowing the users to code right inside the browser).  Crucially, however, there is a Solve button, where you can apply the implementation to a data set, and the system will run it for you and compose a Results page with the solution.

Results pages, like the other pages, include history and versioned snapshots.  However, a Results page also links right back to the implementation as it was run, in a timestamped fashion.  That way, if the implementation changes, the result can still be replicated.

Hence, you do not need to leave the OR4ALL web application in order to go right through from composing an algorithm, implementing it and running it, and comparing its results to the state of the art.
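To make the relationships between these page types concrete, here is a rough data-model sketch in Python; all class and field names are invented for illustration and do not describe an existing system:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Problem:
    title: str                       # e.g. "The Capacitated Vehicle Routing Problem"
    description_markup: str          # math-aware markup, not images of equations
    algorithms: List["Algorithm"] = field(default_factory=list)
    data_sets: List["DataSet"] = field(default_factory=list)

@dataclass
class Algorithm:
    problem: Problem
    approach: str                    # discussion of the approach, parameters, proofs
    implementations: List["Implementation"] = field(default_factory=list)

@dataclass
class DataSet:
    problem: Problem
    instance_uri: str                # where the instance data lives in the cloud
    results: List["Result"] = field(default_factory=list)

@dataclass
class Implementation:
    algorithm: Algorithm
    source_code: str                 # e.g. Python, edited in an in-browser IDE

@dataclass
class Result:
    data_set: DataSet
    implementation_snapshot: str     # versioned, timestamped copy of the code as it was run
    objective_value: float
    run_time_seconds: float

Each list field captures one of the “one to many” links discussed below: one problem has many algorithms and data sets, and each data set accumulates many results.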

To various extents the different types of pages will have the following features:
•    WYSIWYG editing
•    versioning
•    comments
•    API access
•    voting
•    social media integration
•    links (e.g. related data sets & problems, and external links)
•    opt-in alerts for updates

The arrows shown in the diagram are typically “one to many” links.  For example, there will be many algorithmic approaches to one particular problem.  Similarly, for any one problem clearly there are also many data sets.  Each data set/algorithm combination can be run many times with small parameter or implementation changes, giving many result sets.

All changes and contributions are versioned, timestamped and archived.  Contributions are attributed to the appropriate user.  The entire system is searchable, with everything linked.  It allows for rich content, e.g. videos and animations where there is a pedagogical angle, and graphs and other analytical components where there is a data visualisation need.

The data would also be accessible – potentially in unforeseen ways – via open APIs.  That is, other completely separate systems would be able to repurpose the data programmatically; as an example, a completely different web application could be devoted to analysing particular test instances and their results, and it would enjoy direct (read-only) access to the required data via the OR4ALL system’s API.
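For example, a third-party tool might pull down the current best results for a test instance with a single HTTP call.  The endpoint, instance name and JSON fields below are purely hypothetical:

import requests

# Hypothetical read-only API call: fetch the ten best results for one test instance.
resp = requests.get(
    "https://or4all.example.org/api/v1/problems/cvrp/datasets/A-n32-k5/results",
    params={"order_by": "objective_value", "limit": 10},
    timeout=30,
)
resp.raise_for_status()
for result in resp.json()["results"]:
    print(result["implementation"], result["objective_value"], result["timestamp"])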

Cloud computation is achieved via open APIs.  The algorithms are proven and reproducible for given data sets, so practitioners or others looking to build further on an approach do not have to start from scratch; similarly, practitioners have a starting point if they want to expand or implement an algorithm in their own software.  Furthermore, other, non-default computation engines can potentially be “plugged in” behind the system (perhaps specific ones aimed at particular requirements, e.g. massively parallel dynamic programming).

Clearly a platform like this would be – at a minimum – an amazing boon for students.  But all users would benefit from the open, “many eyes” style of model and implementation development – the phenomenon that “somewhere out there, someone can easily help you”.  In particular, researchers and practitioners can find areas of common interest.

The system as described can start simple but grow in sophistication.  Reputation points might be gained for best solutions, with user points gained from previous best solutions decaying away over time.  Companies can offer bounties for the best solution to their difficult data set.  There would be plenty of opportunities for 3rd parties to provide a marketplace for alternative computation or analysis plug-ins via the APIs. Or, the system could potentially be monetised via advertising – for example, commercial OR companies could proffer their own software packages and services, with these ads appearing only on pages associated with that company’s specific areas of expertise.

Examples of the everyday use of this kind of system in practice might be (from a typical user’s point of view):
•    I get a notification email one day that someone is trying a new approach to solving a problem on a data set that I recently optimised;
•    I get a small boost in reputation because users upvoted an approach I suggested to someone else’s algorithm;
•    I use a proprietary piece of software (completely outside of the OR4ALL system) to find a new, closer-to-optimal solution to a hard problem, and upload it.  It ends up in the linked Results page for that problem, with an icon marking it as “unverified” (as it was run outside the OR4ALL computation framework);
•    I upload a teaching video to an Algorithm Approach page to help the students that I am teaching.

Clearly there are some potential challenges to be overcome: the signal to noise ratio for one (keeping unhelpful contributions from polluting or distorting progressive development).  A system like this would also likely enforce standardisation of formulations, and to a lesser extent implementations – although this arguably has more benefit than downside.  Many in academia might also find it hard to move away from “research grant and published paper”-driven development, or otherwise be uncomfortable with voting or reputation-based status systems; clearly we would need appropriate and effective metrics to measure the contributions by each user or user group.

I firmly believe that this way of working will come about no matter what; I’d be very surprised if in 10 years’ time this more collaborative, technology-enabled approach had not overtaken journal-based research.  So wouldn’t it be better for the OR community to be self-directed and evolve in a very aware way towards this goal?  It might not even be particularly hard to build – as we have shown, most of the pieces are already proven possible in some form or other.  The possibilities and benefits of the system I have described are exciting and endless – if there were a truly open and powerful platform for OR with good community adoption, it would attain uses and purposes which we can now barely imagine.

Further reading
Reinventing Discovery – a book about internet-powered collaborative science
Wakari – a new, hosted Python data analysis environment that has some of the features described above

Pre-emptive optimisation

For one of our long-standing clients we have been running vehicle routing optimisations on a daily basis. A file of daily orders is uploaded into our Workbench system, and is split up into several regions, each of which needs to be separately optimised. A planner works through each region, going through a series of data checks (e.g. location geocode checking), before hitting an “Optimise” button.

All of the heavy lifting is done on a server (the planner accesses it through a web app via a browser), so it’s possible for the server to silently start the required optimisations without the planner’s involvement.  In the (fairly common) case where the region does not require any data corrections, by the time the planner reaches the optimisation stage the result can be immediately available (as it has already been run, or is in the process of being run). This idea has now been implemented and required only a small amount of Python code.

Furthermore, it runs in parallel as each optimisation is itself split into a separate child process (running a C++ exe) which Linux distributes across the 8 cores of our server machine.
The pre-emptive optimisations are kicked off using Python’s multiprocessing package, as follows:

from multiprocessing import Process

p = Process(target=start_preemptive_optimisations, args=(…))
p.start()

Control is returned to the user at this point while the optimisations run in the background. Results are stored in a folder whose name is stored in a database table; when the planner then comes to press Optimise, the system checks if there have been any data corrections – if so, it runs the optimisation from scratch as usual (the pre-emptive result for that region is thus never referenced); however, if there are no corrections, the system simply uses the stored result from the folder.
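The decision made when the planner presses Optimise boils down to a simple check.  The sketch below uses hypothetical helper functions as stand-ins for the actual Workbench code, which is not shown in this post:

def region_has_corrections(region):              # hypothetical stand-in for the data-correction check
    return False

def load_result_from_folder(folder):             # hypothetical: read the stored pre-emptive result
    return "result loaded from " + folder

def run_optimisation_from_scratch(region):       # hypothetical: kick off a fresh optimisation
    return "fresh optimisation for " + region

def optimise_region(region, preemptive_folder):
    """Reuse the pre-emptive result if the data has not changed; otherwise re-run from scratch."""
    if preemptive_folder and not region_has_corrections(region):
        return load_result_from_folder(preemptive_folder)
    return run_optimisation_from_scratch(region)

print(optimise_region("North", "/results/preemptive/north"))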

The end result for the user is that in many cases the optimisations appear to run almost instantaneously. There are really no downsides as we do not pay for our servers on a CPU cycle basis, so we can easily be wasteful of our server CPU time and run these optimisations even if their results are sometimes not needed.

One “wrinkle” we discovered with this is that we had to make our process checking more robust. There is Javascript in our browser front end that polls for certain events, such as an optimisation finishing, which is indicated by a process ceasing to exist. The Python for this is shown below, where “pid” is a process ID. The function returns True if the given process has finished, and False otherwise.
import errno
import multiprocessing
import os

def check_pid_and_return_whether_process_has_finished(pid):
    if pid and pid > 0:
        multiprocessing.active_children()     # reap all zombie children first; this also seems to pick up non-children processes
        try:
            os.waitpid(pid, os.WNOHANG)       # this reaps zombies that are child processes, as it gives these processes a chance to output their final return value to the OS
        except OSError as e:
            if e.errno != errno.ECHILD:       # ECHILD (10) indicates pid is not a child process; in that case do nothing and let os.kill below decide whether the process has finished
                return True
        try:
            os.kill(pid, 0)                   # doesn't actually kill the process, but raises an OSError if the process pid does not exist; this indicates the process is finished.  Applies to all processes, not just children
        except OSError:
            return True
    else:
        return True                           # no valid pid supplied: treat the process as finished
    return False

Note the “reaping” of zombie processes here, and the complication that arises if a process is not a direct child of the calling Python process (it might be a “child of a child”). In this case we use a call to the (in this case rather mis-named) function os.kill.

Successfully Implementing Commercial Mathematics

We have grown Biarri based on successfully bringing the power of mathematics to bear on business. We try hard to make operations research easier for business to digest.  Over the last few years we have learned a few things that are worth pointing out.

1. Target businesses that need analytical support and know they need it. Many businesses now have lots of data but lack the analytical ability to utilise that data. The businesses we target understand that smart decisions are based on ‘results gleaned from algorithmic insight and executed with the confidence that comes from really doing the math’. ‘Analytics’ is now part of the business improvement conversation; however, you need to target businesses that have complex problems that are solvable and need solving. Additionally, we have always had better success when the problem has a clear owner who will benefit directly from solving the problem.

2. Differentiate your offering. ‘Optimisation’ and now ‘Analytics’ are overused terms. As applied mathematicians we can certainly bring some real science to the table. But this is not enough. What has emerged as very important is our ability to make the mathematics digestible. We have tried hard to deliver ‘Accessible Optimisation’. We believe optimisation is only powerful if it is:

  • Easy to apply to real world operations
  • Easy to understand (intuitive)
  • Easy to access (usable)
  • Affordable with minimum capital spend
  • Fast and reliable

3. Focus on implementation practicalities. You need to keep your eye on the basics of project management. Some of the key issues for us have been:

  • Work hard to keep the scope tight by focusing on the 20% of features that deliver 80% of the value.
  • Use simple frameworks that everyone can understand. All our custom models and tools use a simple linear workflow/logic around a smart engine.
  • Stay close to your customer and practice the art of no surprises.
  • Excel is useful for modelling. It is available on every desktop, flexible, familiar, easy to use and accepted by non-technical managers.
  • Develop and implement prototype models quickly and have a simple way of upgrading them when needed.  For example, we always keep our prototype engines separate from the front-end so we can easily port them to different solution frameworks.

The Launch of Biarri’s WorkBench

With the impending launch of Biarri’s Workbench and our ongoing close relationship with Schweppes for the daily routing of soft drink deliveries (an application of perhaps the best known operations research problem: the vehicle routing problem), I thought that the following excerpt from a journal article submitted to the Asia-Pacific Journal of Operational Research would make a very timely blog post.

The journal article is entitled “Real-Life Vehicle Routing with Time Windows for Visual Attractiveness and Operational Robustness” and it describes the vehicle routing algorithm we have implemented for Schweppes.

The excerpt details a specific example encompassing two things we are very passionate about at Biarri. First, “Commercial Mathematics” – that is, making OR (well, not strictly just OR) work in the real world. And second, the revolutionary capabilities that the advent of cloud computing brings to the delivery of software.

“Vehicle routing problems manifest in a remarkably wide range of commercial and non-commercial enterprises. From: industrial waste collection to grocery delivery; underground mining crew replenishment to postal and courier collection and delivery; inbound manufacturing component transportation to finished car distribution; in-home primary health care delivery to pathology specimen clearances from surgeries for analysis; and from coal seam gas field equipment maintenance to beverage distribution, to name but a few.

Automated planning systems used by industry at present are predominantly client-server or desktop based applications. Such systems are often: expensive, requiring a large upfront capital investment; accompanied by a large software deployment project requiring initial and ongoing IT department cooperation; customisable to a particular organisation’s requirements, however commonly retain a large amount of exposed functionality due to the breadth of the existing client base; and require substantial user training as the workflow is usually not restricted in a linear fashion …. Each of these characteristics constitutes a barrier to adoption of automated planning systems, and for most small to medium enterprises these barriers prove insurmountable.

With the advent of cloud computing and software as a service (SaaS) these barriers are being removed. SaaS: embodies a different commercial model; has essentially no IT footprint; mandates (as vendors may never directly interact with potential clients) simple intuitive linear workflows; and involves almost no end user training beyond perhaps an optional demonstration video.

The emergence of this new avenue for the delivery of optimisation based planning systems heralds a heretofore unparalleled opportunity for operations research practitioners to engage with a wider potential consumer base than ever before. However, the nature of the delivery mechanism requires the algorithms developed: to be robust and flexible (within their domain of application they must be capable of dealing with a wide range of input data); to have very short run times (the user base is more likely to be under time pressure than ever before); to produce high quality solutions (noting the inherent trade off between run time and solution quality); to be wrapped in a simple linear workflow (meaning it is always obvious what the next step in the planning process is); but above all, be able to produce real-life, practically implementable solutions, without the need for user training and/or experience.

For pure delivery, or pure pick up vehicle routing applications, real-life, practically implementable solutions are often synonymous with geographically compact, non-overlapping routes with little or no intra-route cross over. There are numerous reasons why such solutions are preferred …. If a customer cannot be serviced at the preferred time (e.g. the vehicle cannot get access, the customer is closed, another delivery is taking place, the customer is too busy), because the route stays in the same geographical area, it is easy to return to the customer at a later time. During busy traffic periods drivers are loathe to exit and re-enter a motorway to service individual customers. Even though such customers may be en route to the bulk of the customers the route services, thus incurring a minimum of additional kilometres, they may nevertheless be far from the majority of the customers the route services. If there is severe traffic disruption, it is easier to use local alternate routes between customers in a route that is geographically compact to ensure that pick-ups or deliveries can still be made. Third party transport providers, which prefer routes to be as simple as possible, may exert some influence over the planning process. Finally … it is easier to maintain customer relationships by assigning drivers to routes that routinely service a similar geographical area. In summary, solutions which are more visually attractive are more robust, and thus more likely to actually deliver the full extent of the cost savings that should flow from the use of automated planning systems.

This paper describes an algorithm for the vehicle routing problem with time windows, …. The algorithm is: robust and flexible; fast; wrapped in a user interface utilising a simple linear workflow and so requires no user training or experience; and produces high quality, visually attractive and practically implementable solutions.”

Thoughts on Point Solutions

Lately I have been thinking a bit about the advantages of small, tightly focussed web apps (so-called “point solutions”) that scratch a single little itch, versus larger, more powerful and general web apps that tend to deliver more of a total body rub. This question is of utmost importance to a company like Biarri that needs to place its development time and effort into the best channels.

The question was highlighted by a real-world problem a colleague posed recently: how to assign foursomes in rounds of golf so that every player gets to play with every other player at least once. It is not trivial to construct such a solution (if one even exists) by hand, if the constraints are “tight” enough (for example, 20 players and 8 rounds).
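As an illustration of how small the optimisation kernel of such a point solution can be, here is a sketch of this scheduling problem using Google OR-Tools’ CP-SAT solver.  This is one possible approach (not a tool mentioned in the post), and whether a schedule exists for 20 players and 8 rounds is left for the solver to decide:

from ortools.sat.python import cp_model

# 20 players, 8 rounds, 5 groups of 4 per round, with every pair of players
# sharing a group at least once (if such a schedule exists).
P, R, G, SIZE = 20, 8, 5, 4
model = cp_model.CpModel()

# x[p, r, g] == 1 iff player p plays in group g in round r
x = {(p, r, g): model.NewBoolVar(f"x_{p}_{r}_{g}")
     for p in range(P) for r in range(R) for g in range(G)}

for r in range(R):
    for p in range(P):
        model.Add(sum(x[p, r, g] for g in range(G)) == 1)     # each player in exactly one group per round
    for g in range(G):
        model.Add(sum(x[p, r, g] for p in range(P)) == SIZE)  # each group has exactly four players

# Every pair of players must share a group in at least one round.
for p in range(P):
    for q in range(p + 1, P):
        together = []
        for r in range(R):
            for g in range(G):
                t = model.NewBoolVar(f"t_{p}_{q}_{r}_{g}")
                model.AddImplication(t, x[p, r, g])   # t can only be 1 if both players
                model.AddImplication(t, x[q, r, g])   # really are in that group
                together.append(t)
        model.Add(sum(together) >= 1)

solver = cp_model.CpSolver()
solver.parameters.max_time_in_seconds = 60
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    for r in range(R):
        groups = [[p for p in range(P) if solver.Value(x[p, r, g]) == 1] for g in range(G)]
        print(f"Round {r + 1}: {groups}")
else:
    print("No schedule found within the time limit")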

Small point solutions that solve a small but non-trivial problem like this might be fairly quick to develop and deploy on the web. But it doesn’t take much feature creep before you get a pile of extra “features” (particular requirements for some players, minimising the number of repeated pairings, right through to printing out score cards etc); before you know it (or more precisely, after months or years of hard coding) you’d have a full-blown Golf Tournament Scheduler. Such a web app might sell for much more, but would probably attract far fewer customers. And what happened to the poor casual golfer or golf tournament organiser on a shoestring budget who just wanted to solve his or her original golf player assignment problem?

In the spirit of acknowledging that the future is impossible to predict, I think Biarri must, particularly at our fledgling stage, favour a wider range of lightweight “point solutions”. More mini-apps with a wider potential customer base will allow us to gauge which itches need the most scratching; more complex apps, as every seasoned developer knows, always seem to cause issues and problems – in short, sheer complexity – quite out of scale with the larger code line count; not to mention being harder for users to use and understand (more buttons!).

Those who have test-driven our Workbench solution will also know that, to some extent, we’re trying to have our cake and eat it too, by allowing these smaller “point” solutions to exist as workflows (standalone web apps) in their own right, whilst also being “nestable” – that is, able to be composed into a larger, more powerful workflow. Look out for Geocoding as a sub-workflow inside Travel Time Calculation, coming to the Workbench very soon. And who knows if the Biarri Golf Tournament Organiser will ever eventuate!

The Trajectory of a Vehicle Routing

I recently wanted to view an animation of a vehicle routing optimisation algorithm I have been working on.

The algorithm uses a two-phase optimisation approach. The first phase is a construction heuristic. The second phase is a guided local search metaheuristic. The algorithm’s primary focus is to produce a visually attractive solution. Visually attractive solutions are usually synonymous with compact, non-overlapping routes with little or no intra-route cross over.

A colleague suggested I have a look at pygame. I hunted around to try and find an existing script that did something similar to what I wanted and found the fracture simulator. It is always easier to modify something than to start from scratch!

I was pleasantly surprised at how easy it turned out to be. I am very much a Python novice, and had never heard of pygame before, but have been programming for a number of years. I was able to get the basics of my animation up and running in one evening! This far exceeded my expectations!

The python script can be viewed here

The input file used to create the uploaded animation can be viewed here
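For readers who have not used pygame before, here is a minimal, self-contained sketch (not the script linked above) that replays a sequence of route “snapshots”, drawing each route as a polyline; the coordinates are made up purely for illustration:

import pygame

# Each snapshot is a list of routes; each route is a list of (x, y) screen coordinates.
snapshots = [
    [[(100, 100), (200, 150), (300, 120)], [(100, 400), (220, 380), (340, 420)]],
    [[(100, 100), (300, 120), (200, 150)], [(100, 400), (340, 420), (220, 380)]],
]
COLOURS = [(200, 60, 60), (60, 120, 200), (60, 180, 90)]

pygame.init()
screen = pygame.display.set_mode((640, 480))
clock = pygame.time.Clock()
frame, running = 0, True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    screen.fill((255, 255, 255))
    for i, route in enumerate(snapshots[frame % len(snapshots)]):
        pygame.draw.lines(screen, COLOURS[i % len(COLOURS)], False, route, 2)   # the route polyline
        for point in route:
            pygame.draw.circle(screen, (0, 0, 0), point, 4)                     # the customer points
    pygame.display.flip()
    frame += 1
    clock.tick(2)          # advance two snapshots per second
pygame.quit()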

Designing Distribution Networks for Utilities

Many utilities face the problem of designing distribution networks.  These networks need to reach all (or nearly all) households to distribute (or collect) the commodity supplied by the utility.  This may be electricity, gas, water, data (cable or optical fibre) or sewerage.

Typically these distribution networks are built as a subset of some underlying network.  Most commonly this is the street network.  The reason for this is obvious – the street network connects all households and businesses – which are the final demand points of utilities.  Some networks may piggyback on existing networks.  For example an optical fibre rollout may be done using existing power poles.

From a mathematical point of view we can consider the underlying points of demand (households or small collections of households) as nodes in a graph.  The potential connections between these nodes are arcs.  We refer to this network of potential connections as the underlying network.  If the distribution network can be represented as a tree (and this is often the case) then the problem of determining the best distribution network is similar to the minimum spanning tree problem solved on the underlying network (http://en.wikipedia.org/wiki/Minimum_spanning_tree).  This problem is straightforward to solve with an algorithm that is relatively easy to understand.  The main information required is the cost of the candidate arcs.  For example, it may be relatively cheap to string new cable between existing poles (for electricity or data networks), but much more expensive to dig a trench to install cable underground in an area with no overhead poles.
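For example, Kruskal’s algorithm solves the classical (uncapacitated) minimum spanning tree problem in a few lines; the toy street network and costs below are purely illustrative:

def minimum_spanning_tree(nodes, arcs):
    """Kruskal's algorithm: arcs is a list of (cost, u, v) tuples."""
    parent = {n: n for n in nodes}

    def find(n):                      # union-find with path compression
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for cost, u, v in sorted(arcs):   # consider arcs in order of increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding this arc does not create a cycle
            parent[ru] = rv
            tree.append((u, v, cost))
    return tree

# Toy network: cost might be e.g. trenching or cable-stringing cost per candidate arc.
arcs = [(4, "A", "B"), (1, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
print(minimum_spanning_tree(["A", "B", "C", "D"], arcs))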

The problem faced by a utility will however have many practical considerations that cannot be handled by the classical minimum spanning tree algorithm.  These will include:

  • The problem is really one of building several spanning trees, each of which can handle a maximum demand.  This arises because of underlying technical restrictions such as transformer capacity; the need to have pumping stations for water, gas and sewerage; the maximum number of optical fibre connections to a hub; etc.
  • The cost of arcs in the tree depends on the downstream demand.  This arises because the pipes or cables need to get bigger as the tree collects more downstream demand.
  • Splitting nodes in the tree may be expensive.  This can occur because of splicing costs (in optical fibre networks) or the need to distribute pressure evenly.
  • The spanning trees will themselves need to be connected back to the “mains” network.  Indeed the problem of designing the mains network may need to be considered as part of the overall problem, though typically most of the cost will be in the feeder network due to the sheer number of connections required.
  • The maximum network length from the mains to a node is limited due to pressure/voltage losses.

Recently Biarri has undertaken a Proof of Concept project with a utility to look at designing a distribution network.  We have formulated and solved an optimisation model that gives solutions guaranteed to be within a very small margin of the optimal solution.  On a sample network covering around 2000 households we were able to achieve:

  • An optimal solution in two minutes on standard PCs
  • 6% reduction in equipment costs
  • Huge reduction in planning effort – it took many man weeks to design the same area by hand

The speed of the solution process frees up the network designer to consider higher level issues that are not part of the optimisation model, such as the possible saving from using different technical approaches.  These choices can be modelled as a series of “What if…?” scenarios.

We plan to extend the work done so far to cover even more of the practical issues faced by utilities.  A technical paper covering the work will be prepared in 2010.

Accessible Engines: Biarri Inside

From the start Biarri has been committed to providing easy access to powerful optimisation.  Our focus at inception was twofold:

  • Custom optimisation models delivered through consultancy projects; and
  • Task-specific engines made available on a Software as a Service (SaaS) basis via the “Biarri WorkBench” (WB).

The strong demand for our analytical consulting services continues to surprise us, and the WB is now in Beta with a launch expected later in the year.

However, a new delivery channel has emerged to provide even greater access to our optimisation capability.  Over the last few months we have been developing optimisation formulations and engines that other companies are embedding in their own software.  This ‘Biarri Inside’ approach allows companies to quickly and affordably add optimisation capability to an existing application and leverage Biarri’s expertise in formulating and coding optimisation engines.

We are responding to the demand for the ‘Biarri Inside’ approach by ensuring the optimisation engines that we develop have a well-documented and well-designed API so that each can be easily incorporated into existing solutions.  We are now involved in a number of projects that leverage this delivery channel for accessible optimisation.

Michael Forbes at the SMART Research Forum

Michael Forbes, Biarri Optimisation Expert, presented on “An Integrated Approach to Supply Chain Redesign: Team, Process and Tools” at the SMART Research Forum in Sydney in June. He examined how rapid and effective supply chain redesign requires a team representing all stakeholders; a process that identifies, quantifies and implements practical savings; and tools that produce accurate estimates of the potential gains together with realistic plans. In his paper he presented an approach distilled from many years of international consulting on major supply chain redesign projects. Keys to this approach are the wide range of roles represented in the project team and the appropriate application of well-chosen optimisation tools. A case study was presented in which significant savings were produced very quickly.