Agribusiness Optimisation Solutions

Maths and Machine Learning for Agribusinesses

Mathematics powered by computers is changing the world we live in. At Biarri we see this everywhere, across every industry, and I’m sure you do too. Recently we have delivered a number of Machine Learning and Mathematical Optimisation solutions for Agriculture businesses in Australia and were fortunate enough to be invited to speak at the recent Case IH agri-business conference in Mackay.

Ash Nelson, Biarri’s co-founder, presented on Maths and Predictive Analytics for better business decisions. He described how our everyday lives are being changed by corporations leveraging large data sets, advanced statistical analysis and powerful computing resources. Ash then outlined how this same set of technologies can be utilised to improve business decisions in agriculture. This includes optimising agricultural supply chains and port operations, reducing unplanned equipment failures with intelligent predictive maintenance algorithms, and improving health and safety outcomes for farm workers by better identifying areas of best practice to inform injury prevention initiatives.

Are you interested in leveraging your data using advanced maths to make better business decisions? Don’t hesitate to get in touch with our friendly team.

Biarri and SaaS

SaaS deployments are now ‘mission critical’

Gartner recently published a survey finding that SaaS deployments are now ‘mission critical.’ Among the key reasons behind this statement, respondents cited cost savings, an increase in innovation and better accessibility to their systems as the key drivers for the move away from local software solutions.

Joanne Correia, Gartner Research Vice President, said:

“The most commonly cited reasons the survey found for deploying SaaS were for development and testing production/mission-critical workloads,” and went on to say “This is an affirmation that more businesses are comfortable with cloud deployments beyond the front office running salesforce automation (SFA) and email.”

This shows that companies are becoming more aware of, and switched on to, the benefits that cloud based software can bring to their business.

It was also demonstrated that, on top of cost savings, accessibility and innovation, SaaS based systems allow for easier training and a gentler learning curve for employees.

“Non-IT professionals often view the cloud strictly as a tool that they can use to reduce their operating costs” – and, in turn, effort.

Biarri empowering you through the cloud

Biarri was established in 2009 with the mission to provide accessible business optimisation to all clients regardless of size or budget. We develop bespoke SaaS based solutions for you, with you, allowing your solutions to meet your specific requirements.

We have developed a range of applications for our clients to suit their specific Advanced Planning and Scheduling, Workforce Management, Business Analytics and Supply Chain needs.

Get in touch and see how you can benefit from our solutions today!

 

What Google Does Right

I’ve appreciated Google’s mission and its modus operandi for a long time now.  I’ve avidly read Planet Google, many Wired articles, and a number of blogs and other pieces about the company.  But what I want to address here is how Google provides a great user experience, what enables it as a company to follow the path it does, and what smaller companies can learn from it.
Keeping it Simple
It’s easy to state but hard to do right, and often requires deep design to accomplish, but it’s one thing that Google does extremely well: it keeps its interfaces simple.  This ability is, for sure, enabled and exploited by the very nature of the company: it’s a web-based company through and through.  That means it can radically simplify so many things that mass consumer computing users find so hard: a big example being navigating and using a file-based storage system.  Instead, of course, everything is stored by Google (in its “cloud”, if you like) – and this simply obviates the need for a Save button, a Load Button, and all that junk.  Nor do you need any IT infrastructure to use most of Google’s products (email, for example).  By saving your documents or writing automatically – quietly and regularly, the way it should be done – the user never needs to even think about the where or how of storing data.  Except, that is, if you need to categorise – but here again Google makes sure that its core capacity – that of Search – is always front and center and powerful enough to find whatever you need.  A user experience should aim to empower the user, not baffle or frustrate them, and in this regard Google generally succeeds admirably.
The Power of Free
By providing many of its products free to the mass consumer market, Google owes its audience nothing.  This gives it free rein to change and improve (in short, to innovate).  By having lots of small but focused products, it can bring on or cull away products quickly (generally at the lightning-quick speed of the web world, and impressively fast for such a big company).  Here again Google understands right in its DNA both freemium and the web’s “Everything, free” tendencies.  Google is also very good at knowing what to keep hidden – its apps are great at hiding functionality that is less relevant to day-to-day usage from the user (they’re often there, but you have to dig a little to find them).
But Will It Scale?
Google as a company has shown an almost terrifying ability to grow, but to grow without collapsing under its own weight.  One way that they do this is by – in the main – using low-cost, easily available hardware (which has financial benefits as well as intangible benefits), even in huge data centers; a Commodity Computing approach (they even store their servers in shipping containers).  Development has an open feel to it, and is often open sourced or provides public platforms and APIs; Google Labs and techniques that expose Beta versions show Google developing software often in public view – compare to the secrecy that often surrounds Apple development.  Product support is often scaled by using open forums where members of the public help each other.  Internally there is an almost astounding lack of management hierarchies.  In fact one could conjecture that Google is probably not really a big company as such, but a network of highly connected small companies that share common DNA and some common base technologies (often through open sourcing or open standards).  The shelter of the larger entity (not to mention its profitability) gives it the ability to take risks – if one of the smaller companies/products fails, it can be easily absorbed.
Of course, Google’s flagship Web Search also scales (it has to, to have any chance of covering billions of web pages).  But interestingly, it seems to me that the success of Google’s PageRank algorithm – the core of its Web Search function – is largely because at heart the algorithm combines both the human and the machine in a very effective way – the human aspect is the importance of a page due to linking (a result of human activity) along with a series of quantifications (the rank).  But we are now also starting to see meta-data aware algorithms that are getting nearer to natural speech, for example the Wolfram Alpha service (a so-called “computational engine”).  You can be sure that if Google truly cracks the problem of natural language search (which may or may not be equivalent to – perhaps a very dumb – AI), it will change the world (again!).  Indeed Google’s founders have stated that Google’s aim is to develop an Artificial Intelligence by way of Search, and there have been some startling successes: Google’s language translation service is apparently very good, and has resulted from a statistical approach enabled by massive data sets.
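To make the linking idea concrete, here is a toy sketch of the kind of power iteration at the heart of PageRank. This is the textbook simplification (uniform damping, no dangling-node handling), not Google’s production algorithm:

```python
def pagerank(links, damping=0.85, iterations=50):
    # links maps each page to the list of pages it links to.
    # Start from a uniform rank, then repeatedly let every page
    # distribute its (damped) rank equally among its outgoing links.
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            for target in outgoing:
                new[target] += damping * rank[page] / len(outgoing)
        rank = new
    return rank

# "c" is linked to by both other pages, so it ends up ranked highest.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

The human aspect lives in the `links` structure (people choosing what to link to); the machine aspect is the iteration that turns those choices into a single number per page.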
What Type of Company is Google, Anyway?
Google might be construed as an “information” company – after all, it wants to “provide access to all the world’s data”.  But there’s an important distinction to be made here – Google only cares about data insomuch as it is useful to someone (typically, consumers or businesses) – it does not care about information per se.  (That’s not to say Google will lose your data!).  The point is that Google is above all a technology company – it is enabling and automating the use of technology, predominantly software, but increasingly also hardware, to solve all sorts of engineering problems, and lots of data just happen to be the input.  Storing millions of search results, using millions of documents in different languages in order to automate translation, and many other examples support this view.  Google as a company is a master engineering problem solver, including solving some of its own internal problems.  Many of its products are happy accidents, or the results of its famous “20% time”, where its employees are given one day a week to pursue their own interests.  Google is like a giant R&D lab that also happens to be a corporation.  It also places huge importance on hiring the right people (smart ones), because it knows that great solutions come from clever minds – in fact CEO Larry Page personally signs off on every new hire.
What We Can Learn
The humble web start-up right through to the big unwieldy enterprise can learn much from Google’s approach, particularly if your products or services are targeting the mass consumer or massive business arenas:
  • Don’t discount the ability of technology to be a game changer.  Google has disrupted many industries.
  • User experience matters.  Strip away everything but the essentials to get the job done.  What’s left should work well.
  • Make sure your core product is healthy and pursue improvement and innovation as aggressively as you can.
  • Keep your technology and processes as open as possible.  Closed solutions harm innovation and the sharing that helps problem solving.
  • Scale through technology – automate as much as possible.
  • Give some of your product(s) away for free.

Further Reading: “Google Thinks Small” , Google’s “Ten Things We Know To Be True”, “How Google Works”

Cross Platform Development By David

I was involved in the development of a ‘simple’ application in C++ on Windows and wanted to get it working on multiple versions of Linux as well. By ‘simple’, I mean there is no Windows GUI or links to other complicated third-party libraries, so a lot of the C++ should just port straight over to Linux. Below are a few tips/lessons learned while I went about this task.

VirtualBox and Code Repository

I was working on a Windows box and wanted to port to Ubuntu (a Debian flavour of Linux) and Oracle Enterprise Linux (a Red Hat flavour). Virtualisation technology is definitely your friend here. As a result, on my Windows box I had virtual machines running Ubuntu and Oracle Enterprise Linux.

As we have a number of developers in our organisation working on a number of projects, with libraries shared across several applications, it is logical that we have a code repository. As I spend most of my time in a Windows environment and am more familiar with it, I prefer to do most of my editing in Windows (the application was originally written on Windows, and most of our applications don’t need to work on other platforms).

I could have chosen to set up the code repository (in our case Mercurial) on each of the Linux VMs. However, this would have been more time consuming, and I didn’t want to have to push code to the main shared repository and then pull onto the Linux VMs every time I wanted to test changes on other platforms, especially when the other platforms are all on the same computer. Instead, I set up shared folders in the VMs and pointed them at the copy of the repo that I had in Windows. Now I could easily make changes in Windows and build and test them in all the environments.

Building and Testing

A couple of things to note: while Windows generally builds and runs quite happily without impacting what goes on in the Linux world, the two Linux worlds do impact each other, because they both use the same makefile and clean/build binaries with the same names. I found that I had to be careful when building in the different Linux environments that the clean was done completely before the build started. The clean script, when run in Ubuntu, did not always clean/remove the binaries created in the Oracle Enterprise Linux environment. If the binaries didn’t get cleaned properly I got build errors (e.g. /usr/bin/ld: cannot open output file executable_filename: Operation not permitted) or a segmentation fault at run time.

Different versions of g++ in Ubuntu (4.5.2) and Oracle Enterprise Linux (4.1.2) also meant that there were different compilation issues to deal with, but overall these were not too difficult to work through. Some of the issues revolved around simple compilation problems (e.g. g++ 4.1.2 was stricter about linking to libraries whose names started with the letters ‘lib’; 4.5.2 didn’t seem to mind as much).

Another main set of issues related to the third-party libraries we used (e.g. curl, COIN Osi). The COIN library problems were overcome by simply ensuring the source code was downloaded, built and installed on the required Linux platforms (i.e. no changes were required to the source code itself). Curl behaved a bit differently (it didn’t work on Oracle Enterprise Linux), but that was because of a difference in the way different g++ versions treated the concatenation of strings. Once I made a small change to the code it worked fine.

The end result was that I managed to have the same source code build and run in 3 different environments: Windows, Ubuntu and Oracle Enterprise Linux.

js/css resource serving in python apps with Fanstatic

I’ve just been checking out Fanstatic, a resource publishing/static file serving solution for wsgi python apps. I’ve been contemplating something like this as our javascript and css dependencies are getting more complex. It would also be useful to have some form of automatic cache invalidation so users don’t have to do a special browser refresh when we update our applications.
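I haven’t wired this up yet, but the cache-invalidation idea itself is simple to sketch by hand: embed a hash of the file’s contents in its URL, so any change to the file yields a new URL and stale browser caches are bypassed automatically. (The function name here is illustrative, not part of Fanstatic’s API.)

```python
import hashlib

def versioned_url(filename, content):
    # Derive a short fingerprint from the file's contents; when the
    # contents change, the fingerprint (and hence the URL) changes,
    # so browsers can cache each version forever without going stale.
    digest = hashlib.sha1(content).hexdigest()[:8]
    return "/static/%s/%s" % (digest, filename)

old = versioned_url("app.js", b"alert('v1');")
new = versioned_url("app.js", b"alert('v2');")
```

Fanstatic automates this kind of versioned URL generation (among other things), which is what saves users from needing a hard refresh after a deploy.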

It’s easy to set up with CherryPy:

import cherrypy
from fanstatic import Fanstatic

if __name__ == "__main__":
    # Root, setup_auth and setup_session_storage are defined elsewhere in the app
    app = cherrypy.Application(Root())
    app.wsgiapp.pipeline.append(('repoze.who', setup_auth))
    app.wsgiapp.pipeline.append(('beaker', setup_session_storage))
    app.wsgiapp.pipeline.append(('fanstatic', Fanstatic))
    cherrypy.quickstart(app, config='workbench.conf')

After that is done, you can jquery.need() in the widget/template that needs jquery, and similarly for our other dependencies. Has anyone else used Fanstatic? What other solutions are there for managing dependencies and serving js/css? Is there an easier and better solution? Wrapping new libraries for Fanstatic looks like a bit of effort but I haven’t explored it much yet.

Loki

Editing vector layers with Quantum GIS

I’ve been looking around for an open source alternative to the excellent Desktop GIS Mapping Application MapInfo Pro. I’ve installed and played with both Quantum GIS (QGIS) and MapWindow. The latter is a little bare bones and does not seem to include any geometry editing, so I’ve been focussing instead on QGIS.

QGIS is a great application, though it is quite noticeably slower to render very large mapping layers than MapInfo Pro. However – and this is, strangely, not very well known or documented – QGIS will only let you edit layers which are Shapefiles. There is no documentation that I can find which says why the “Toggle Editing” function is always disabled for other files – this is very confusing/frustrating if you don’t know about this Shapefile limitation.

There is a converter within QGIS (which is really just a UI on ogr2ogr) to convert between TAB and SHP formats, but Shapefile layers are rather limited because you can’t mix geometries within a layer – both nodes (points), and lines, for example. This is rather a problem because the map data I have has both points and lines – the points are needed to style the line endpoints and allow for data attributes on them. Separating points and lines into different layers in this context is a bit of a pain. Shapefiles also have various other limitations – for example, field names can be no longer than 10 characters.

Perhaps hooking up QGIS to an SQLite database with Spatialite extension might make managing the map data layers more streamlined – one for a rainy day…

Installing Cplex in Linux Ubuntu

I’ve been trying to install ILOG’s Cplex product on a 64 bit Ubuntu machine. There are a few small hiccups I encountered.

First, try to install as root with

./cplex_studio122.linux-x86.bin

On my machine I’m installing to /opt/ILOG/CPLEX_Studio122. If you get an error like “jre/bin/java: not found” then you need the “32 bit libs” package:

apt-get install ia32-libs

(You may also need to set the path with LD_LIBRARY_PATH=/usr/lib32). The 32 bit libraries seem to be required only for the installer (at least, they are not needed for programs that just link with the static Cplex libs).

After installing, you may get a build error (running as a non-root user) that it can’t find the cplex header files. Try ls /opt/ILOG/CPLEX_Studio122 and see if there are permission denied messages – the installation seems to screw up some permissions on this folder, however this is easily fixed with chmod +r /opt/ILOG/CPLEX_Studio122.

If you are using COIN-OR’s Osi class OsiCpxSolverInterface you will also need the following at the top of your OsiCpxSolverInterface.cpp file:

#include "/opt/ILOG/CPLEX_Studio122/cplex/include/ilcplex/cplex.h"

A typical Makefile snippet which includes Cplex and COIN/Osi might then look like:

my_objects = YourFile.o OsiCbcSolverInterface.o OsiCpxSolverInterface.o

CPPFLAGS = -fPIC -I/usr/include -I/usr/include/coin -DNDEBUG -I/opt/ILOG/CPLEX_Studio122/cplex/include/ilcplex

LIBFLAGS = -l:libCbc.so -l:libCbcSolver.so -l:libCoinUtils.so -l:libOsi.so -l:libOsiClp.so -l:libClp.so -l:/opt/ILOG/CPLEX_Studio122/cplex/lib/x86-64_sles10_4.1/static_pic/libcplex.a -l:/opt/ILOG/CPLEX_Studio122/cplex/lib/x86-64_sles10_4.1/static_pic/libilocplex.a

my_program : $(my_objects)
	g++ -Wall -fPIC -shared -o my_program $(my_objects)

.PHONY : clean

clean :
	rm *.o my_program

MapInfo tips

I’ve had reason to manipulate some spatial data recently using MapInfo Professional, and picked up a few tips that I thought might be useful to someone else one day:

  • You can use SQL conditions that select objects by their spatial properties, with syntax like Str$(obj) = "line", Str$(obj) = "point", Str$(obj) = "region" etc.
  • To change the projection, use File / Save Copy As and click the Projection button.
  • Get used to using the arrow keys to pan around the map, and the mousewheel to zoom in and out – this saves having to switch to the Grabber tool all the time to move around.
  • Press “S” to go into “Snap” mode. This is very useful when creating lines, to make sure they attach to nodes (note the nodes don’t have to be in the same layer when your lines snap). When in Snap mode you will see the text “SNAP” in the status bar at the bottom of the MapInfo main window.
  • The menu item Map / Change View is useful to go straight to a given lat/long coordinate. (It centres the view on your entered coordinate). You will need to be careful however if you have multiple layers open that have different coordinates or projections.
  • To merge two different layers:

* First, the tables have to have the same structure. Use Table / Maintenance / Table Structure to add and remove fields as required.
* Then use either Table / Update Column or Table / Append Rows To Table. If using Update Column on node/arc layers (to copy data from the node or arc objects of one table to the corresponding objects on another), I found using Join with Intersects worked best.

  • To separate a single layer that contains nodes and arcs into two layers:

* Run an SQL Select query with the condition Str$(obj) = "line" – this will select only the arcs.
* Then Save the selection to a separate file (which will be the arcs layer), and delete the selected arcs from your original layer (leaving only the nodes).

Cross-platform development

During the course of developing Biarri’s flagship Workbench product, we’ve taken pains to ensure that our (GUI-less) optimisation “engines” work well under both Windows and Linux operating systems (so-called cross-platform). This turns out to be relatively easy as long as you stay away from the big OS-specific frameworks (e.g. Microsoft’s MFC/COM/ATL etc). We’ve picked up some handy tips along the way, particularly applicable to C++ development, which are worth sharing here.

  • Be aware of differences in line endings – Windows uses carriage return and line feed \r\n, while Linux/Unix uses just line feed \n. (Note that Visual Studio will show files with Linux line feeds correctly, but Notepad won’t – this is one way to tell what line endings your file has in Windows). This can be particularly important when importing data e.g. into databases where the file originates from another OS.
  • Always use forward slashes for file paths, not backslashes. Also, file names and folder paths are case sensitive under Linux but not under Windows. And don’t assume there is a C: or D: drive!
  • You may have to be careful writing to temporary files and folders. In Linux /tmp is often used; in Windows it is /[user]/AppData/local/temp (the location of the TEMP environment variable – e.g. type "%TEMP%" into the start menu or Windows Explorer). For Linux, it is sometimes necessary to manipulate a folder’s “sticky bit” to ensure that the folder is accessible by other users (e.g. a Postgres database user) – e.g. in Python:
os.chmod(temp_dir_name, os.stat(temp_dir_name).st_mode | stat.S_ISVTX | stat.S_IRGRP | stat.S_IROTH | stat.S_IWGRP | stat.S_IXOTH)
  • Be aware of the differences in file permissions between Windows and Linux. In Linux files have an “executable” bit: chmod a+x [file] makes a file executable, which can then be run with ./filename.
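Normalising the line endings mentioned in the first tip is a one-liner; for example, in Python (a sketch, operating on bytes so it works on files read in binary mode):

```python
def to_unix_line_endings(data):
    # Convert Windows CRLF (\r\n) and bare CR (\r) line endings to
    # Unix LF (\n); handy before importing Windows-authored data
    # files into a database on Linux.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

converted = to_unix_line_endings(b"first\r\nsecond\r\nthird\n")
```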

For C++ development:

  • Name all cpp and h files in lower case if possible. File names are case sensitive in Linux – and that includes #include directives!
  • For compiling with GCC under Linux, source files should end with a newline (older versions of GCC complain about a missing newline at the end of file).
  • In Linux C++ programs, general exception handling with catch(…) will not catch hardware faults such as segmentation violations. You can install signal handlers instead, though it’s not as good – it is more equivalent to an exit(), with a chance to clean up.
  • Beware doubles comparisons and inequality checking, at least in C++ programs. A == B may not hold on both Windows and Linux even when A and B are essentially the same number, so always compare with a tolerance, e.g. fabs(A - B) < epsilon.
  • Build tips for Linux: type "make" when you are in the project directory. This will search for a file called "Makefile" and run it. (Use "make -f filename" to make from a different makefile). To force a recompile you can "touch" a file using "touch filename".
    To clean out all object files type "make clean" (as long as your makefile defines what cleaning does…). Use "make -j4" to run make with four concurrent jobs, to take advantage of multiple cores.
  • In bash, to get a recursive line count of .cpp/.h files: find [directory] -type f \( -name '*.cpp' -o -name '*.h' \) -exec wc -l {} \; | awk '{total += $1} END {print total}'
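The doubles tip above applies in any language; a quick illustration of why exact equality fails (shown here in Python, but the fabs(A - B) < epsilon pattern is the same in C++):

```python
a = 0.1 + 0.2
b = 0.3
# Mathematically equal, but not bit-for-bit equal in binary floating point,
# and the exact bits can differ between compilers and platforms.
exactly_equal = (a == b)
# Compare with a tolerance instead, as the tip above recommends.
close_enough = abs(a - b) < 1e-9
```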

The Launch of Biarri’s WorkBench

With the impending launch of Biarri’s workbench and our ongoing close relationship with Schweppes for the daily routing of soft drink deliveries (an application of perhaps the most well known operations research problem: the vehicle routing problem), I thought that the following excerpt from a journal article submitted to the Asia Pacific Journal of Operations Research would be a very timely blog post.

The journal article is entitled “Real-Life Vehicle Routing with Time Windows for Visual Attractiveness and Operational Robustness” and it describes the vehicle routing algorithm we have implemented for Schweppes.

The excerpt details a specific example encompassing two things we are very passionate about at Biarri. First “Commercial Mathematics” – that is making OR (well not strictly just OR) work in the real world. And second, the revolutionary capabilities that the advent of cloud computing has for the delivery of software.

“Vehicle routing problems manifest in a remarkably wide range of commercial and non-commercial enterprises. From: industrial waste collection to grocery delivery; underground mining crew replenishment to postal and courier collection and delivery; inbound manufacturing component transportation to finished car distribution; in-home primary health care delivery to pathology specimen clearances from surgeries for analysis; and from coal seam gas field equipment maintenance to beverage distribution, to name but a few.

Automated planning systems used by industry at present are predominantly client-server or desktop based applications. Such systems are often: expensive, requiring a large upfront capital investment; accompanied by a large software deployment project requiring initial and ongoing IT department cooperation; customisable to a particular organisation’s requirements, however commonly retain a large amount of exposed functionality due to the breadth of the existing client base; and require substantial user training as the workflow is usually not restricted in a linear fashion …. Each of these characteristics constitutes a barrier to adoption of automated planning systems, and for most small to medium enterprises these barriers prove insurmountable.

With the advent of cloud computing and software as a service (SaaS) these barriers are being removed. SaaS: embodies a different commercial model; has essentially no IT footprint; mandates (as vendors may never directly interact with potential clients) simple intuitive linear workflows; and involves almost no end user training beyond perhaps an optional demonstration video.

The emergence of this new avenue for the delivery of optimisation based planning systems heralds a heretofore unparalleled opportunity for operations research practitioners to engage with a wider potential consumer base than ever before. However, the nature of the delivery mechanism requires the algorithms developed: to be robust and flexible (within their domain of application they must be capable of dealing with a wide range of input data); to have very short run times (the user base is more likely to be under time pressure than ever before); to produce high quality solutions (noting the inherent trade off between run time and solution quality); to be wrapped in a simple linear workflow (meaning it is always obvious what the next step in the planning process is); but above all, be able to produce real-life, practically implementable solutions, without the need for user training and/or experience.

For pure delivery, or pure pick up vehicle routing applications, real-life, practically implementable solutions are often synonymous with geographically compact, non-overlapping routes with little or no intra-route cross over. There are numerous reasons why such solutions are preferred …. If a customer cannot be serviced at the preferred time (e.g. the vehicle cannot get access, the customer is closed, another delivery is taking place, the customer is too busy), because the route stays in the same geographical area, it is easy to return to the customer at a later time. During busy traffic periods drivers are loathe to exit and re-enter a motorway to service individual customers. Even though such customers may be en route to the bulk of the customers the route services, thus incurring a minimum of additional kilometres, they may nevertheless be far from the majority of the customers the route services. If there is severe traffic disruption, it is easier to use local alternate routes between customers in a route that is geographically compact to ensure that pick-ups or deliveries can still be made. Third party transport providers, which prefer routes to be as simple as possible, may exert some influence over the planning process. Finally … it is easier to maintain customer relationships by assigning drivers to routes that routinely service a similar geographical area. In summary, solutions which are more visually attractive are more robust, and thus more likely to actually deliver the full extent of the cost savings that should flow from the use of automated planning systems.

This paper describes an algorithm for the vehicle routing problem with time windows, …. The algorithm is: robust and flexible; fast; wrapped in a user interface utilising a simple linear workflow and so requires no user training or experience; and produces high quality, visually attractive and practically implementable solutions.”
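The geographically compact routes the excerpt describes are exactly what classic route-construction heuristics aim for. As a purely illustrative sketch (this is the textbook “sweep” heuristic, not the algorithm from the paper), one can order customers by their angle around the depot and cut the sequence whenever vehicle capacity is reached:

```python
import math

def sweep_routes(depot, customers, capacity):
    # Classic "sweep" construction: order customers by polar angle
    # around the depot, then cut the ordered list into routes whenever
    # the vehicle capacity would be exceeded.  Routes built this way
    # cover contiguous angular sectors, so they tend to be compact
    # and non-overlapping (i.e. visually attractive).
    def angle(c):
        return math.atan2(c[1] - depot[1], c[0] - depot[0])
    routes, current, load = [], [], 0
    for cust, demand in sorted(customers, key=lambda cd: angle(cd[0])):
        if load + demand > capacity and current:
            routes.append(current)
            current, load = [], 0
        current.append(cust)
        load += demand
    if current:
        routes.append(current)
    return routes

# Four unit-demand customers, one in each quadrant, vehicles of capacity 2:
depot = (0, 0)
customers = [((1, 1), 1), ((-1, 1), 1), ((-1, -1), 1), ((1, -1), 1)]
routes = sweep_routes(depot, customers, capacity=2)
```

Real vehicle routing with time windows needs far more than this, of course, but the sketch shows why angular compactness and capacity cuts naturally produce the non-overlapping sectors the excerpt argues for.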