Building with .NET Core and Docker in TeamCity on AWS

By Dr Philip Kendall, Lead Analyst, Control F1.

Building with .NET Core and Docker in TeamCity on AWS

At Control F1, we’re always evaluating the latest technologies to see if and how they’ll fit with our clients needs. One of our core strengths is in .NET development, so we’ve recently been looking at the newly released Visual Studio 2017, along with .NET Core 1.1 and combining this with our ongoing use of Docker to create microservices. We always like all our projects to have continuous integration to ensure a consistent and repeatable build process – in our case, we use a TeamCity instance running in AWS for this. However, actually getting everything to build in TeamCity wasn’t quite as easy as we would have hoped due to a few minor niggles, so I’ve put together this blog post to capture everything that we needed to do.

To read the rest of this blog, please click here

Labels, Camera, Action…!

By Control F1 Lead Architect Phil Kendall.

Control F1 were asked earlier this year to work with a global pharma company to write the control software for a complex piece of physical hardware. Integrating all the moving pieces had proved a challenge, so our client needed a company with extensive experience in developing complex pieces of Windows software. From the specification supplied by our client, we quickly identified that there were going to be two main challenges in this project:

  • Integrating with the hardware in the project: four barcode-reading cameras from Cognex, and a Siemens S7 PLC, for which the control software (and physical machine) was supplied by HERMA.
  • Being able to develop and test the software. There was only one instance of the HERMA machine, and that was already installed on the client’s site (and it’s too big for our office anyway!); similarly we weren’t going to have enough cameras to let everybody working on the project have a full set of cameras.

Integrating with the hardware

Interfacing with the Cognex cameras themselves is relatively easy, as Cognex supply a full .NET SDK and set of device drivers to perform the “grunt work” of communicating with the cameras. However, the SDK is still relatively low-level: it lets you perform just about anything with the cameras, but obviously doesn’t have any business domain specific functions. On a technical note, the SDK is also a little bit “old school” and doesn’t make use of the latest and greatest .NET features – a decision which is completely understandable from a Cognex point of view who need their SDK to be useable by as many consumers as possible, but does mean that the SDK doesn’t quite fit neatly into a modern .NET application.

To work around both these issues, we developed a wrapper around the Cognex SDK that both encapsulates the low-level functionality in the Cognex SDK into the higher level business functionality that we needed for the project, and also presents a more modern .NET style interface, for example using lambda functions rather than delegates. The library has very much been designed to be a generic wrapper for the Cognex SDK so that we can re-use it in any future projects which use the Cognex cameras.

For the Siemens S7, we did a small amount of research and found the S7.Net Plus library. Once again, this enables low-level communications with the S7 PLC so we wrapped it in a higher level interface which implemented the business logic that HERMA had built on top of the S7 PLC.

Both libraries were tested when we had access to the hardware, the Cognex library by actually having a camera here at Control F1 HQ, and the HERMA library with assistance from HERMA who were able to set up a copy of their software at their site and give us remote access.

Developing and testing

As noted above, our big challenge here was how to develop and test the software without everybody having access to cameras and the HERMA machine. The trick here was simply to remove the requirement for everybody to have hardware: by developing a facade around the Cognex and HERMA libraries, we were able to make it so that we could use either the real interfaces to the hardware, or a emulator of each device which we developed. The emulators were configurable so that we could adjust their behaviour for various cases – for example, simulating a misread from one of the Cognex cameras, or a fault from the HERMA system.

The emulators were invaluable to us while developing the project: they allowed us to at one stage have three developers and a tester working on the project, and also to be able to have a demo VM which we could give to the client to let them test how the user interface was evolving, all without needing any hardware or for people to travel to anywhere – with the obvious savings of time and money all that brings.

So, did it all work?

Now, it’s all well and good developing against emulators, but emulators are no good if they don’t have the same behaviour as the real system. The moment of truth came when we sent our COO, Nick Payne, and Lead Architect/Developer, Phil Kendall, to the client’s site in order to put everything together on the real hardware… and the answer is that things worked pretty well. We’d be lying if we said everything worked perfectly first time, but the vast majority of the hardware was up and running within a day. The rest of the week was a pattern familiar to anyone who’s done integration testing before: mostly fairly quiet while we ran through the client’s thorough test plan (thanks Nick for his sterling work keeping everything running smoothly) interspersed with occasional moments of panic as the test plan revealed an issue (thanks Phil for some frantic and occasionally late-night hacking to fix the issues). By the end of the week, the client had signed the machine off to move into production, and Nick and Phil just about managed to get home at a reasonable time on Friday evening.

What did we learn?

From a Control F1 point of view, the most important knowledge we gleaned from this project was the work with did with the Cognex cameras and SDK – they’re some very nice pieces of kit, the SDK is a solid piece of code and we’ve now got the emulator framework we can use to accelerate development of any future projects using the Cognex cameras. Similarly, we’ve now got a way to interface with Siemens S7 PLCs which we can reuse for any future projects.

Other than that, the project reinforced a couple of good software engineering practices which we knew about:

  • Do the less understood bits of the project first to reduce risk. By focusing our initial development efforts on the hardware integration side, we were able to reduce the uncertainty in our planning – this in turn meant that we were able to confidently commit to the client’s timescales relatively early on the project.
  • Log everything. When you’re working with real hardware on a machine on a remote site, being able to get an accurate record of what happened when a problem occurred is invaluable. However, don’t log too much – if the camera is giving you a 30 line output, you don’t need to log the output as it passes through every level in the system as all you end up with then is a log file which is very hard to read.

Sound interesting?

If you’ve got a project which it sounds like we might be able to help you with, please drop us a line.

R, spray-can and Docker

Control F1 Lead Architect Phil Kendall gives some advice on performing R calculations in microservices.

Back in January this year, Control F1 started work as the lead member of the i-Motors consortium, a UK Government and industry funded* project working towards viable, commercially sustainable Smart Mobility applications for connected and autonomous vehicles. One of the key elements we will be delivering as part of the project is the capability to add predictive and contextual intelligence to connected vehicles, allowing all individual drivers, fleet managers and infrastructure providers to make better decisions about transport in the UK. At a coding level, this means we need to get some data science / machine learning / AI code written and deployed. This post gives a quick run through of the technology choices we made, why we made them and how we implemented it all.

Why R?

There are effectively two choices for doing “small scale” (i.e. fits into the memory on one machine) data science; R and Python (with scikit-learn). It just so happens that I’m much more an R guy than a Python guy, and the algorithms we wanted to deploy here were written in R.

Why Docker?

For i-Motors, we’ve gone down the microservices route for a lot of the common reasons, including the ability to independently improve the various components of our system without needing to do high risk “Big Bang” deployments where we have to change every critical part of the system at once. There are obviously alternatives to Docker for running microservices – while this post is Docker-specific, it shouldn’t be too hard to adapt what’s here to another container platform.

Why spray-can?

This is where it gets a bit more complicated! Excluding the definitely right out there on the cutting edge Docker for Windows Server 2016, running Docker means running Linux. At Control F1 we’re mostly a .NET house on the server side, so a number of the i-Motors components have been written in .NET Core and very happily deploy themselves on Docker. However, the .NET to R bridge hasn’t yet been ported to .NET Core, so there’s no simple way for a .NET Core application to talk to R at the moment. I investigated a couple of other options for bridging to R, including using node.js and the rstats package. Unfortunately, the official release of rstats doesn’t work with the latest versions of node, and while there are forks out there which fix the issue, basing a long-term project on a package without official releases didn’t seem like the wisest solution. The one option which did present itself was JRI, the Java/R Interface which I’d made some use of before when running on the JVM.

When it comes to JVM languages, I’m a big fan of Scala and the spray.io toolkit – again, the solution here isn’t particularly tied to Scala and spray.io and should be relatively easy to adapt to any other JVM language and/or web API framework.

Implementation

All the code for this blog post is available from Bitbucket. I’ll give a brief overview of the code here.

Startup

The web API is set up in RSprayCanDockerApp and RSprayCanDockerActor. This is pretty much a straight copy of the spray-can “Getting Started” app, with the notable exception that we bind the listener to 0.0.0.0 rather than localhost – this is important as the requests will be coming from an unknown source when deployed in Docker.

R integration

The guts of the R integration happens in the SynchronizedRengine class and its associated companion object. There are two non-trivial bits of behaviour here:

  • The guts of R are inherently a singleton object – there is one and only one underlying R engine per JVM. SynchronizedRengine.performCalculation() has a simple lock around the call into the R engine so that we have one and only one thread accessing the R engine.
  • The error handling is “a bit quirky”. If the R engine encounters an error, it calls the rWriteConsole() function in the RMainLoopCallbacks interface. The natural thing to do here would be to throw an exception, but unfortunately the native code between the Rengine.eval() call and the callback silently swallows the exception, so we can’t do that; instead we stash the exception away in a variable. If the evaluation failed (indicated by it returning null), we then retrieve the stashed away exception. In Scala, we wrap this into a Try object, but in a less functional language you could just re-throw the exception at this point.

Docker integration

The Docker integration is done via SBT Native Packager and is pretty vanilla; three things to note:

  • The Docker image is based on our “OpenJRE with R” image – this is the standard OpenJDK image but with R version 3.3 installed, and the JRI bridge library installed in /opt/lib. The minimal source for this image is also on Bitbucket.
  • We pass the relevant option to the JVM so that it can find the JRI bridge library: -Djava.library.path=/opt/lib
  • We set the appropriate environment variable so that the JRI bridge library can find R itself: R_HOME=/usr/lib/R

If you just want a play with the finished Docker container, it’s available from Docker Hub; just run it up as “docker -p 8080:8080 controlf1/r-spraycan-docker“.

Putting it altogether

For this demo, the actual maths I’m getting R to do is very simple: just adding two numbers. Obviously, we don’t need R to do that but in the real world you should be able to substitute your own algorithms easily – we’ve already deployed four separate machine learning algorithms into i-Motors based on this pattern. But as demos are always good:

$ curl http://localhost:8080/add/1.2/3.4

4.6

Where next?

What we’ll be working on in the near future is investigating how this solution scales with the load on the system – a single instance of the microservice will obviously be limited by the single-threaded nature of R, but we should be able to bring up multiple instances of the microservice (“scale out” rather than “scale up”) to handle the level of requests we expect i-Motors to produce. I’m not foreseeing any problems with this approach, but we’ll certainly be keeping an eye on the performance numbers of our “intelligence services” as we increase the number of vehicles in the system.

* i-Motors is jointly funded by government and industry. The government’s £100m Intelligent Mobility fund is administered by the Centre for Connected and Autonomous Vehicles (CCAV) and delivered by the UK’s innovation agency, Innovate UK.

A Sparkling View from the Canals

Control F1 sent Lead Developer Phil Kendall and Senior Developer Kevin Wood over to Amsterdam for the first European edition of Spark Summit. Here’s their summary of the conference.

One of the themes from Strata + Hadoop World in London earlier this year was the rise of Apache Spark as the new darling of the big data processing world. If anything, that trend has accelerated since May, but it has perhaps also moved in a slightly different direction as well – while the majority of the companies talking about Spark at Strata + Hadoop World were the innovative, disruptive small businesses, at Spark Summit there were a lot of big enterprises who were either building their big data infrastructure on Spark, or moving their infrastructure from “classical” Hadoop MapReduce to Spark. From a business point of view, that’s probably the headline for the conference, but here’s some more technical bits:

The Future of MapReduce

MapReduce is dead. It’s going to hang on for a few years yet due to the number of production deployments which exist, but I don’t think you would have been able to find anyone at the conference who was intending to use MapReduce for any of their new deployments. Of course, it should be remembered that this was the Spark Summit, so this won’t have been a representative sample, but when you’ve some of the biggest players in the big data space like Cloudera and Hortonworks joining in on the bandwagon, I certainly think this is the way that things are going.

In consequence, the Lambda Architecture is on its way out as well. Nobody ever really liked having to maintain two entirely separate systems for processing their data, but at the time there really wasn’t a better way. This is a movement which started to gain momentum with Jay Kreps’ “Questioning the Lambda Architecture” article last year, but as we now have an enterprise ready framework which can handle both the streaming and batch sides of the processing coin, it’s time to move on to something with less overhead, quite possibly Spark, MesosAkkaCassandra and Kafka, something which Helena Edelson implored us to do during her talk. Just hope your kids don’t go around saying “My mum/dad works with Smack”.

The Future of Languages for Spark

Scala is the language for interacting with Spark. While the conference was pretty much split down the middle between folks using Scala and folks using Python, how the Spark world is going was perhaps most obviously demonstrated by the spontaneous round of applause which Vincent Saulys got for his “Please use Scala!” comment during his keynote presentation. The theme here was very much that while there were people moving from Python to Scala, nobody was going the other way. On the other hand, the newcomer on the block here is SparkR, which has the potential to open up Spark to the large community of data scientists out there who already know R. The support in Spark 1.5 probably isn’t quite there yet to really open the door, but improvements are coming in Spark 1.6, and they’re definitely looking for feedback from the R community as to which features should be a priority, so it’s not going to be long before you’re going to see a lot of people using Spark and R.

The Future of Spark APIs

DataFrames are the future for Spark applications. Similarly to MapReduce, while nobody’s going to be killing off the low level way of working directly with resilient distributed datasets (RDDs), the DataFrames API (which is essentially equivalent to Spark SQL) is going to be where a lot of the new work gets done. The major initiative here at the moment is Project Tungsten, which gives a whole number of nice optimisations at the DataFrame level. Why is Spark moving this way? Because it’s easier to optimise when you’re higher up the stack – if you have a holistic view of what the programmer is attempting to accomplish, you can generally optimise that a lot better than if you’re looking at the individual atomic operations (the maps, sorts, reduces and whatever else of RDDs). SQL showed the value of introducing a declarative language for “little” data problems in the 1970s and 1980s; will DataFrames be that solution for big data? Given their position in all of R, Python (via Pandas) and Spark, I’d be willing to bet a small amount of money on “yes”.

On a related topic, if you’ve done any work with Spark, you’ve probably seen the Spark stack diagram by now. However, I thought Reynold Xin’s “different take” on the diagram during his keynote was really powerful – as a developer, this expresses what matters to me – the APIs I can use to interact with Spark. To a very real extent, I don’t care what’s happening under the hood: I just need to know that the mightily clever folks contributing to Spark are working their black magic in that “backend” box which makes everything run super-fast.

The Future of Spark at Control F1

I don’t think it will come as a suprise to anyone who has been following our blog that we’re big fans of Spark here at Control F1. Don’t be surprised if you see it in some of our work in the near future 🙂

Adventures in Spark on Elastic MapReduce 4

Lead Developer Phil Kendall on getting started with Spark on EMR.

In June, Spark, the up and coming big data processing framework, became a first class citizen on Amazon Elastic MapReduce (EMR). Last month, Amazon announced EMR release 4.0.0 which “brings many changes to the platform”. However, some of those changes lead to a couple of “gotchas” when trying to run Spark on EMR, so this post is a quick walk through the issues I found when getting started with Spark on EMR and (mostly!) solutions to those issues.

Running the demo

Jon Fritz‘s blog post announcing the availability of Spark on EMR contained a nice simple example of getting a Spark application up and running on EMR. Unfortunately, if you try and run through that demo on the EMR 4.0.0 release, then you get an error when trying to fetch the flightsample jar from S3:

Exception in thread "main" java.lang.RuntimeException: Local file does not exist.
	at com.amazon.elasticmapreduce.scriptrunner.ScriptRunner.fetchFile(ScriptRunner.java:30)

This one turns out to be not too hard to fix – the EMR 4.0.0 release has just moved the location of the hdfs utility so it’s now on the normal PATH rather than being installed in the hadoop user’s home directory. That can trivially be fixed by just removing the absolute path, but while we’re in the area, we can also upgrade to using the new command-runner rather than script-runner. Once you’ve done both those changes, the Custom JAR step should look like this:

HDFS

…and you can then happily run through the rest of the demo.

Spark Streaming on Elastic MapReduce

The next thing you might try is to get Spark Streaming running on EMR. On the face of it, this looks to be nice and easy – just push your jar containing the streaming application onto the cluster and away you go. And your application starts…. and then just sits there, steadfastly refusing to do anything at all. Experienced Spark Streaming folk will quite possibly recognise this as a symptom of the executors not having enough cores to run their workloads – each receiver you create occupies a core, so you need to ensure that there are enough cores in your cluster to run the receivers and to process the data. To some extent, you’d hope this isn’t a problem as the m3.xlarge instances that you get by default when creating an EMR cluster each have 4 cores, so there must be something else going on here.

The issue here turns out to be the default Spark configuration when running on YARN, which is what EMR uses for its cluster management – each executor is by default allocated only one core so your nice cluster with two 4 core machines in it was actually sitting there with three quarters of its processors doing nothing. Getting around this is what the “-x” option mentioned in Jon Fritz’s blog post did – it ensured that Spark used all the available resources on the cluster, but that setting isn’t available with EMR 4.0.0. The equivalent option for the new version is mentioned in the “Additional EMR Configuration Options for Spark” of the EMR 4.0.0 announcement: you need to set the “maximizeResourceAllocation” property. To do that, select “Go to advanced options” when creating the cluster, expand the “Edit software settings (optional)” section and then add in the appropriate configuration string: “classification=spark,properties=[maximizeResourceAllocation=true]“. This does unfortunately mean that the “quick options” for creating a cluster is pretty much useless when using Spark as you’re always going to want to be setting this option or a variant of it.

Getting to the Spark web UI

When you’re running a Spark application, you may well be used to using the Spark web UI to keep an eye on your job. However, getting to the web UI on an EMR cluster isn’t as easy as it might appear at first glance. You can happily point your web browser to http://<cluster master DNS address>:4040/ as usual, but that returns a redirect to http://ip-<numbers>.<region>.compute.internal:20888/proxy/application_<n>_<n>/ containing a reference to the internal DNS name of the machine which isn’t too helpful if you’re outside the VPC inside which the cluster is running. I haven’t found a perfect solution to this one yet, but you can just replace “ip-<numbers>.<region>.compute.internal” with the external DNS name of the master – so you’re pointing at something like http://<cluster master DNS address>:20888/proxy/application_<n>_<n>/ – and then you can happily browse around the web UI from there.

Onward and upward

With all that, I’ve pretty much got up and running with Spark on Elastic MapReduce 4. Now, it’s back to the actual Spark applications again…

How to use the MATLAB Compiler Runtime with AWS Elastic Beanstalk

One of Control F1’s current projects is working alongside the RAC on their RAC Advance platform, a revolutionary new technology that uses the latest diagnostic software to deliver an enhanced breakdown service for customers. 
As is so often the case when working with innovative new technologies, our team have uncovered a number of solutions and fixes that haven’t been documented in the past. And because we’re all about sharing, lead developer Phil Kendall explains here, (for the technically minded amongst you), some of the team’s learnings re: using MATLAB Compiler Runtime with AWS Elastic Beanstalk….

One of the components of the RAC Advance system (an ASP.NET web application) we’re working on makes use of MATLAB to perform some of the advanced calculations that make the platform a success.

In order to provide scalability and reliability, the component is deployed via AWS Elastic Beanstalk. When Control F1 began their work with the RAC, the component was hosted on a custom AMI which had to have the MATLAB Compiler Runtime manually installed, and then the AMI had to be maintained over time. One of the improvements we were hoping to make to the system was to reduce the number of manually maintained components in the system, so we began looking at whether it was possible to install the MATLAB Compiler Runtime automatically via Elastic Beanstalk’s configuration mechanism (.ebextensions).

To my slight surprise, this didn’t seem to be anything anyone had ever done before (or at least, had ever publicly documented how to do). Fortunately, the solution turned out to be not too complicated, although there are a couple of rough edges I’d like to smooth off:

  1. Download the appropriate version of the MATLAB Compiler Runtime from Mathworks’ website, and put this into an S3 bucket you control. You’ll need to make the file publicly readable.
  2. Create the following file and save it as “matlab.config” in a “.ebextensions” folder of your web application (note that the spacing is crucial here, and that’s it’s all spaces, not tabs):

sources:
  c:\\MatlabCompilerRuntime: https://s3-eu-west-1.amazonaws.com/your-bucket-name-goes-here/MCR_R2014b_win64_installer.exe
commands:
  01_install_matlab:
    command: setup.exe -agreeToLicense yes -mode silent
    cwd: c:\\MatlabCompilerRuntime\\bin\\win64
  02_modify_path:
    command: setx PATH "%PATH%;C:\Program Files\MATLAB\MATLAB Compiler Runtime\v82\runtime\win64"
  03_reset_iis:
    command: iisreset

(Note that the config files within the .ebextensions folder are run in alphabetical order so if you’ve already got other extensions in there, you may want to rename the file so that it’s run in the correct order).

To some extent, that’s all there is to it, but it’s probably worth an explanation as to how that’s working. Essentially, there are two main steps: the first, indicated by the “sources” stanza, downloads a ZIP file from the specified location (our S3 bucket) and expands it into the specified folder. While the MATLAB Compiler Runtime installer has an “.exe” extension, it’s actually a self-extracting ZIP file, and the Elastic Beanstalk functionality is perfectly happy to deal with this.

The second step is to actually run the installer – this is what is accomplished by the “01_install_matlab” stanza, which uses the silent install functionality of the installer. (If you’re using the 32-bit runtime, you’ll need to modify the path specified in the “cwd” line). Finally, we kick IIS to pick up a modified PATH which includes the native MATLAB DLLS (“03_reset_iis”).

While this solution works, as noted above there’s a couple of things I’d like to improve:

  1. Ideally, you wouldn’t have to make the file in the S3 bucket publicly readable. However, the “sources” functionality supports only publicly readable files at the moment, so there’s no easy way round this. It would be possible to install other components onto the box which would let you authenticate and download a protected file, but that seems like overkill. Hopefully Amazon will add authentication support for “sources” at some point.
  2. The observant will note I skipped over the “02_modify_path” stanza – what’s that for? As noted, when the MATLAB installer finishes, it modifies PATH to include the location of the native MATLAB DLLs. However, the installer runs as a background task, so the actual command returns instantly, and crucially before it has modified PATH. As far as I know, there’s no way of knowing when the installer has actually completed, so we bodge around this by manually adding what we know is going to be added to PATH, which means that IIS will be able to find the DLLs once they’ve been installed. This is obviously not the nicest solution in the world, but it works.

Hopefully this little guide helps anyone else who’s looking to do this sort of thing – please leave a comment if there’s anything you’d like to ask, or if you can help with those improvements!

Phil Kendall
Lead Developer