Mike's Page

Informatics, Development, Cycling, Data, Travel ...

Month: May 2018

Bumblebee May 2018 Update

First flight

Last week we flew the bumblebee tracker for the first time.

It was really useful having lots of volunteers helping out!

Everyone helped fill the balloon!

It was surprisingly easy to get the balloon in the air, but the slightest gust did cause quite a bit of movement.

Balloon in flight (taken by Mike Livingstone)

We successfully tracked the retroreflector from about 20m up. I need to make the software quicker.

Second flight

Today (28th May) we tested the bumblebee experiment again. It was almost the same as a test two weeks ago, but the reflector was slightly larger than before (about 0.9 cm²) and we used a filter. Again we tested the system by tracking a reflector attached to a string.


The ground and plants reflect relatively little UV and violet light, so we added a 390nm bandpass filter to the camera, which passes roughly 335nm [near UV] to 445nm [violet/deep-blue]. This filtered out much of the background, leaving the reflection of the camera flash from the retroreflector.

We tested the system by moving the reflector on the end of a black string along the length of the site, to see if the system could identify its location.

Demonstration of tracking reflector from balloon mounted system. Actual location marked with yellow circle. The identified location is marked with a white cross. The confidence in the identification is written in the title. Photos 5 seconds apart. Exposure: 2ms, Gain 30dB, blocksize/step/offset: 20/10/3.

We were able to track the fake bee successfully, probably from about 30m up. I think it’ll work considerably higher, but we’ve not tested it that high yet. 30m suddenly feels really high when you’re looking up at it!
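I haven't described the detection step here, so the following is only a rough sketch of the general idea in plain Python: slide a window over the filtered image (using the blocksize and step values from the caption above) and pick the pixel that stands out most from its window's median brightness. The function name, the scoring rule and the synthetic frame are assumptions for illustration, the `offset` parameter is not modelled, and this is not the tracker's actual algorithm.

```python
from statistics import median


def find_reflector(image, blocksize=20, step=10):
    """Return ((row, col), score) for the pixel that stands out most.

    image is a list of equal-length rows of brightness values. Each
    blocksize x blocksize window, moved `step` pixels at a time, is scored
    by how far its brightest pixel sits above the window's median.
    Illustrative guess only, not the real tracking code.
    """
    height, width = len(image), len(image[0])
    best_location, best_score = None, float("-inf")
    for top in range(0, max(height - blocksize, 0) + 1, step):
        for left in range(0, max(width - blocksize, 0) + 1, step):
            window = [
                (image[r][c], r, c)
                for r in range(top, min(top + blocksize, height))
                for c in range(left, min(left + blocksize, width))
            ]
            background = median(value for value, _, _ in window)
            value, r, c = max(window)  # brightest pixel in this window
            score = value - background
            if score > best_score:
                best_location, best_score = (r, c), score
    return best_location, best_score


# Synthetic 60x80 frame: flat, dim background with one bright "reflector" pixel.
frame = [[10.0] * 80 for _ in range(60)]
frame[37][52] = 200.0
location, score = find_reflector(frame)
print(location, score)
```

With a filter that suppresses most of the background, even a simple score like this has much less clutter to compete with, which fits with the filter helping so much.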

Balloon connection failure

The experiment came to an abrupt end when the balloon rubber loop failed, causing the hardware to fall and crash catastrophically. The balloon escaped.

I’d followed these instructions from Public Lab. However, the single rubber loop wasn’t sufficient and failed.

Single rubber loop problem

The crashed system:

Remains of the crashed experiment!


  • The filter seems to help a lot!
  • The retroreflective paint doesn’t work
  • Three tethers is more stable than two

Most important are safety lessons around the fallen experiment:

  • Double-up the rubber hoops
  • Add a parachute
  • Wear hard-hats
  • Set up an exclusion zone during the experiment
  • Make it lighter

Next steps

  • Build new lighter version
  • Test if we can stick things to insects (get the hang of this before we fly again)

Coregionalised Air

Air pollution coregionalised between the US embassy and Makerere campus

Next week I’ll be presenting at Manchester’s Advances in Data Science conference, on the air pollution project. I’ve written an extended abstract on the topic.

We use a custom-crafted kernel for coregionalising between multiple sensors, to allow us to make probabilistic predictions at the level of the high-quality reference sensor, across the whole city, using the low-quality noisy sensors. We estimate the coregionalisation parameters using training data we’ve collected, which ideally should include close or colocated measurements from pairs of sensors.

In future we hope to:

  1. Include the uncertainty in the coregionalisation, e.g. by integrating over the distribution of the coregionalisation parameters using CCD.
  2. Allow this coregionalisation to vary over time. This will require non-stationarity, and is probably best achieved using a more flexible, non-analytic solution, e.g. writing the model in Stan.
  3. Update the model in real time. I think another advantage of using Stan or a similar framework would be the gradual inclusion of new MC steps incorporating new data as we throw out old data, which allows gradual changes in the coregionalisation to be incorporated.


Building flat coregionalisation kernel

We can’t just use the standard coregionalisation kernel, as we’re not just taking the Kronecker product of a coregionalisation matrix with a repeating covariance matrix. Instead we want to element-wise multiply a matrix that expresses the coregionalisation with another matrix that expresses the covariance due to proximity in space and time (see above figure).
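To make the element-wise construction concrete, here is a minimal sketch in plain Python (no GPy needed) for the rank-1 case. The function names, the example weights and the row layout (time, position, sensor index) are assumptions for illustration, and the RBF uses the usual squared-lengthscale convention.

```python
import math


def rbf(a, b, lengthscale):
    """Squared-exponential similarity between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2))


def flat_coreg_cov(X, W, l_time=2.0, l_space=0.1):
    """Covariance for rows of X = (time, position, sensor_index).

    W holds one weight per sensor (rank 1), so the coregionalisation
    matrix is the outer product of W with itself. Each entry of the
    result is the space-time covariance for that pair of rows multiplied
    by the coregionalisation entry for that pair of sensors.
    """
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i, (t1, s1, c1) in enumerate(X):
        for j, (t2, s2, c2) in enumerate(X):
            proximity = rbf(t1, t2, l_time) * rbf(s1, s2, l_space)
            coreg = W[int(c1)] * W[int(c2)]
            K[i][j] = proximity * coreg
    return K


# Three observations: sensors 0 and 1 colocated at t=0, then sensor 1 again later.
X = [(0.0, 0.0, 0), (0.0, 0.0, 1), (1.0, 0.05, 1)]
W = [1.0, 0.5]  # made-up weights: sensor 1 reads at half the scale of sensor 0
K = flat_coreg_cov(X, W)
for row in K:
    print(["%.3f" % value for value in row])
```

Unlike the Kronecker form, nothing here requires every sensor to be observed at every time and place: each row simply carries its own sensor index.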

Here is the GPy kernel code to do this:

import GPy
import numpy as np
from GPy.kern import Kern
from GPy.core.parameterization import Param
#from paramz.transformations import Logexp
#import math
class FlatCoreg(Kern): 

    def __init__(self, input_dim, active_dims=0, rank=1, output_dim=None, name='flatcoreg'):
        super(FlatCoreg, self).__init__(input_dim, active_dims, name)

        assert isinstance(active_dims,int), "Can only use one dimension"
        W = 0.5*np.random.randn(rank,output_dim)/np.sqrt(rank)
        self.W = Param('W', W)
        self.link_parameters(self.W) #this just takes a list of parameters we need to optimise.

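    def update_gradients_full(self, dL_dK, X, X2=None):
        #Gradient of K[i,j] = W[0,wi]*W[0,wj] with respect to each entry of W.
        #Only the first row of W is used, so this is only correct for rank=1.
        if X2 is None:
            X2 = X.copy()
        dK_dW = np.zeros([self.W.shape[1],X.shape[0],X2.shape[0]])
        for i,x in enumerate(X):
            for j,x2 in enumerate(X2):
                wi = int(x[0]) #sensor index of each input
                wj = int(x2[0])
                dK_dW[wi,i,j] = self.W[0,wj]
                dK_dW[wj,i,j] += self.W[0,wi] #totals 2*W[0,wi] when wi == wj
        self.W.gradient = np.sum(dK_dW * dL_dK,(1,2))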

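    @staticmethod
    def k_xx(X,X2,coregmat,l_time=2.0,l_dist=0.1):
        #Unused single-pair helper, kept for reference. The time and distance
        #terms are commented out as the RBF kernels below provide them.
        #k_time = np.exp(-(X[0]-X2[0])**2/(2*l_time))
        #k_dist = np.exp(-(X[1]-X2[1])**2/(2*l_dist))
        k_coreg = coregmat[int(X[2]),int(X2[2])]
        return k_coreg #k_time * k_dist * k_coreg 
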
    def K(self, X, X2=None):
        coregmat = np.array(self.W.T @ self.W)
        if X2 is None:
            X2 = X
        K_xx = np.zeros([X.shape[0],X2.shape[0]])
        for i,x in enumerate(X):
            for j,x2 in enumerate(X2):
                K_xx[i,j] = coregmat[int(x[0]),int(x2[0])]
        return K_xx

    def Kdiag(self, X):
        return np.diag(self.K(X))

k = (GPy.kern.RBF(1,active_dims=[0],name='time')*GPy.kern.RBF(1,active_dims=[1],name='space'))*FlatCoreg(1,output_dim=3,active_dims=2,rank=1)
#k = FlatCoreg(1,output_dim=3,active_dims=2,rank=1)

This allows us to make predictions over the whole space in the region of the high quality sensor, with automatic calibration via the W vector.
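The gradient in update_gradients_full is easy to get subtly wrong, so it is worth checking against finite differences. Below is a small standalone sketch of that check for the rank-1 case, written in plain Python so it runs without GPy. The function names and test values are made up for illustration, and the objective is just a linear stand-in (the sum of dL_dK times K) for whatever the model's likelihood passes down.

```python
def coreg_matrix(sensors, W):
    """K[i][j] = W[sensor_i] * W[sensor_j], the rank-1 flat coregionalisation."""
    return [[W[a] * W[b] for b in sensors] for a in sensors]


def analytic_gradient(sensors, W, dL_dK):
    """Chain rule through K, using the same two terms as the kernel code."""
    grad = [0.0] * len(W)
    for i, a in enumerate(sensors):
        for j, b in enumerate(sensors):
            grad[a] += dL_dK[i][j] * W[b]  # contribution of dK[i][j]/dW[a]
            grad[b] += dL_dK[i][j] * W[a]  # contribution of dK[i][j]/dW[b]
    return grad


def numeric_gradient(sensors, W, dL_dK, eps=1e-6):
    """Central finite differences of L(W) = sum_ij dL_dK[i][j] * K[i][j]."""
    n = len(sensors)

    def objective(weights):
        K = coreg_matrix(sensors, weights)
        return sum(dL_dK[i][j] * K[i][j] for i in range(n) for j in range(n))

    grad = []
    for k in range(len(W)):
        up, down = list(W), list(W)
        up[k] += eps
        down[k] -= eps
        grad.append((objective(up) - objective(down)) / (2 * eps))
    return grad


sensors = [0, 1, 1, 2]  # sensor index of each observation
W = [0.8, -0.3, 1.5]    # one made-up weight per sensor
dL_dK = [[0.1 * (i + 1) - 0.05 * j for j in range(4)] for i in range(4)]

analytic = analytic_gradient(sensors, W, dL_dK)
numeric = numeric_gradient(sensors, W, dL_dK)
print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places. If they don't, the usual suspect is the case where both inputs come from the same sensor, which needs both terms.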

DASK and ec2 – use daskec2lite

I’ve started having the same problem as in this issue. I think something else has been updated which has caused the new error. As it says in the dask-ec2 readme, the project is now deprecated, so I didn’t try fixing the new bug. I tried using kubernetes for a while (kops, terraform, etc), but it’s quite a pain to set up (maybe not well documented yet) and is serious overkill for what I want (and probably for what a lot of people want…). So instead…

I’ve written a replacement for dask-ec2, which I’ve called daskec2lite.

It needs a little bit more work but is nearly finished. I’ll hopefully have some time later in the year to get it to a more ‘release’ state, but feel free to use it.

daskec2lite --help

usage: daskec2lite [-h] [--pathtokeyfile [PATHTOKEYFILE]]
                   [--keyname [KEYNAME]] [--username [USERNAME]]
                   [--numinstances [NUM_INSTANCES]]
                   [--instancetype [INSTANCE_TYPE]] [--imageid [IMAGEID]]
                   [--spotprice [SPOTPRICE]] [--region [REGION_NAME]]
                   [--wpi [WORKERS_PER_INSTANCE]] [--sgid [SGID]] [--destroy]

Create an EC2 spot-price cluster, populate with a dask scheduler and workers.
Example: daskec2lite --pathtokeyfile '/home/mike/.ssh/research.pem' --keyname
'research' --username 'mike' --imageid ami-19a58760 --sgid sg-9146afe9

optional arguments:
  -h, --help            show this help message and exit
  --pathtokeyfile [PATHTOKEYFILE]
                        path to keyfile [required]
  --keyname [KEYNAME]   key name to use to access instances [required]
  --username [USERNAME]
                        user to log into remote instances as [required]
  --numinstances [NUM_INSTANCES]
                        number of instances to start
  --instancetype [INSTANCE_TYPE]
                        type of instance to request
  --imageid [IMAGEID]   AWS image to use [required]
  --spotprice [SPOTPRICE]
                        Spot price limit ($/hour/instance)
  --region [REGION_NAME]
                        Region to use
  --wpi [WORKERS_PER_INSTANCE]
                        Workers per instance
  --sgid [SGID]         Security Group ID [required]
  --destroy             Destroy the cluster
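
The help text marks some options as [required] even though the usage line shows them all as optional, which suggests the check happens after parsing. Here is a rough sketch of how an interface like this could be declared with argparse. This is not daskec2lite's actual source: the option names come from the help output above, but the `nargs="?"` choice, the types, and the `missing_required` helper are assumptions for illustration.

```python
import argparse

# Options the help text labels as [required].
REQUIRED = ("pathtokeyfile", "keyname", "username", "imageid", "sgid")


def build_parser():
    parser = argparse.ArgumentParser(
        prog="daskec2lite",
        description="Create an EC2 spot-price cluster, populate with a "
                    "dask scheduler and workers.",
    )
    # nargs="?" is what makes argparse print e.g. "[--keyname [KEYNAME]]".
    parser.add_argument("--pathtokeyfile", nargs="?", help="path to keyfile [required]")
    parser.add_argument("--keyname", nargs="?", help="key name to use to access instances [required]")
    parser.add_argument("--username", nargs="?", help="user to log into remote instances as [required]")
    parser.add_argument("--numinstances", nargs="?", dest="num_instances", type=int,
                        help="number of instances to start")  # int type is assumed
    parser.add_argument("--instancetype", nargs="?", dest="instance_type",
                        help="type of instance to request")
    parser.add_argument("--imageid", nargs="?", help="AWS image to use [required]")
    parser.add_argument("--spotprice", nargs="?", type=float,
                        help="Spot price limit ($/hour/instance)")  # float type is assumed
    parser.add_argument("--region", nargs="?", dest="region_name", help="Region to use")
    parser.add_argument("--wpi", nargs="?", dest="workers_per_instance", type=int,
                        help="Workers per instance")  # int type is assumed
    parser.add_argument("--sgid", nargs="?", help="Security Group ID [required]")
    parser.add_argument("--destroy", action="store_true", help="Destroy the cluster")
    return parser


def missing_required(args):
    """Names of the [required] options that were not supplied (hypothetical helper)."""
    return [name for name in REQUIRED if getattr(args, name) is None]


# The example command line from the help text above.
example = build_parser().parse_args([
    "--pathtokeyfile", "/home/mike/.ssh/research.pem",
    "--keyname", "research",
    "--username", "mike",
    "--imageid", "ami-19a58760",
    "--sgid", "sg-9146afe9",
])
print(missing_required(example))
```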

© 2019 Mike's Page
