# Mike's Page

### Informatics, Development, Cycling, Data, Travel ...

#### Month: October 2017

Note for self:

Just type

debug

in the next cell. (u = up, c = continue)

Use:

import pdb; pdb.set_trace()

A linear model with GPy.

Background: I’m working on a project aiming to extrapolate dialysis patient results over a 100-hour window (I’ll write future blog posts on this!). I’m working with James Fotheringham (Consultant Nephrologist) who brilliantly bridges the gap between clinic and research – allowing ivory-tower researchers (me) to get our expertise applied in the real world to useful applications.

Part of this project is the prediction of the patient’s weight. We’ll consider a simple model.

When a patient has haemodialysis, they will have fluid removed. If this doesn’t happen frequently or successfully enough the patient can experience *fluid overload*. Each dialysis session (in this simplified model) brings the patient’s weight back down to their dry-weight. Then this weight will increase (roughly linearly) as time goes by until their next dialysis session. I model their weight $$w(t,s)$$ with a slow-moving function of time-since-start-of-trial, $$f(t)$$, added to a linear function of time-since-last-dialysis-session, $$g(s)$$:

$$w(t,s) = f(t) + g(s)$$

For now we’ll ignore f, and just consider g. This is a simple linear function $$g(s) = s*w$$. The gradient $$w$$ describes how much their weight increases for each day it’s been since dialysis.

We could estimate this from previous data from that patient (e.g. by fitting a Gaussian Process model to the data). But if the patient is new, we might want to provide a prior distribution on this parameter. We could get this by considering what gradient other similar patients have (e.g. those with the same age, vintage, gender, weight, comorbidity, etc might have a similar gradient).

Side note: Colleagues have recommended that we combine all the patients into one large coregionalised model. This has several problems: excessive computational demands, privacy (having to share the data), interpretability (the gradient might be a useful feature), etc.

Other side note: I plan on fitting another model, of gradients wrt patient demographics etc to make predictions for the priors of the patients.

So our model is: $$g(s) = \phi(s)w$$ where we have a prior on $$w \sim N(\mu_p, \sigma_p^2)$$.

If we find the mean and covariance of $$g(s)$$:

$$E[g(s)] = \phi(s) E[w] = \phi(s) \mu_p$$
$$E[g(s)g(s’)] = \phi(s) E[ww^\top] \phi(s’) = \phi(s) (\mu_p \mu_p^\top + \Sigma_p) \phi(s’)$$

It seems a simple enough problem – but it does require than the prior mean is no longer zero (or flat wrt the inputs). I’m not sure how to do that in GPy.

Update: There is no way of doing this in GPy by default. So I solved this by adding a single point, with a crafted noise, to produce or ‘induce’ the prior we want.

I’ve written this into a simple python function (sorry it’s not on pip, but it seems a little too specialised to pollute python module namespace with). Download from <a href=”https://github.com/lionfish0/experimentation/blob/master/linear_kernel_prior.py”>github</a>. The docstring should explain how to use it (with an example).

My talk at DSA2017 and the website (with realtime predictions of Kampala’s air quality)

By default it seems GPy doesn’t subtract the mean of the data prior to fitting.

So it’s worth including a mean function that does this:

 m = GPy.models.GPRegression(
X,
y,
mean_function=GPy.mappings.Constant(
input_dim=1,output_dim=1,value=np.mean(y))
)

Then one needs to fix this value:

m.mean_function.C.fix()