adkane

Reputation: 1441

How to correlate data with factor levels?

I want to relate some behaviours, coded as factors, to a continuous covariate. The underlying motivation is an animal changing its behaviour from searching (behaviour 1) to feeding (behaviour 2) as the covariate (distance to food, say) decreases.

Thus, the covariate should be large (long distance to food) while the animal is in behaviour 1, shrink as it approaches behaviour 2, and stay small while it remains in that state (short distance to food). One wrinkle is that I have multiple animals.

The data I have look something like this:

animalID behaviour 
1         1
1         1      
1         1
1         2
1         2
1         2
1         1
1         1
2         1
2         1
2         1
2         2
2         2
2         2
2         1

and I want something like this:

animalID behaviour distance
1         1          100
1         1           99
1         1           98
1         2           58
1         2           57
1         2           60
1         1           74
1         1           75
2         1          104
2         1          101
2         1          100
2         2           40
2         2           44
2         2           42
2         1           86

Upvotes: 0

Views: 55

Answers (1)

Sam Mason

Reputation: 16174

Given that you don't have any covariates, there isn't much to go on. The simplest approach would be to smooth the behaviour codes with a moving average and then transform the result to whatever scale you need.
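As a rough sketch of the moving-average idea (not the only way to do it), the R below smooths the behaviour codes within each animal and then rescales them linearly. The helper `moving_average`, the window width `k = 3`, and the endpoint distances `far = 100` and `near = 50` are all assumptions made up for illustration; they are not estimated from anything.

```r
# Example data from the question
df <- data.frame(
  animalID  = c(rep(1, 8), rep(2, 7)),
  behaviour = c(1, 1, 1, 2, 2, 2, 1, 1,
                1, 1, 1, 2, 2, 2, 1)
)

# Centred moving average of width k (hypothetical helper); the window
# shrinks at the ends of the series so that no NAs are produced.
moving_average <- function(x, k = 3) {
  half <- k %/% 2
  n <- length(x)
  sapply(seq_len(n), function(i) {
    mean(x[max(1, i - half):min(n, i + half)])
  })
}

# Smooth within each animal so values never bleed across animals
df$smooth <- ave(df$behaviour, df$animalID, FUN = moving_average)

# Assumed linear transform: behaviour 1 maps to `far`, behaviour 2 to `near`
# (both distances are illustration values only)
far <- 100
near <- 50
df$distance <- far - (df$smooth - 1) * (far - near)

print(df)
```

A wider window gives a more gradual approach to the food, and any monotone transform could replace the linear one if you have a better idea of the distance scale.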

If you do have some covariates to use and want to do something much more complicated, you could use a randomised (Monte Carlo) method. The Stan language lets you easily define and sample from Bayesian models. In this case you could define a simple autoregressive model (a random walk within each animal):

data {
  int<lower=0> N;  // number of data points
  int<lower=0> animal[N];
  real behaviour[N];
}
parameters {
  real mu[N]; // latent smoothed values (the values you care about)
  real<lower=0> sigma_auto;  // sd of the step between consecutive values (smoothness)
  real<lower=0> sigma_behaviour;  // sd of the data around mu (how close mu should be to data)
}
model {
  for (i in 2:N) {
    if (animal[i] == animal[i-1]) {
      // autoregressive component of model
      mu[i] ~ normal(mu[i-1], sigma_auto);
    }
  }
  // comparison to data
  behaviour ~ normal(mu, sigma_behaviour);
  // priors
  sigma_auto ~ cauchy(0, 0.05);
  sigma_behaviour ~ cauchy(0, 0.05);
}

The Stan code is a bit like R, but I'd recommend reading the Stan manual. Save it as model.stan and you can run it from R by doing:

library(rstan)

df = read.table(text="animalID behaviour 
1         1
...
", header=TRUE)

fit <- stan("model.stan", iter=1000, data=list(
    N=nrow(df),
    animal=df$animalID,
    behaviour=df$behaviour
))

plot(df$behaviour)
mu <- extract(fit, 'mu')$mu
for (i in 1:nrow(mu)) {
    lines(mu[i,], lwd=0.2)
}

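The stan call compiles the model (via a C++ compiler) and runs it for iter iterations per chain (by default, four chains, with the first half of each chain used as warm-up). The extract line pulls the samples of mu out of the posterior, and then I plot each sample as a line over the data.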
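If plotting every sample gets too busy, one option is to summarise the samples per time point instead. This is only a sketch: the `mu` matrix below is simulated as a stand-in for `extract(fit, 'mu')$mu` so that the snippet runs without a fitted model, and `n_draws` and the noise level are arbitrary assumptions.

```r
# Stand-in for extract(fit, 'mu')$mu: a draws-by-timepoints matrix.
# These values are simulated purely for illustration, not model output.
set.seed(1)
behaviour <- c(1, 1, 1, 2, 2, 2, 1, 1)
n_draws <- 200
mu <- sapply(behaviour, function(b) rnorm(n_draws, mean = b, sd = 0.1))

# Mean and central 95% interval for each time point (columns of mu)
mu_mean <- colMeans(mu)
mu_ci <- apply(mu, 2, quantile, probs = c(0.025, 0.975))

# Data as points, mean as a solid line, interval as dashed lines
plot(behaviour, ylim = range(mu_ci, behaviour))
lines(mu_mean, lwd = 2)
lines(mu_ci[1, ], lty = 2)
lines(mu_ci[2, ], lty = 2)
```

With a real fit you would replace the simulated `mu` with the extracted one and keep the rest as is.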

hope that helps!

Upvotes: 1
