Title: Statistical Approaches for Hidden Variables in Ecology
Author: Nathalie Peyrard
Publisher: John Wiley & Sons Limited
Genre: Sociology
ISBN: 9781119902782
Figure 1.1. The map at the top shows the tracking data for a male Heaviside's dolphin (Cephalorhynchus heavisidii) in St. Helena Bay, South Africa. The coastline is shown in black, and we see that some recorded positions are actually on land. These positions were obtained using an Argos system. Figure taken from Elwen et al. (2006). Photo of a Heaviside's dolphin by Jutta Luft, distributed under the GNU Free Documentation License. For a color version of this figure, see www.iste.co.uk/peyrard/ecology.zip
Observation errors are generally small (a few meters) when positions are obtained using a GPS system on open ground with good satellite coverage. Far larger errors, up to tens of kilometers, may occur with other technologies such as the Argos system. A hierarchical model for reconstructing real trajectories from observed trajectories is presented in section 1.2.1.
1.1.2. Identifying different behaviors in movement
Individuals rarely move in a homogeneous manner, and different movement patterns are often observed. In Nathan et al. (2008), the authors propose a formalization of the mechanisms responsible for individual movement. Among the different aspects mentioned, the internal state of the individual and the environment in which it exists are identified as important mechanisms of movement. It seems reasonable to believe that the internal state of an individual affects its behavior, resulting in a change of movement regime.
Any study of individual movement must permit the identification of different states or activities. In this case, the hidden variable is the activity of the individual, while the observed variable is its position, or various metrics derived from this position, as we shall see later. Section 1.2.2 presents a reconstruction of behavior based on movement observations, using a specific latent variable model known as a hidden Markov model.
1.2. Hierarchical models of movement
1.2.1. Trajectory reconstruction model
1.2.1.1. Overview
In cases where there are errors in the observed positions, the data can be smoothed in order to recreate the real trajectory: all collected data points are combined with a movement model so as to “straighten out” outlying observations and thus correct positioning errors.
Different ways of taking account of observation errors in movement models have been discussed at length in the literature (Freitas et al. 2008; Johnson et al. 2008; Patterson et al. 2010). As a first exploration of the data, however, a simple linear Gaussian hierarchical model can be used for trajectory reconstruction. This approach draws on the notion that the observed position is a noisy version of the real position, and that the noise around this position is Gaussian. In formal terms, take n + 1 noisy observations, y0:n = (y0, . . . , yn), of an animal’s position. Generally speaking (and throughout this chapter), we assume that each observed position is a vector in ℝ². These observations are presumed to be realizations of random variables Y0:n, whose distribution depends on the real position of the animal. Moreover, the real position of the animal at a given instant (unknown) depends on its real position at the previous instant (also unknown). In formal terms, these positions themselves can be seen as a sequence of non-independent random variables, denoted Z0:n = (Z0, . . . , Zn), with values in ℝ².
We consider that all of these random variables obey the following hierarchical model:

\[
\begin{aligned}
Z_0 &\sim \mathcal{N}(\mu_0, \Sigma_0) \\
Z_t \mid Z_{t-1} = z_{t-1} &\sim \mathcal{N}(A z_{t-1} + \mu, \Sigma_m), \quad 1 \le t \le n \\
Y_t \mid Z_t = z_t &\sim \mathcal{N}(B z_t + \nu, \Sigma_o), \quad 0 \le t \le n
\end{aligned}
\tag{1.1}
\]

From top to bottom, these three equations define:
– The initial distribution: the a priori initial position of the individual. In this case, we have a normal distribution (in dimension 2) about an initial position μ0, with a variance–covariance matrix Σ0.
– The transition distribution (or dynamic model): in this case, a model of the individual’s movement. We consider that the current position is given by a random Gaussian variable, centered about an affine transformation of the previous position, with a variance–covariance matrix Σm. The affine transformation is obtained from two parameters: a matrix A (of size 2 × 2) and a vector μ of dimension 2. The most common approach is to consider that μ = 0 and to take A as the identity matrix. The resulting model is a random walk.
– The emission distribution (or observation model): the observation is taken to be a random Gaussian variable centered about an affine transformation of the current position, with variance–covariance matrix Σo. The affine transformation is given by two parameters: a matrix B (of size 2 × 2) and a vector ν of dimension 2. The most common approach is to consider that ν = 0 and to take B as the identity matrix. The observation is thus presumed to be centered about the real position. This random-walk special case is simulated in the sketch following this list.
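To make the structure of model [1.1] concrete, the following minimal sketch simulates a trajectory and its noisy observations in the random-walk case (A and B identity matrices, μ = ν = 0). All numerical values (number of steps, noise covariances) are illustrative assumptions, not values from this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200                            # number of time steps (illustrative)
mu0 = np.zeros(2)                  # a priori mean of the initial position
Sigma0 = np.eye(2)                 # initial covariance (assumed)
Sigma_m = 0.1 * np.eye(2)          # movement noise covariance (assumed)
Sigma_o = 2.0 * np.eye(2)          # observation noise covariance (assumed)

Z = np.zeros((n + 1, 2))           # real positions Z_0, ..., Z_n
Y = np.zeros((n + 1, 2))           # observed positions Y_0, ..., Y_n

Z[0] = rng.multivariate_normal(mu0, Sigma0)            # initial distribution
Y[0] = rng.multivariate_normal(Z[0], Sigma_o)          # emission at t = 0
for t in range(1, n + 1):
    Z[t] = rng.multivariate_normal(Z[t - 1], Sigma_m)  # transition (random walk)
    Y[t] = rng.multivariate_normal(Z[t], Sigma_o)      # emission (noisy observation)
```

Plotting Y against Z for large Σo reproduces the situation of Figure 1.1: observed positions scattered widely around a much smoother real trajectory.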
1.2.1.2. Inference
Within the model defined by [1.1], inference serves two purposes:
– Estimation of positions: in this case, inference is used to determine the distribution of actual positions based on observations, that is, for 0 ≤ t ≤ n, the distribution of the random variable Zt|Y0:n. This distribution is known as the smoothing distribution.
– Estimation of parameters: to estimate the unknown parameters in the model (which, in the majority of cases, correspond to the two variance–covariance matrices, Σm and Σo).
With known parameters and for any 0 ≤ t ≤ n, the distribution of Zt|Y0:n is Gaussian. The mean and the variance–covariance matrix of this distribution can be calculated explicitly. This step is carried out using Kalman smoothing, which will not be described in detail here; interested readers may wish to consult Tusell (2011). It is important to note that the explicit nature of this solution is exceptional in the context of latent variable models, and is a result of the Gaussian linear formulation of model [1.1].
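As an illustration, the sketch below implements the forward (Kalman filter) and backward (Rauch–Tung–Striebel smoother) recursions for the random-walk case of model [1.1] with known parameters. The function name and code are ours, written to match this chapter's notation; this is a minimal sketch rather than a reference implementation, and readers should consult Tusell (2011) for robust general-purpose alternatives.

```python
import numpy as np

def kalman_smooth(Y, mu0, Sigma0, Sigma_m, Sigma_o):
    """Smoothing for model [1.1] with A = B = I and mu = nu = 0.

    Returns the smoothed means m_s[t] = E[Z_t | Y_0:n] and
    covariances P_s[t] = Var(Z_t | Y_0:n).
    """
    n1, d = Y.shape                          # n1 = n + 1 time points
    I = np.eye(d)
    m_p = np.zeros((n1, d)); P_p = np.zeros((n1, d, d))  # predicted moments
    m_f = np.zeros((n1, d)); P_f = np.zeros((n1, d, d))  # filtered moments
    m_p[0], P_p[0] = mu0, Sigma0
    for t in range(n1):                      # forward pass: Z_t | Y_0:t
        if t > 0:                            # prediction under the random walk
            m_p[t] = m_f[t - 1]
            P_p[t] = P_f[t - 1] + Sigma_m
        S = P_p[t] + Sigma_o                 # innovation covariance
        K = P_p[t] @ np.linalg.inv(S)        # Kalman gain
        m_f[t] = m_p[t] + K @ (Y[t] - m_p[t])
        P_f[t] = (I - K) @ P_p[t]
    m_s = m_f.copy(); P_s = P_f.copy()       # backward pass: Z_t | Y_0:n
    for t in range(n1 - 2, -1, -1):
        J = P_f[t] @ np.linalg.inv(P_p[t + 1])          # smoother gain
        m_s[t] = m_f[t] + J @ (m_s[t + 1] - m_p[t + 1])
        P_s[t] = P_f[t] + J @ (P_s[t + 1] - P_p[t + 1]) @ J.T
    return m_s, P_s

# For example, with the simulated data above:
# m_s, P_s = kalman_smooth(Y, mu0, Sigma0, Sigma_m, Sigma_o)
```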
In practice, the parameter θ = {μ, A, ν, B, Σm, Σo} is unknown. In a frequentist context, the natural aim is to identify the parameter that maximizes the likelihood associated with the observations Y0:n:

\[
\hat{\theta} = \operatorname*{arg\,max}_{\theta} \; p_\theta(y_{0:n}) = \operatorname*{arg\,max}_{\theta} \int p_\theta(y_{0:n}, z_{0:n}) \, \mathrm{d}z_{0:n},
\]
where p is a generic notation for probability density. In this case, the expression of the likelihood involves the calculation of an integral in very high dimension, as all hidden states must be integrated out. However, given a known sequence of real positions z0:n, we would have an explicit expression for the full log-likelihood:

\[
\log p_\theta(z_{0:n}, y_{0:n}) = \log p_\theta(z_0) + \sum_{t=1}^{n} \log p_\theta(z_t \mid z_{t-1}) + \sum_{t=0}^{n} \log p_\theta(y_t \mid z_t).
\]
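As a minimal sketch under the same random-walk assumptions as above (A = B = I, μ = ν = 0, so that the transition and emission densities are Gaussian densities centered on the previous and current positions), this full log-likelihood can be evaluated term by term; the function name is ours:

```python
from scipy.stats import multivariate_normal as mvn

def complete_loglik(Z, Y, mu0, Sigma0, Sigma_m, Sigma_o):
    """Full log-likelihood log p(z_0:n, y_0:n) for the random-walk case."""
    ll = mvn.logpdf(Z[0], mean=mu0, cov=Sigma0)               # initial term
    ll += sum(mvn.logpdf(Z[t], mean=Z[t - 1], cov=Sigma_m)    # transition terms
              for t in range(1, len(Z)))
    ll += sum(mvn.logpdf(Y[t], mean=Z[t], cov=Sigma_o)        # emission terms
              for t in range(len(Y)))
    return ll
```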
As all of the densities in this model are Gaussian, maximization of this log-likelihood would be simple. The expectation–maximization (EM) algorithm exploits this fact: it alternates between an E-step, which computes the expectation of the full log-likelihood given the observations and the current parameters (using Kalman smoothing), and an M-step, which maximizes this expected log-likelihood in θ.
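To make the M-step concrete in this setting, the sketch below gives the closed-form update of Σo when B = I and ν = 0, expressed in terms of the smoothed means m_s and covariances P_s returned by the smoother sketched earlier. This is our illustrative sketch, not code from the book; the analogous update of Σm additionally requires the lag-one smoothed covariances Cov(Zt−1, Zt | Y0:n), which are omitted here for brevity.

```python
def em_update_Sigma_o(Y, m_s, P_s):
    """M-step update of Sigma_o for model [1.1] with B = I, nu = 0.

    Y   : (n+1, 2) observed positions
    m_s : (n+1, 2) smoothed means E[Z_t | Y_0:n]
    P_s : (n+1, 2, 2) smoothed covariances Var(Z_t | Y_0:n)
    """
    resid = Y - m_s   # residuals of the observations around the smoothed positions
    # E[(Y_t - Z_t)(Y_t - Z_t)^T | Y_0:n] = resid_t resid_t^T + P_s[t];
    # averaging these expectations over t maximizes the expected
    # complete log-likelihood with respect to Sigma_o.
    return (resid.T @ resid + P_s.sum(axis=0)) / len(Y)
```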