Control a Roomba using Active Inference and RxInfer
Bayesian Inference
Active Inference
RxInfer
Julia
Author
Kobus Esterhuysen
Published
May 20, 2024
Modified
September 6, 2024
Hidden Markov Model with Control (Part 2)
1 BUSINESS UNDERSTANDING
This is the second part of an effort to add control to the Hidden Markov Model RxInfer example. Eventually this work will be used in a more elaborate scheduling project making use of active inference.
In this part, we will add control inputs to the Roomba, instructing it to move to a specific room. The 3 rooms are the master bedroom, the living room, and the bathroom. In addition, we add the act() function in the structure below:
create_envir()
execute()
observe()
create_agent()
act() [added in this notebook]
future() [not yet]
compute()
slide() [not yet]
In the compute() function, which takes care of inference, we will still estimate the \(A\) matrix. However, we will not estimate the \(B\) matrix (as was done in Part 1). Instead we will make use of a fixed \(B\)-matrix tensor. This tensor can be considered a stack of 3 \(B\)-matrices - each associated with a specific control action to go to a specific room.
Unfortunately, the implementation broke during experimentation. Hopefully this will be fixed soon. It was important to publish this effort (even with the breakage) to allow for assistance in context.
versioninfo() ## Julia version
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_NUM_THREADS =
importPkgPkg.add(Pkg.PackageSpec(;name="RxInfer", version="3.6.0"))Pkg.add(Pkg.PackageSpec(;name="Random"))Pkg.add(Pkg.PackageSpec(;name="Plots", version="1.40.1"))# Pkg.add(Pkg.PackageSpec(;name="XLSX"))# Pkg.add(Pkg.PackageSpec(;name="DataFrames"))# Pkg.add(Pkg.PackageSpec(;name="BenchmarkTools"))# Pkg.add(Pkg.PackageSpec(;name="Distributions"))# using RxInfer, Random, BenchmarkTools, Distributions, Plots, XLSX, DataFramesusingRxInfer, Plots, Random
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
Pkg.status()
Status `~/.julia/environments/v1.10/Project.toml`
[6e4b80f9] BenchmarkTools v1.5.0
[31c24e10] Distributions v0.25.111
⌃ [91a5bcdd] Plots v1.40.1
[86711068] RxInfer v3.6.0
[9a3f8284] Random
Info Packages marked with ⌃ have new versions available and may be upgradable.
2 DATA UNDERSTANDING
The data will consist of a time-series of 1-hot encoded position states. When the associated component of the state vector is a 1, the Roomba will be present in the associated room.
3 DATA PREPARATION
We will use the data from the simulation directly. There is no need to perform additional data preparation.
4 MODELING
4.1 Narrative
The next figure (from Dr. Bert de Vries at Eindhoven University of Technology) shows the interactions between the agent and the environment:
The grey area shows the markov blanket of the agent. The interactions between the agent and environment can be summarized by:
This means that actions on the environment are sampled from the posterior over control signals. We will explain a bit more down below.
In general, we will have the following symbol conventions:
\(t\): time
\(k\): future time
\(l\): past time
\(x_t\): scalar random variable
\(\boldsymbol{x}_t\): sequence of scalar random variables
\(\mathbf{x}_t\): 1-hot encoded random variable
Global code variables will always be prefixed with an underscore ’_’.
4.2 Core Elements
This section attempts to answer three important questions:
What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?
The only metric we are interested in is the postition of the Roomba, i.e. in which of the 3 rooms it finds itself.
There are no control/steering decisions to be made (yet). We are simply interested in the behavior of the Roomba.
There are two sources of uncertainty. The first has to do with the fact that the state transitions are not deterministic but rather stochastic. This will be captured in the transition matrix \(B\). The second relates to observations. The Roomba does not always measure accurately the floor surface in a room. This uncertainty will be captured in the observation matrix \(A\).
4.3 System-Under-Steer / Environment / Generative Process
First, in order to track the Roomba’s movements using RxInfer, we need to come up with a model. Since we have a discrete set of rooms in the apartment, we can use a categorical distribution to represent the Roomba’s position. There are three rooms in the apartment, meaning we need three states in our categorical distribution. At time \(t\), let’s call the Roomba’s true position \(\tilde{s}_t\).
However, we also know that some rooms are more accessible than others, meaning the Roomba is more likely to move between these rooms - for example, it’s rare to have a door directly between the bathroom and the master bedroom. We can encode this information using a transition matrix, which we will call \(B\).
Our Roomba is equipped with a small camera that tracks the surface it is moving over. We will use this camera to obtain our observations since we know that there is
hardwood floors in the master bedroom
carpet in the living room, and
tiles in the bathroom.
However, this method is not foolproof, and sometimes the Roomba will make mistakes and mistake the hardwood floor for tiles or the carpet for hardwood. Don’t be too hard on the little guy, it’s just a Roomba after all.
At time \(t\), we will call our observations \(y_t\) and encode the mapping from the Roomba’s position to the observations in a matrix we call \(A\). \(A\) also encodes the likelihood that the Roomba will make a mistake and get the wrong observation. This leaves us with the following model specification:
This type of discrete state space model is known as a Hidden Markov Model or HMM for short. Our goal is to learn the matrices \(A\) and \(B\) so we can use them to track the whereabouts of our little cleaning agent.
_seed =1909_rng =MersenneTwister(_seed)
MersenneTwister(1909)
_s̃₀ = [1.0, 0.0, 0.0] ## initial state
3-element Vector{Float64}:
1.0
0.0
0.0
4.3.1 State and Observation variables
The state variables represent what we need to know. The true state at time \(t\) of the Roomba will be given by a 1-hot encoded vector \[\mathbf{\tilde{s}}_t= (\mathcal{S})\]
There are no decisions to be made (yet). We are simply observing the behavior of the Roomba, i.e. there is no control/steering applied to the environment.
4.3.3 Exogenous information / Autonomous variables
There are no exogenous information variables (yet). We may add some in followup parts, for example, having certain doors closed at random.
\(w_t\) is the exogenous information / autonomous state [not yet]
\(y_t\) is the outcome or observation
Transition model \(\tilde{B}\)
In Part 1, we had this transition model:
states \(\tilde{s}_{t-1}\) states \(\tilde{s}_t\)
bed
living
bath
bed
0.9
0.0
0.1
living
0.0
0.9
0.1
bath
0.1
0.1
0.8
In this notebook, Part 2, we setup the following arrangement of rooms:
1 bed
2 liv
3 bath
Firthermore, there are doors between bed and liv, and between bed and bath. For now, these doors are all permanently open. Note that there is no door between bath and liv. For this notebook, if the action is to move from bath to liv, there will be a 50% chance to stay where it is, and a 50% chance to move to bed instead.
To capture the transition model \(\mathrm{B}_{u_k}\), we setup 3 matrices, one for each of the control actions \(aₜ ∈ \{ 1, 2, 3\}\):
Go to bedroom:
\(a_{t}=1\)
1
2
3
1
1.0
1.0
1.0
2
0.0
0.0
0.0
3
0.0
0.0
0.0
Go to livingroom:
\(a_{t}=2\)
1
2
3
1
0.0
0.0
0.5
2
1.0
1.0
0.0
3
0.0
0.0
0.5
Go to bathroom:
\(a_{t}=3\)
1
2
3
1
0.0
0.5
0.0
2
0.0
0.5
0.0
3
1.0
0.0
1.0
_B̃ = [ [1. 1. 1.; ## go to bedroom0. 0. 0.;0. 0. 0.], [0. 0. .5; ## go to livingroom1. 1. 0.;0. 0. .5], [0. .50.; ## go to bathroom0. .50.;1. 0. 1.]]
The objective function is such that the Bethe free energy (BFE) or Generalized free energy (GFE) is minimized. This aspect will be handled by the RxInfer Julia package.
4.3.6 Implementation of the System-Under-Steer / Environment / Generative Process
## returns a one-hot encoding of a random sample from a categorical distributionfunctionrand_1hot_vec(rng, distribution::Categorical) K =ncategories(distribution) sample =zeros(K) drawn_category =rand(rng, distribution) sample[drawn_category] =1.0return sampleend
rand_1hot_vec (generic function with 1 method)
The execute function now gets an argument \(a_t\) which is the action to execute on the environment.
In order to generate data to mimic the observations of the Roomba, we need to specify two things: the actual transition probabilities between the states (i.e., how likely is the Roomba to move from one room to another), and the observation distribution (i.e., what type of texture will the Roomba encounter in each room). We can then use these specifications to generate observations from our hidden Markov model (HMM).
To generate our observation data, we’ll follow these steps: 1. Assume an initial state for the Roomba. For example, we can start the Roomba in the bedroom. 2. Determine where the Roomba went next by drawing from a Categorical distribution with the transition probabilities between the different rooms. 3. Determine the observation encountered in this room by drawing from a Categorical distribution with the corresponding observation probabilities. 4. Repeat steps 2-3 for as many samples as we want.
The following code implements this process and generates our observation data:
We will generate 600 data points to simulate 600 ticks of the Roomba moving through the apartment. _ys will contain the Roomba’s measurements of the floor it’s currently on, and _s̃s will contain information on the room the Roomba was actually in.
We will use random control actions, drawn from a uniform distribution.
On the agent’s side, the state of the environment at time \(t\) will be given by a 1-hot encoded vector \(\mathbf s_t\). The initial state prior is given by \[
p(\mathbf s_{0}) = \mathcal{Cat}(\mathbf s_{0} ∣ \mathbf d)
\]
where \(\mathbf d\) parameterizes the categorical distribution of \(\mathbf s_0\).
The observation made by the Roomba at time \(t\) will be given by \(\mathbf x_t\) where
\(\mathbf{x}_t = [1, 0, 0]^T\): hardwood
\(\mathbf{x}_t = [0, 1, 0]^T\): carpet
\(\mathbf{x}_t = [0, 0, 1]^T\): tiles
4.5.2 Decision variables
On the agent’s side, the action on the environment at time \(t\) will be represented by a 1-hot encoded vector \(\mathbf u_t\). The control prior is given by \[
p(\mathbf u_k) = \mathcal{Cat}(\mathbf u_k ∣ \mathbf e_k)
\]
where \(\mathbf e_k\) parameterizes the categorical distribution of \(\mathbf u_k\).
\(\mathbf B_{u_k} \mathbf s_{k-1}\) parameterizes the categorical distribution of \(\mathbf s_k\), and \(\mathbf u_{κ k}\) selects the associated categorical distribution for the specific control action.
\(\mathbf A \mathbf s_k\) parameterizes the categorical distribution of \(\mathbf x_k\).
An entry in \(A\) captures the probability of a specific observation given a specific state. Each column in \(A\) contains a categorical distribution. A specific column is selected by multiplying with \(\mathbf s\).
4.5.5 Implementation of the Agent / Generative Model / Internal Model
We start by specifying a probabilistic model for the agent that describes the agent’s internal beliefs over the external dynamics of the environment. Assuming the current time is \(t\),
To infer goal-driven (i.e. purposeful) behavior, we add prior beliefs \(p^+(x)\) about desired future observations$. This leads to an extended agent model:
In general, if \(\mathbf a\) is a 1-hot encoded random variable, and has a categorical (aka multinoulli) distribution, then
\[p(\mathbf a ∣ \boldsymbol{\rho}) = \mathcal{Cat}(\mathbf a ∣ \boldsymbol \rho) = \prod_{i} \rho_i^{a_i}\]
This means the \(i\) th component of vector \(\mathbf a\) selects the \(i\) th component of the probability vector \(\boldsymbol \rho\) of the distribution.
If the probability vector is \(\boldsymbol{\rho} = \begin{bmatrix} 0.05 \\ 0.05 \\ 0.50 \\ 0.10 \\ 0.10 \\ 0.20 \end{bmatrix}\) and the random variable \(\mathbf a\) is \(\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}\)
Now it is time to build our model. As mentioned earlier, we will use Categorical distributions for the states and observations. To learn the \(A\) matrix we can use a MatrixDirichlet prior.
As for the observations, we have good reason to trust our Roomba’s measurements. To represent this, we will add large values to the diagonal of our prior on \(A\). However, we also acknowledge that the Roomba is not infallible, so we will add some noise on the off-diagonal entries.
Since we will use Variational Inference, we also have to specify inference constraints. We will use a structured variational approximation to the true posterior distribution, where we decouple the variational posterior over the states (q(s_0, s)) from the posterior over the transition matrices (q(A)). This dependency decoupling in the approximate posterior distribution ensures that inference is tractable. Let’s build the model!
@modelfunctionhidden_markov_model(x, B, T)# @model function hidden_markov_model(x) ## B ~ MatrixDirichlet(ones(3, 3)) A ~MatrixDirichlet([10.01.01.0; 1.010.01.0; 1.01.010.0 ]) d =fill(1.0/3.0, 3) #. s₀ ~Categorical(d) #.e=fill(1. /3., 3) #. sₖ₋₁ = s₀## for k in eachindex(x)for k in1:T dist =Categorical(e) u[k] ~ dist i =rand(_rng, dist) s[k] ~Transition(sₖ₋₁, B[i]) x[k] ~Transition(s[k], A) sₖ₋₁ = s[k]endend
@constraintsfunctionhidden_markov_model_constraints()q(s₀, s, A) =q(s₀, s)q(A)end## callbacks functionbefore_model_creation()println("The model is about to be created")endfunctionafter_model_creation(model::ProbabilisticModel)println("The model has been created")println(" The number of factor nodes is: ", length(RxInfer.getfactornodes(model)))println(" The number of latent states is: ", length(RxInfer.getrandomvars(model)))println(" The number of data points is: ", length(RxInfer.getdatavars(model)))println(" The number of constants is: ", length(RxInfer.getconstantvars(model)))endfunctionbefore_inference(model::ProbabilisticModel)println("The inference procedure is about to start")endfunctionafter_inference(model::ProbabilisticModel)println("The inference procedure has ended")endfunctionbefore_iteration(model::ProbabilisticModel, iteration::Int)println("The iteration ", iteration, " is about to start")endfunctionafter_iteration(model::ProbabilisticModel, iteration::Int)println("The iteration ", iteration, " has ended")endfunctionbefore_data_update(model::ProbabilisticModel, data)println("The data is about to be processed")endfunctionafter_data_update(model::ProbabilisticModel, data)println("The data has been processed")endfunctionon_marginal_update(model::ProbabilisticModel, name, update)## println("New marginal update for ", name, " ", update)println("New marginal update for ", name) #.end
on_marginal_update (generic function with 1 method)
Next, we define the agent and the time-stepping procedure.
model = GraphPPL.create_model(mymodel)# model = RxInfer.create_model(mymodel)
ErrorException: Model hidden_markov_model is not defined for 2 interfaces ((:B, :T)).
# \\\\\\\\\\\\
functioncreate_agent(; T, s₀, ys) compute = (aₜ::Vector{Float64}, ŷₜ::Vector{Float64}) ->begin imarginals =@initializationbeginq(A) =vague(MatrixDirichlet, 3, 3)## q(B) = vague(MatrixDirichlet, 3, 3)q(s) =vague(Categorical, 3)q(u) =vague(Categorical, 3)end result =infer(## model=hidden_markov_model(), model=hidden_markov_model(B=_B̃[argmax(aₜ)], T=T), data=(x=ys,), constraints=hidden_markov_model_constraints(), initialization=imarginals, returnvars = ( A=KeepLast(), s=KeepLast()), iterations =_its, free_energy =true,## callbacks = (# before_model_creation = before_model_creation,# after_model_creation = after_model_creation,# before_inference = before_inference,# after_inference = after_inference,# before_iteration = before_iteration,# after_iteration = after_iteration,# before_data_update = before_data_update,# after_data_update = after_data_update,# on_marginal_update = on_marginal_update) )end act = () ->begin## if result !== nothing## uₜ = mode(result.posteriors[:u][3])[1]e=fill(1. /3., 3) uₜ =rand_1hot_vec(_rng, Categorical(e))return uₜ## else## return 0.0 ## Without inference result we return some 'random' action## once proper inference of u happens, put the 'if-part' here ## uₜ = [1, 0, 0]## return uₜ ##for now, just always go to the bedroom## endendreturn (act, compute)end
create_agent (generic function with 1 method)
4.6 Agent Evaluation
Now it’s time to perform inference and find out where the Roomba went in our absence. Did it remember to clean the bathroom?
We’ll be using Variational Inference to perform inference, which means we need to set some initial marginals as a starting point. RxInfer makes this easy with the vague function, which provides an uninformative guess. If you have better ideas, you can try a different initial guess and see what happens.
Since we’re only interested in the final result - the best guess about the Roomba’s position - we’ll only keep the last results. Let’s start the inference process!
4.6.1 Evaluate with simulated data
4.6.1.1 Naive approach {N/A}
4.6.1.2 Active inference approach
### Simulation parameters## Total simulation time_N =600## Lookahead time horizon## _T = 50 ## Lookahead time horizon## Initial state_s₀ = [1.0, 0.0, 0.0]## Iterations_its =20
20
### OVERRIDES
(execute_ai, observe_ai) =create_envir(; Ã=_Ã, B̃=_B̃, s̃₀=_s₀)##- _ys = Vector{Vector{Float64}}(undef, _N) ## Observations##- ISSUE: UndefRefError (SG)_ys =Vector{Float64}[ [0.0, 0.0, 0.0] for _ =1:_N ](act_ai, compute_ai) =create_agent(; T=_N, s₀=_s₀, ys=_ys ##used by compute() for infer()'s data)## Step through experimental protocol_as =Vector{Vector{Float64}}(undef, _N) ##. Actions_ss =Vector{Vector{Float64}}(undef, _N) ## Statesfor t =1:_N _as[t] =act_ai() ##. Invoke an action from the agent _ss[t] =execute_ai(_as[t]) ##. The action influences hidden external states _ys[t] =observe_ai() ## Observe the current environmental outcome (update p)compute_ai(_as[t], _ys[t]) #.end
ErrorException: Half-edge has been found: u_7. To terminate half-edges 'Uninformative' node can be used.
Finally, we can check if we were successful in keeping tabs on our Roomba’s whereabouts. We can also check if our model has converged by looking at the Free Energy.