Navigate to a specified target location using Bayesian Inference and RxInfer
Aerospace Industry
Bayesian Inference
Active Inference
RxInfer
Julia
Author
Kobus Esterhuysen
Published
April 6, 2024
Modified
September 26, 2024
Navigating a drone to a specified target is a commonly encountered problem. In the previous project an active inference agent placed a unit far enough from a radar source to become invisible. In this project, the idea is to send a drone to the location of the radar source to do reconnaissance.
0 Active Inference: Bridging Minds and Machines
In recent years, the landscape of machine learning has undergone a profound transformation with the emergence of active inference, a novel paradigm that draws inspiration from the principles of biological systems to inform intelligent decision-making processes. Unlike traditional approaches to machine learning, which often passively receive data and adjust internal parameters to optimize performance, active inference represents a dynamic and interactive framework where agents actively engage with their environment to gather information and make decisions in real-time.
At its core, active inference is rooted in the notion of agents as embodied entities situated within their environments, constantly interacting with and influencing their surroundings. This perspective mirrors the fundamental processes observed in living organisms, where perception, action, and cognition are deeply intertwined to facilitate adaptive behavior. By leveraging this holistic view of intelligence, active inference offers a unified framework that seamlessly integrates perception, decision-making, and action, thereby enabling agents to navigate complex and uncertain environments more effectively.
One of the defining features of active inference is its emphasis on the active acquisition of information. Rather than waiting passively for sensory inputs, agents proactively select actions that are expected to yield the most informative outcomes, thus guiding their interactions with the environment. This active exploration not only enables agents to reduce uncertainty and make more informed decisions but also allows them to actively shape their environments to better suit their goals and objectives.
Furthermore, active inference places a strong emphasis on the hierarchical organization of decision-making processes, recognizing that complex behaviors often emerge from the interaction of multiple levels of abstraction. At each level, agents engage in a continuous cycle of prediction, inference, and action, where higher-level representations guide lower-level processes while simultaneously being refined and updated based on incoming sensory information.
The applications of active inference span a wide range of domains, including robotics, autonomous systems, neuroscience, and cognitive science. In robotics, active inference offers a promising approach for developing robots that can adapt and learn in real-time, even in unpredictable and dynamic environments. In neuroscience and cognitive science, active inference provides a theoretical framework for understanding the computational principles underlying perception, action, and decision-making in biological systems.
In conclusion, active inference represents a paradigm shift in machine learning, offering a principled and unified framework for understanding and implementing intelligent behavior in artificial systems. By drawing inspiration from the principles of biological systems, active inference holds the promise of revolutionizing our approach to building intelligent machines and understanding the nature of intelligence itself.
1 BUSINESS UNDERSTANDING
An often encountered problem is navigating a drone to a specified target. In the previous project, Under the Radar with Active Inference, an active inference agent placed a unit just beyond the reach of an enemy radar source in the hope of becoming invisible. In this project, we will set up an active inference agent to send a drone to the location of the radar source to do reconnaissance. The operator of the drone will first maneuver it to a suitable height. Then the agent will take over and guide it to the specified target (using a 2-dimensional approach for simplicity).
This problem is, of course, not limited to military applications. It is widely applicable - just think of the delivery of packages by logistics companies.
versioninfo() ## Julia version
# VERSION ## Julia version
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 12 × Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
Environment:
JULIA_NUM_THREADS =
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
using Pkg
Pkg.status()
Status `~/.julia/environments/v1.10/Project.toml`
[91a5bcdd] Plots v1.40.8
[86711068] RxInfer v3.6.0
2 DATA UNDERSTANDING
There is no pre-existing data to be analyzed.
3 DATA PREPARATION
There is no pre-existing data to be prepared.
4 MODELING
4.1 Narrative
Please review the narrative in section 1.
4.2 Core Elements
This section attempts to answer three important questions:
What metrics are we going to track?
What decisions do we intend to make?
What are the sources of uncertainty?
For this problem, we will only track:
the \(x\) and \(y\) position of the drone
the \(x\) and \(y\) components of the velocity of the drone
Decisions will be in the form of agent-prescribed turn actions.
The sources of uncertainty relating to the environment will be
the noise associated with transitioning to the next state (system/process noise)
the noise associated with an observation (measurement noise), as sketched after this list.
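In the notation used later in the modeling section (transition precision \(\Gamma\), observation variance \(\mathbf{\Theta}\)), these two noise sources can be sketched as:
\[
\begin{align}
\tilde{\mathbf{s}}_t &\sim \mathcal{N}\big(g(\tilde{\mathbf{s}}_{t-1}) + R^a(a_t),\, \Gamma^{-1}\big) \\
\mathbf{y}_t &\sim \mathcal{N}(\tilde{\mathbf{s}}_t,\, \mathbf{\Theta})
\end{align}
\]
where \(g\) advances the deterministic drone dynamics and \(R^a\) applies the limited turn action (both are defined below).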
4.3 System-Under-Steer / Environment / Generative Process
The system-under-steer/environment/generative process is a drone with a 4-dimensional state vector:
\(x\) position
\(y\) position
angle of the velocity (in radians)
magnitude of the velocity
The drone will be steered by means of turn/yaw actions.
_s̃₀ = [-0.0, 0.0, -0.1, 3.0] ## initial state
4-element Vector{Float64}:
-0.0
0.0
-0.1
3.0
4.3.1 State variables
The state at time \(t\) of the system-under-steer (sustr), also referred to as the environment (envir), or the generative process (genpr) will be given by:
\[
\tilde{\mathbf{s}}_t = (x_t, y_t, v_{at}, v_{rt})
\] where
\(x\): x component of the position
\(y\): y component of the position
\(v_a\): angle of the velocity (in radians)
\(v_r\): magnitude of the velocity
## Function to find the updates in the x & y components due to the velocity
function Aᵃ(s̃, δt)
    a = zeros(4)
    a[1] = s̃[4]*cos(s̃[3])*δt
    a[2] = s̃[4]*sin(s̃[3])*δt
    return a
end
Aᵃ(_s̃₀, 1) ## x & y components of velocity
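For the initial state \(\tilde{\mathbf{s}}_0 = (-0.0,\, 0.0,\, -0.1,\, 3.0)\), this call evaluates to \((3.0\cos(-0.1),\, 3.0\sin(-0.1),\, 0,\, 0) \approx (2.985,\, -0.300,\, 0,\, 0)\): the drone drifts almost entirely along \(x\), with a small negative \(y\) component from the slightly negative velocity angle.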
4.3.2 Decision variables
Decisions are in the form of turn/yaw actions which adjust the angle of the velocity. The effective turn is limited to the interval \((-F^{EngLimit}, F^{EngLimit})\) and is given by:
\[
\begin{align}
R^a &= F^{EngLimit} \cdot \mathrm{tanh}(a_t) \\
&= 0.1 \cdot \mathrm{tanh}(a_t)
\end{align}
\]
where \(a_t\) is the action on the drone at time \(t\).
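The definition of Rᵃ used by the code below is not shown in this section. Here is a minimal sketch, assuming it returns the additive change to the 4-dimensional state, consistent with the formula above and with how the model later adds h(u) to the state transition g(s):

## A sketch of the turn/yaw limiter (assumed; the project defines Rᵃ elsewhere).
## Only the velocity-angle component (index 3) is affected; tanh squashes the
## raw action into (-0.1, 0.1), i.e. (-Fᴱⁿᵍᴸⁱᵐⁱᵗ, Fᴱⁿᵍᴸⁱᵐⁱᵗ).
Rᵃ(a::Real) = [0.0, 0.0, 0.1*tanh(a), 0.0]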
The objective is to minimize the Bethe free energy. This aspect is handled by the RxInfer Julia package.
4.3.6 Implementation of the System-Under-Steer / Environment / Generative Process
The agent and the environment interact through a Markov blanket. Because the internal states on either side are hidden from the other, we wrap each in a closure that returns only functions for interacting with it. Internal beliefs cannot be directly observed, and interaction is only allowed through the Markov blanket of the agent (i.e. the sensors and actuators). A sketch of such an environment closure follows the list below.
As noted above, the sources of uncertainty relating to the environment will be:
the noise associated with transitioning to the next state (system/process noise)
the noise associated with an observation (measurement noise).
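The project's create_envir implementation is not shown in this section. Below is a minimal sketch, assuming the generative process reuses Aᵃ and Rᵃ from above; for brevity the process and measurement noise listed above are omitted (the full version would add Gaussian noise to both the transition and the observation):

## A minimal sketch of the environment closure (assumed implementation)
function create_envir(; Rᵃ, s̃₀)
    s̃ₜ₋₁ = s̃₀ ## hidden state of the generative process
    ŷₜ = s̃₀   ## latest observation
    execute = (aₜ::Float64) -> begin
        ## Drift according to the current velocity, then apply the limited turn
        s̃ₜ = Aᵃ(s̃ₜ₋₁, 1.0) + s̃ₜ₋₁ + Rᵃ(aₜ)
        ŷₜ = s̃ₜ ## noiseless observation in this sketch
        s̃ₜ₋₁ = s̃ₜ
    end
    observe = () -> ŷₜ
    return (execute, observe)
end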
4.5 Agent / Generative Model
4.5.1 State variables
According to the agent, the state of the system-under-steer/environment/generative process is \(\mathbf{s}_t\) rather than \(\tilde{\mathbf{s}}_t\), and is given by
\[
\mathbf{s}_t = (x_t, y_t, v_{at}, v_{rt})
\]
4.5.2 Decision variables
According to the agent the action on the environment at time \(t\) will be represented by \(u_t\), also known as the control state of the agent.
4.5.3 Implementation of the Agent / Generative Model / Internal Model
We start by specifying a probabilistic model for the agent that describes the agent’s internal beliefs over the external dynamics of the environment.
To infer goal-driven (i.e. purposeful) behavior, we add prior beliefs \(p^+(\mathbf{x})\) about desired future observations. This extends the agent model. The observation likelihood is
\[p(\mathbf{x}_k \mid \mathbf{s}_k) = \mathcal{N}(\mathbf{x}_k \mid \mathbf{s}_k,\,\mathbf{\Theta})\] where \(\mathbf{x}_k = (\chi_{1k}, \chi_{2k}, ...)\) denotes the observations of the agent after interacting with the environment.
In addition, we set a vague prior for the initial state.
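Concretely, in the notation of the create_agent code below (a sketch; mₓ and Vₓ carry the goal priors, and mₛ₍ₜ₋₁₎, Vₛ₍ₜ₋₁₎ the state prior):
\[
\begin{align}
p^+(\mathbf{x}_k) &= \mathcal{N}(\mathbf{x}_k \mid \mathbf{m}_{x_k},\, \mathbf{V}_{x_k}) \\
p(\mathbf{s}_{t-1}) &= \mathcal{N}(\mathbf{s}_{t-1} \mid \mathbf{m}_{s_{t-1}},\, \mathbf{V}_{s_{t-1}})
\end{align}
\]
with \(\mathbf{V}_{x_k}\) huge (vague) for all lookahead steps except the final one, where the mean is pinned to the target \(x_+\).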
4.5.3.1 Generative Model for the Drone
The code in the next block defines the agent's internal beliefs over the external dynamics and its probabilistic model of the environment. The model corresponds exactly to the environment because it directly reuses the functions defined above. We use the @model macro from RxInfer to define the probabilistic model, and a meta block to define approximation methods for the nonlinear state-transition functions.
In the model specification, in addition to the current state of the agent, we include beliefs over its future states (up to T steps ahead):
using RxInfer ## assumed loaded earlier in the original notebook

@model function dronenav_model(mᵤ, Vᵤ, mₓ, Vₓ, mₛ₍ₜ₋₁₎, Vₛ₍ₜ₋₁₎, T, Rᵃ)
    ## Transition function
    g = (sₜ₋₁::AbstractVector) -> begin
        sₜ = similar(sₜ₋₁) ## Next state
        sₜ = Aᵃ(sₜ₋₁, 1.0) + sₜ₋₁
        return sₜ
    end
    ## Function for modeling turn/yaw control
    h = (u::AbstractVector) -> Rᵃ(u[1])
    Γ = _γ*diageye(4) ## Transition precision (_γ set elsewhere in the notebook)
    𝚯 = _ϑ*diageye(4) ## Observation variance (_ϑ set elsewhere in the notebook)

    sₜ₋₁ ~ MvNormal(mean=mₛ₍ₜ₋₁₎, cov=Vₛ₍ₜ₋₁₎)
    sₖ₋₁ = sₜ₋₁
    local s
    ## subtract t-1 from both sides of range k = t : t+(T-1)
    ## k used for future times in series, i used for sets
    ## T used for number of time steps in lookahead time horizon
    for k in 1:T
        ## Control
        u[k] ~ MvNormal(mean=mᵤ[k], cov=Vᵤ[k])
        hIuI[k] ~ h(u[k]) where { meta=DeltaMeta(method=Unscented()) }
        ## State transition
        gIsI[k] ~ g(sₖ₋₁) where { meta=DeltaMeta(method=Unscented()) }
        ghSum[k] ~ gIsI[k] + hIuI[k]
        s[k] ~ MvNormal(mean=ghSum[k], precision=Γ)
        ## Likelihood of future observations
        x[k] ~ MvNormal(mean=s[k], cov=𝚯)
        ## Target/Goal prior
        x[k] ~ MvNormal(mean=mₓ[k], cov=Vₓ[k])
        sₖ₋₁ = s[k]
    end
    return (s, )
end
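Note that x[k] appears on the left-hand side of two ~ statements: once as the likelihood of future observations and once under the target/goal prior. Constraining the same variable from both directions is what makes inference goal-directed: minimizing free energy pulls the predicted observations toward the goal priors.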
Next, we define the agent:
function create_agent(; T=20, Rᵃ, x₊, s₀, ξ=0.1, σ=1e-4)
    ## Set control priors
    Ξ = fill(ξ, 1, 1) ## Control prior variance
    mᵤ = Vector{Float64}[ [0.0] for k=1:T ]
    Vᵤ = Matrix{Float64}[ Ξ for k=1:T ]

    ## Set target/goal priors
    Σ = σ*diageye(4) ## Target/Goal prior variance
    Σ[3, 3] = 1e4
    Σ[4, 4] = 1e4
    mₓ = [zeros(4) for k=1:T]
    mₓ[end] = x₊ ## Set prior mean to reach target/goal at t=T
    Vₓ = [huge*diageye(4) for k=1:T]
    Vₓ[end] = Σ ## Set prior variance to reach target/goal at t=T

    ## Set initial brain state prior
    mₛ₍ₜ₋₁₎ = s₀
    Vₛ₍ₜ₋₁₎ = tiny*diageye(4)

    ## Set current inference results
    result = nothing

    ## The `compute` function is the heart of the agent
    ## It calls the `RxInfer.infer` function to perform
    ## Bayesian inference by message passing
    compute = (υₜ::Float64, ŷₜ::Vector{Float64}) -> begin
        mᵤ[1] = [υₜ] ## Register action with the generative model
        Vᵤ[1] = fill(tiny, 1, 1) ## Clamp control prior to performed action
        mₓ[1] = ŷₜ ## Register observation with the generative model
        Vₓ[1] = tiny*diageye(4) ## Clamp target/goal prior to observation
        result = infer(
            model=dronenav_model(T=T, Rᵃ=Rᵃ),
            data=Dict(
                :mᵤ => mᵤ, :Vᵤ => Vᵤ,
                :mₓ => mₓ, :Vₓ => Vₓ,
                :mₛ₍ₜ₋₁₎ => mₛ₍ₜ₋₁₎,
                :Vₛ₍ₜ₋₁₎ => Vₛ₍ₜ₋₁₎))
    end

    ## The `act` function returns the inferred best possible action
    act = () -> begin
        if result !== nothing
            return mode(result.posteriors[:u][3])[1]
        else
            return 0.0 ## Without inference result we return some 'random' action
        end
    end

    ## The `future` function returns the inferred future states
    future = () -> begin
        if result !== nothing
            return getindex.(mode.(result.posteriors[:s]), 1)
        else
            return zeros(T)
        end
    end

    ## The `slide` function modifies `(mₛ₍ₜ₋₁₎, Vₛ₍ₜ₋₁₎)` for the next step
    ## and shifts (or slides) the array of future goals `(mₓ, Vₓ)`
    ## and inferred actions `(mᵤ, Vᵤ)`
    slide = () -> begin
        model = RxInfer.getmodel(result.model)
        (s, ) = RxInfer.getreturnval(model)
        varref = RxInfer.getvarref(model, s)
        var = RxInfer.getvariable(varref)
        slide_msg_idx = 3 ## This index is model dependent
        (mₛ₍ₜ₋₁₎, Vₛ₍ₜ₋₁₎) = mean_cov(getrecent(messageout(var[2], slide_msg_idx)))
        mᵤ = circshift(mᵤ, -1)
        mᵤ[end] = [0.0]
        Vᵤ = circshift(Vᵤ, -1)
        Vᵤ[end] = Ξ
        mₓ = circshift(mₓ, -1)
        mₓ[end] = x₊
        Vₓ = circshift(Vₓ, -1)
        Vₓ[end] = Σ
    end

    return (act, future, compute, slide)
end
create_agent (generic function with 1 method)
4.6 Agent Policy Evaluation
4.6.1 Training/Tuning
4.6.1.1 Naive approach
In this simulation we apply a naive action policy that commands a tight right turn at every step. Under this policy the agent should not be able to achieve its goal:
_Nⁿᵃⁱᵛᵉ = 100 ## Total simulation time
_πⁿᵃⁱᵛᵉ = -0.1 ## Naive policy for full right turn action only
_s̃₀ = [8.0, 8.0, -0.1, 0.1]
(execute_naive, observe_naive) = create_envir(; ## Let there be a world
    Rᵃ=Rᵃ, s̃₀=_s̃₀);
_yⁿᵃⁱᵛᵉ = Vector{Vector{Float64}}(undef, _Nⁿᵃⁱᵛᵉ)
for t = 1:_Nⁿᵃⁱᵛᵉ
    execute_naive(_πⁿᵃⁱᵛᵉ) ## Execute environmental process
    _yⁿᵃⁱᵛᵉ[t] = observe_naive() ## Observe external states
end
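A quick sanity check (a sketch using the observations recorded above) confirms the failure:

## Final observed (x, y) position under the naive policy; with a constant
## tight turn the drone never heads for the target at the origin
_yⁿᵃⁱᵛᵉ[end][1:2]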
In the active inference approach we create an agent that probabilistically models both the environment around itself and the best possible actions.
### Simulation parameters
## Total simulation time
_Nᵃⁱ = 200
## Lookahead time horizon
_Tᵃⁱ = 100
## Initial state
_s₀ = [8.0, 8.0, -0.1, 0.1]
## Control prior variance value
_ξ = 1.0
## Target prior variance value
_σ = 1e-6
## Target/Goal state
_x₊ = [0.0, 0.0, 0.0*π, 0.1]
4-element Vector{Float64}:
0.0
0.0
0.0
0.1
(execute_ai, observe_ai) = create_envir(; ## Let there be a world
    Rᵃ=Rᵃ, s̃₀=_s₀)
(act_ai, future_ai, compute_ai, slide_ai) = create_agent(; ## Let there be an agent
    T=_Tᵃⁱ, Rᵃ=Rᵃ, x₊=_x₊, s₀=_s₀, ξ=_ξ, σ=_σ)

## Step through experimental protocol
_as = Vector{Float64}(undef, _Nᵃⁱ) ## Actions
_fs = Vector{Vector{Float64}}(undef, _Nᵃⁱ) ## Predicted futures
_ys = Vector{Vector{Float64}}(undef, _Nᵃⁱ) ## Observations
## t used for general times in series
## N used for number of time steps in experiment
## for each t=1:N there are k=1:T future time steps (in _fs)
for t = 1:_Nᵃⁱ
    ## 1. Act-Execute-Observe: execute() & observe() from create_envir()
    _as[t] = act_ai()    ## Invoke an action from the agent
    _fs[t] = future_ai() ## Fetch the predicted future states
    execute_ai(_as[t])   ## The action influences hidden external states
    _ys[t] = observe_ai() ## Observe the current environmental outcome (update p)
    ## 2. Infer:
    compute_ai(_as[t], _ys[t]) ## Infer beliefs from current model state (update q)
    ## 3. Slide:
    slide_ai() ## Prepare for next iteration
end
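The Plots package listed in the environment earlier can be used to visualize the run. A minimal sketch (variable names follow the simulation loop above; the figures in the original write-up may differ):

## Sketch: plot the observed trajectory and the target with Plots.jl
using Plots
xs = getindex.(_ys, 1) ## observed x positions
ys = getindex.(_ys, 2) ## observed y positions
plot(xs, ys, label="drone path", xlabel="x", ylabel="y", aspect_ratio=:equal)
scatter!([_s₀[1]], [_s₀[2]], label="start")
scatter!([_x₊[1]], [_x₊[2]], label="target", marker=:star5)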