Core Mathematical Framework
This demonstration models biological learning through the Coherence-Rupture-Regeneration (CRR) framework. Unlike the spatial navigation demonstration, this one focuses on how organisms accumulate experience, undergo learning breakthroughs, and reconstruct behaviour based on weighted historical memory.
C(x,t) = ∫ L(x,τ) dτ
(Coherence: Accumulated learning through experience density)
Coherence C represents the fish's total accumulated learning. Each experience contributes a learning density L(x,τ) that integrates over time. High-stakes events (predator encounters) contribute more to L than routine exploration, creating differential memory weighting.
L(x,τ) = φ₊ - φ₋
(Memory density: Positive learning events minus stressful penalties)
The memory density captures the net learning rate at each moment. Predator proximity generates strong positive learning signals (φ₊ > 2.5), while energy depletion creates mild stress penalties (φ₋). Food discoveries provide moderate positive reinforcement.
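The accumulation of C from L can be sketched as a simple running sum over simulation timesteps. This is an illustrative discretisation, not the demonstration's actual source: the timestep and the event magnitudes below are assumptions (only φ₊ > 2.5 for predators is stated in the text).

```python
# Sketch of discretised coherence accumulation:
#   C(x,t) = ∫ L(x,τ) dτ   with   L(x,τ) = φ₊ − φ₋.
# DT and the per-event φ values are illustrative assumptions.

DT = 0.1  # simulation timestep (assumed)

def learning_density(phi_plus: float, phi_minus: float) -> float:
    """L(x,τ) = φ₊ − φ₋: net learning rate at one moment."""
    return phi_plus - phi_minus

def step_coherence(c: float, phi_plus: float, phi_minus: float) -> float:
    """One Euler step of C ← C + L·dτ."""
    return c + learning_density(phi_plus, phi_minus) * DT

c = 0.0
c = step_coherence(c, phi_plus=3.0, phi_minus=0.1)   # predator encounter: φ₊ > 2.5
c = step_coherence(c, phi_plus=0.8, phi_minus=0.05)  # food discovery: moderate φ₊
c = step_coherence(c, phi_plus=0.0, phi_minus=0.2)   # energy depletion: mild φ₋
```

High-stakes events dominate the sum, giving the differential memory weighting described above.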
δ(t-t₀) when C(x) > C_threshold
(Rupture: Discontinuous reorganisation at critical coherence)
When accumulated coherence exceeds the threshold (Ω·ln(10) ≈ 115 in this implementation), the system undergoes rupture—a learning breakthrough. Coherence resets to 35% of its previous value, but critically, the fish's behaviour has reorganised. This models phenomena like insight learning or phase transitions in skill acquisition.
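The rupture rule reduces to a threshold check and a partial reset. A minimal sketch, using the constants from the Parameter Values section (the function name and return shape are illustrative):

```python
import math

# Sketch of the rupture rule: when C crosses C_threshold,
# keep only a fixed retention fraction of coherence.
# Constants follow the Parameter Values section.

OMEGA = 50.0
C_THRESHOLD = OMEGA * math.log(10)  # ≈ 115.1
RETENTION = 0.35                    # 35% of coherence survives rupture

def maybe_rupture(coherence: float) -> tuple[float, bool]:
    """Return (new coherence, whether a rupture fired)."""
    if coherence > C_THRESHOLD:
        return coherence * RETENTION, True
    return coherence, False
```

Note that the reset is discontinuous by design: behaviour reorganises at the jump rather than degrading smoothly.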
R[χ](x,t) = ∫ φ(x,τ)·e^(C(x,τ)/Ω)·Θ(t-τ) dτ
(Regeneration: Exponentially-weighted memory reconstruction)
Behaviour is continuously reconstructed through the regeneration operator R. Past experiences φ(x,τ) are weighted by accumulated coherence through the exponential term e^(C/Ω), where Ω = 50 acts as a temperature parameter. The Heaviside function Θ(t-τ) enforces causality—only the past influences the present, creating temporal asymmetry characteristic of biological systems.
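Discretised over a memory trace, R[χ] becomes a causally-masked weighted sum. The sketch below folds in the temporal decay exp(−α·Δt) mentioned in the Biological Plausibility section; the data structure and names are assumptions for illustration:

```python
import math
from dataclasses import dataclass

# Sketch of the regeneration operator, discretised:
#   R[χ](t) ≈ Σ_τ φ(x,τ)·e^(C(x,τ)/Ω)·e^(−α·(t−τ))   for τ < t,
# where the Heaviside factor Θ(t−τ) is enforced by skipping
# events at or after the present.

OMEGA = 50.0  # temperature parameter
ALPHA = 0.05  # temporal decay rate

@dataclass
class MemoryEvent:
    t: float          # time of the experience τ
    phi: float        # learning signal φ(x,τ)
    coherence: float  # C(x,τ) when the event was stored

def regenerate(events: list[MemoryEvent], now: float) -> float:
    """Memory-weighted reconstruction of behaviour at time `now`."""
    total = 0.0
    for e in events:
        if e.t >= now:  # Θ(t−τ): only the past contributes
            continue
        weight = math.exp(e.coherence / OMEGA) * math.exp(-ALPHA * (now - e.t))
        total += e.phi * weight
    return total
```

Events stored at high coherence carry exponentially larger weight, which is what makes the later behaviour non-Markovian.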
From Markovian to Non-Markovian Behaviour
The fish begins with purely Markovian behaviour—current actions depend only on immediate sensory input. However, as coherence accumulates, the system becomes increasingly non-Markovian:
- C ≈ 0 (Early behaviour): Random exploration, e^(C/Ω) ≈ 1, minimal memory influence
- C ≈ 50 (Intermediate): e^(C/Ω) ≈ 2.7, past experiences begin shaping decisions
- C > 100 (Advanced): e^(C/Ω) > 7.4, strong memory weighting guides sophisticated avoidance and approach behaviours
The intelligence parameter I = tanh(C/Ω) captures this transition, approaching 1 as the fish becomes fully history-aware. This creates a smooth continuum from reactive (Markovian) to anticipatory (non-Markovian) behaviour.
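The transition above is a one-liner; the checkpoint values in the bullets can be verified directly (function name is illustrative):

```python
import math

# Sketch of the intelligence parameter I = tanh(C/Ω), mapping
# accumulated coherence onto a 0→1 Markovian→non-Markovian scale.

OMEGA = 50.0

def intelligence(coherence: float) -> float:
    return math.tanh(coherence / OMEGA)

# C ≈ 0:   I ≈ 0.00  (purely reactive)
# C = 50:  I ≈ 0.76  (memory begins shaping decisions)
# C = 115: I ≈ 0.98  (near the rupture threshold, strongly history-aware)
```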
Biological Plausibility
This mathematical structure captures several phenomena observed in biological learning:
- Differential encoding: High-stakes events (predation risk) create stronger memories than neutral experiences
- Consolidation: The exponential weighting e^(C/Ω) mirrors how neural representations strengthen with repeated activation
- Insight learning: Rupture events model sudden behavioural reorganisations documented in animal cognition
- Temporal decay: The regeneration operator includes temporal decay (not written in R[χ] above, but implemented as exp(-α·Δt)), matching synaptic weakening
- Non-stationarity: Unlike fixed reward functions in RL, learning rate and behavioural policy evolve continuously
Comparison with Reinforcement Learning
Traditional RL approaches differ fundamentally from CRR dynamics:
- CRR has no reward function: Behaviour emerges from coherence accumulation rather than reward maximisation
- No value function: The fish doesn't estimate future returns; decisions arise from memory-weighted gradients
- No policy optimisation: There is no fixed policy being improved—behaviour continuously reconstructs through R[χ]
- Non-stationary learning: The effective "learning rate" scales with I = tanh(C/Ω), creating adaptive temporal structure
- Rupture as feature: Discontinuous reorganisations are essential, not pathological
Observable Phenomena
Watch for these emergent behaviours as coherence accumulates:
- Improved avoidance: Initially, the fish may swim towards predators. As C increases, avoidance becomes more reliable and occurs at greater distances
- Directed foraging: Early food discoveries are accidental. With higher C, the fish begins seeking food more deliberately
- Post-rupture refinement: After rupture, behaviour often becomes more efficient—the fish has "learnt how to learn"
- Coherence field radius: The purple glow visualises accumulated coherence spatially, showing how learning becomes embedded in the environment
- Memory vectors: Pink arrows show regeneration influences—past high-value locations pull current behaviour
Parameter Values
This implementation uses the following constants:
- Ω = 50 (Temperature parameter controlling memory weighting sensitivity)
- C_threshold = Ω·ln(10) ≈ 115 (Rupture trigger)
- Rupture retention = 0.35 (35% of coherence preserved after rupture)
- Temporal decay α = 0.05 (Memory weakening rate)
- Spatial memory range = 150 units (Maximum influence distance for regeneration)
- Memory trace capacity = 500 events (Sliding window of historical states)
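Collected as one structure, these constants look like the following. The values copy the list above; the dict itself is an illustrative configuration shape, not the demonstration's source:

```python
import math

# CRR constants from the Parameter Values section,
# gathered into one illustrative configuration dict.

CRR_PARAMS = {
    "omega": 50.0,                       # temperature parameter Ω
    "c_threshold": 50.0 * math.log(10),  # rupture trigger ≈ 115.1
    "rupture_retention": 0.35,           # fraction of C kept after rupture
    "alpha": 0.05,                       # temporal decay rate
    "spatial_range": 150.0,              # max regeneration distance (units)
    "trace_capacity": 500,               # sliding-window memory size (events)
}
```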
Interpretation Guide
The demonstration visualises several key components:
- Purple coherence field: Radius scales with C, showing total accumulated learning
- Pink trail: Memory trace showing recent trajectory
- Yellow/green particles: Learning events (food discoveries, successful avoidance)
- Pink explosion: Rupture event—behavioural reorganisation
- Real-time metrics: Track C(x), L(x,τ), R[χ], and rupture count as learning progresses
This demonstration shows that sophisticated adaptive behaviour can emerge from simple mathematical principles governing coherence accumulation, critical transitions, and exponentially-weighted memory regeneration—without explicit programming of survival strategies or reward-based optimisation.