Core Mathematical Framework
This demonstration models biological learning through the Coherence-Rupture-Regeneration (CRR) framework. Unlike the spatial navigation demonstration, this one focuses on how organisms accumulate experience, undergo learning breakthroughs, and reconstruct behaviour based on weighted historical memory.
C(x,t) = ∫ L(x,τ) dτ
(Coherence: Accumulated learning through experience density)
Coherence C represents the fish's total accumulated learning. Each experience contributes a learning density L(x,τ) that integrates over time. High-stakes events (predator encounters) contribute more to L than routine exploration, creating differential memory weighting.
L(x,τ) = φ₊ - φ₋
(Memory density: Positive learning events minus stressful penalties)
The memory density captures the net learning rate at each moment. Predator proximity generates strong positive learning signals (φ₊ > 2.5), while energy depletion creates mild stress penalties (φ₋). Food discoveries provide moderate positive reinforcement.
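The accumulation of C from L can be sketched as a simple running sum over simulation timesteps. This is an illustrative discretisation, not the demonstration's actual source: the timestep and the event magnitudes below are assumptions (only φ₊ > 2.5 for predators is stated in the text).

```python
# Sketch of discretised coherence accumulation:
#   C(x,t) = ∫ L(x,τ) dτ   with   L(x,τ) = φ₊ − φ₋.
# DT and the per-event φ values are illustrative assumptions.

DT = 0.1  # simulation timestep (assumed)

def learning_density(phi_plus: float, phi_minus: float) -> float:
    """L(x,τ) = φ₊ − φ₋: net learning rate at one moment."""
    return phi_plus - phi_minus

def step_coherence(c: float, phi_plus: float, phi_minus: float) -> float:
    """One Euler step of C ← C + L·dτ."""
    return c + learning_density(phi_plus, phi_minus) * DT

c = 0.0
c = step_coherence(c, phi_plus=3.0, phi_minus=0.1)   # predator encounter: φ₊ > 2.5
c = step_coherence(c, phi_plus=0.8, phi_minus=0.05)  # food discovery: moderate φ₊
c = step_coherence(c, phi_plus=0.0, phi_minus=0.2)   # energy depletion: mild φ₋
```

High-stakes events dominate the sum, giving the differential memory weighting described above.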
δ(t-t₀) when C(x) > C_threshold
(Rupture: Discontinuous reorganisation at critical coherence)
When accumulated coherence exceeds the threshold (Ω·ln(10) ≈ 115 in this implementation), the system undergoes rupture—a learning breakthrough. Coherence resets to 35% of its previous value, but critically, the fish's behaviour has reorganised. This models phenomena like insight learning or phase transitions in skill acquisition.
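The rupture rule reduces to a threshold check and a partial reset. A minimal sketch, using the constants from the Parameter Values section (the function name and return shape are illustrative):

```python
import math

# Sketch of the rupture rule: when C crosses C_threshold,
# keep only a fixed retention fraction of coherence.
# Constants follow the Parameter Values section.

OMEGA = 50.0
C_THRESHOLD = OMEGA * math.log(10)  # ≈ 115.1
RETENTION = 0.35                    # 35% of coherence survives rupture

def maybe_rupture(coherence: float) -> tuple[float, bool]:
    """Return (new coherence, whether a rupture fired)."""
    if coherence > C_THRESHOLD:
        return coherence * RETENTION, True
    return coherence, False
```

Note that the reset is discontinuous by design: behaviour reorganises at the jump rather than degrading smoothly.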
R[χ](x,t) = ∫ φ(x,τ)·e^(C(x,τ)/Ω)·Θ(t-τ) dτ
(Regeneration: Exponentially-weighted memory reconstruction)
Behaviour is continuously reconstructed through the regeneration operator R. Past experiences φ(x,τ) are weighted by accumulated coherence through the exponential term e^(C/Ω), where Ω = 50 acts as a temperature parameter. The Heaviside function Θ(t-τ) enforces causality—only the past influences the present, creating temporal asymmetry characteristic of biological systems.
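Discretised over a memory trace, R[χ] becomes a causally-masked weighted sum. The sketch below folds in the temporal decay exp(−α·Δt) mentioned in the Biological Plausibility section; the data structure and names are assumptions for illustration:

```python
import math
from dataclasses import dataclass

# Sketch of the regeneration operator, discretised:
#   R[χ](t) ≈ Σ_τ φ(x,τ)·e^(C(x,τ)/Ω)·e^(−α·(t−τ))   for τ < t,
# where the Heaviside factor Θ(t−τ) is enforced by skipping
# events at or after the present.

OMEGA = 50.0  # temperature parameter
ALPHA = 0.05  # temporal decay rate

@dataclass
class MemoryEvent:
    t: float          # time of the experience τ
    phi: float        # learning signal φ(x,τ)
    coherence: float  # C(x,τ) when the event was stored

def regenerate(events: list[MemoryEvent], now: float) -> float:
    """Memory-weighted reconstruction of behaviour at time `now`."""
    total = 0.0
    for e in events:
        if e.t >= now:  # Θ(t−τ): only the past contributes
            continue
        weight = math.exp(e.coherence / OMEGA) * math.exp(-ALPHA * (now - e.t))
        total += e.phi * weight
    return total
```

Events stored at high coherence carry exponentially larger weight, which is what makes the later behaviour non-Markovian.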
From Markovian to Non-Markovian Behaviour
The fish begins with purely Markovian behaviour—current actions depend only on immediate sensory input. However, as coherence accumulates, the system becomes increasingly non-Markovian:
- C ≈ 0 (Early behaviour): Random exploration, e^(C/Ω) ≈ 1, minimal memory influence
- C ≈ 50 (Intermediate): e^(C/Ω) ≈ 2.7, past experiences begin shaping decisions
- C > 100 (Advanced): e^(C/Ω) > 7.4, strong memory weighting guides sophisticated avoidance and approach behaviours
The intelligence parameter I = tanh(C/Ω) captures this transition, approaching 1 as the fish becomes fully history-aware. This creates a smooth continuum from reactive (Markovian) to anticipatory (non-Markovian) behaviour.
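The transition above is a one-liner; the checkpoint values in the bullets can be verified directly (function name is illustrative):

```python
import math

# Sketch of the intelligence parameter I = tanh(C/Ω), mapping
# accumulated coherence onto a 0→1 Markovian→non-Markovian scale.

OMEGA = 50.0

def intelligence(coherence: float) -> float:
    return math.tanh(coherence / OMEGA)

# C ≈ 0:   I ≈ 0.00  (purely reactive)
# C = 50:  I ≈ 0.76  (memory begins shaping decisions)
# C = 115: I ≈ 0.98  (near the rupture threshold, strongly history-aware)
```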
Biological Plausibility
This mathematical structure captures several phenomena observed in biological learning:
- Differential encoding: High-stakes events (predation risk) create stronger memories than neutral experiences
- Consolidation: The exponential weighting e^(C/Ω) mirrors how neural representations strengthen with repeated activation
- Insight learning: Rupture events model sudden behavioural reorganisations documented in animal cognition
- Temporal decay: The regeneration operator includes temporal decay (not written in R[χ] above, but implemented as exp(-α·Δt)), matching synaptic weakening
- Non-stationarity: Unlike fixed reward functions in RL, learning rate and behavioural policy evolve continuously
Comparison with Reinforcement Learning
Traditional RL approaches differ fundamentally from CRR dynamics:
- CRR has no reward function: Behaviour emerges from coherence accumulation rather than reward maximisation
- No value function: The fish doesn't estimate future returns; decisions arise from memory-weighted gradients
- No policy optimisation: There is no fixed policy being improved—behaviour continuously reconstructs through R[χ]
- Non-stationary learning: The effective "learning rate" scales with I = tanh(C/Ω), creating adaptive temporal structure
- Rupture as feature: Discontinuous reorganisations are essential, not pathological
Observable Phenomena
Watch for these emergent behaviours as coherence accumulates:
- Improved avoidance: Initially, the fish may swim towards predators. As C increases, avoidance becomes more reliable and occurs at greater distances
- Directed foraging: Early food discoveries are accidental. With higher C, the fish begins seeking food more deliberately
- Post-rupture refinement: After rupture, behaviour often becomes more efficient—the fish has "learnt how to learn"
- Coherence field radius: The purple glow visualises accumulated coherence spatially, showing how learning becomes embedded in the environment
- Memory vectors: Pink arrows show regeneration influences—past high-value locations pull current behaviour
Parameter Values
This implementation uses the following constants:
- Ω = 50 (Temperature parameter controlling memory weighting sensitivity)
- C_threshold = Ω·ln(10) ≈ 115 (Rupture trigger)
- Rupture retention = 0.35 (35% of coherence preserved after rupture)
- Temporal decay α = 0.05 (Memory weakening rate)
- Spatial memory range = 150 units (Maximum influence distance for regeneration)
- Memory trace capacity = 500 events (Sliding window of historical states)
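Collected as one structure, these constants look like the following. The values copy the list above; the dict itself is an illustrative configuration shape, not the demonstration's source:

```python
import math

# CRR constants from the Parameter Values section,
# gathered into one illustrative configuration dict.

CRR_PARAMS = {
    "omega": 50.0,                       # temperature parameter Ω
    "c_threshold": 50.0 * math.log(10),  # rupture trigger ≈ 115.1
    "rupture_retention": 0.35,           # fraction of C kept after rupture
    "alpha": 0.05,                       # temporal decay rate
    "spatial_range": 150.0,              # max regeneration distance (units)
    "trace_capacity": 500,               # sliding-window memory size (events)
}
```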
Interpretation Guide
The demonstration visualises several key components:
- Purple coherence field: Radius scales with C, showing total accumulated learning
- Pink trail: Memory trace showing recent trajectory
- Yellow/green particles: Learning events (food discoveries, successful avoidance)
- Pink explosion: Rupture event—behavioural reorganisation
- Real-time metrics: Track C(x), L(x,τ), R[χ], and rupture count as learning progresses
This demonstration shows that sophisticated adaptive behaviour can emerge from simple mathematical principles governing coherence accumulation, critical transitions, and exponentially-weighted memory regeneration—without explicit programming of survival strategies or reward-based optimisation.