When Brains Teach Machines: Notes from the Uneasy Borderland

Brains Don’t Compute Like We Think, Yet They Keep Predicting

Strip away the marketing gloss and you still have the weird fact: the brain runs on about 20 watts, keeps a body alive, models a world that never repeats, and does not crash when the light changes. That constraint space—energy, noise, partial information—is where neuroscience and machine learning quietly meet. Not as mirrors. As neighbors who share a fence. The cortex looks less like a file system and more like a rumor mill: layered cells exchanging graded probabilities, betting on the next millisecond. Inference as metabolism. Memory as constraint rather than static content.

Call it predictive coding if you like. Sensory areas don’t wait for data; they predict and correct. Surprise drives learning. You can feel this in the jolts—optical illusions that refuse to resolve, the face you “see” before it’s there. Modern sequence models echo the rhythm: compress the stream, anticipate the next token, revise the prior. A transformer is no cortex, but both earn their keep by minimizing error under a budget. The difference is spiritual, almost: neurons trade in relations, rhythms, and local signals; transformers in global attention and fast, clean gradients. Still, the family resemblance is not accidental.
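For the concretely minded, here is that loop in miniature: one layer inferring hidden causes by chewing on its own residual error, revising belief fast and weights slow. The toy world, the sizes, and the rates are illustrative assumptions, not anyone's published cortical model.

```python
import numpy as np

# A toy predict-then-correct loop. The layer carries a generative model W,
# guesses its input, and lets the residual error revise first the belief
# (fast) and then the weights (slow). World, sizes, rates: all invented.
rng = np.random.default_rng(0)
n_input, n_latent = 16, 4
W_true = rng.normal(scale=0.25, size=(n_input, n_latent))  # the world's hidden structure
W = rng.normal(scale=0.05, size=(n_input, n_latent))       # the layer's current model

def settle(x, W, steps=40, lr_latent=0.05):
    """Infer a latent cause by iteratively shrinking prediction error."""
    z = np.zeros(n_latent)
    for _ in range(steps):
        error = x - W @ z                # surprise: what the prediction missed
        z += lr_latent * (W.T @ error)   # revise the belief, not the data
    return z, error

lr_weights = 0.01
for t in range(3000):
    cause = rng.normal(size=n_latent)                        # hidden state of the world
    x = W_true @ cause + rng.normal(scale=0.02, size=n_input)
    z, error = settle(x, W)
    W += lr_weights * np.outer(error, z)                     # slow learning from leftover surprise

print("residual surprise:", np.linalg.norm(error).round(3))
```

Note the division of labor: inference runs many fast steps per observation, learning takes one slow step per observation. That is the budget the paragraph above is gesturing at.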

This lens sits on a broader thesis: information might be the substrate; matter the behavior of information under constraint. If so, the brain is a device for sculpting constraints—synapses that don’t “store” images but reshape flows. Synaptic plasticity becomes a living prior. Sleep becomes maintenance of these priors. The self? A temporary compression that keeps the organism coherent across scattered inputs. That picture, contentious but productive, clarifies why AI trained on vast text caches can feel smart yet brittle: a torrent of patterns without slow-baked organismal constraints.

There’s technique under the poetry. Hippocampal replay anticipated experience replay in deep RL. Mushroom body circuits in flies nudged machine learning toward sparse representations and winner-take-all schemes. Spiking neural networks and neuromorphic hardware emerged from the hunch that time itself—delays, bursts, phase—carries computation. And the bridge keeps lengthening. If you want a long read that refuses to tidy the bridge too early, start here on neuroscience and artificial intelligence.
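As a gesture at the first of those borrowings, here is a minimal replay buffer of the kind deep RL lifted from hippocampal replay. The capacity and batch size are arbitrary illustrative choices.

```python
import random
from collections import deque

# Minimal experience replay: store transitions as they happen, then
# re-present them out of order so a slow learner sees a decorrelated
# stream -- the ML analogue of offline hippocampal sweeps.
class ReplayBuffer:
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)   # oldest memories fall off the end

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Random sampling breaks the temporal correlations that
        # destabilize purely online updates.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

buffer = ReplayBuffer()
for t in range(100):
    buffer.store(state=t, action=t % 4, reward=float(t % 2), next_state=t + 1, done=False)
batch = buffer.sample(8)
print(len(batch), "transitions replayed out of order")
```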

Learning Signals, Not Magic: Plasticity, Credit, and World Models

Backprop is a marvel, but it cheats—omniscient gradients, crystal-clear credit assignment. Neurons get no such privilege. They improvise. Dopamine spikes in the striatum carry a rough temporal-difference error; acetylcholine shifts plasticity windows; norepinephrine marks “pay attention, this matters.” No single god-signal. A messy parliament of modulators. That mess nudges AI toward local learning rules, meta-learning, and algorithms that do not assume the network can see itself from above.
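Here is the dopamine story's ML shadow, assuming a toy chain of states: one scalar temporal-difference error does all the teaching, no omniscient gradient in sight. The chain, the discount, and the learning rate are invented for illustration.

```python
import numpy as np

# Tabular TD(0): the scalar delta plays the role the striatal dopamine
# signal is thought to approximate. Only the last transition pays off,
# yet value creeps backward along the chain, trial by trial.
n_states, gamma, alpha = 5, 0.9, 0.1
V = np.zeros(n_states)                       # learned state values

def td_step(s, r, s_next):
    delta = r + gamma * V[s_next] - V[s]     # reward-prediction error
    V[s] += alpha * delta                    # one broadcast scalar does the teaching
    return delta

for episode in range(200):
    for s in range(n_states - 1):
        r = 1.0 if s == n_states - 2 else 0.0
        td_step(s, r, s + 1)

print("values learned from a single scalar signal:", V.round(2))
```

No unit ever sees the whole network; each cell needs only its own prediction, its successor's, and the broadcast delta. That locality is the point.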

Consider consolidation. Brains learn fast and slow at once. Rapid encoding in the hippocampus; slow cortical integration during sleep and quiet wake. Replay stitches episodes into structure. Deep RL copied a fraction of this—experience replay, target networks, distilled policies—but the richer point remains: organisms use many clocks. Short-term synaptic dynamics, day-scale homeostasis, life-scale development. Building systems that span clocks isn’t an elegance tax; it’s the price of stability. Models that only sprint never learn to walk.
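The two-clock idea, reduced to a sketch: fast weights chase every sample, slow weights consolidate them by exponential moving average, loosely the hippocampus-and-cortex split and concretely the target-network trick. All rates here are illustrative assumptions.

```python
import numpy as np

# Two clocks in one learner: fast weights do rapid, jittery encoding of
# each example; slow weights integrate them with a long time constant,
# like consolidation during quiet wake. The slow copy ends up closer to
# the truth than the fast one that actually did the learning.
rng = np.random.default_rng(1)
dim = 8
w_true = rng.normal(size=dim)
w_fast = np.zeros(dim)
w_slow = np.zeros(dim)
lr_fast, tau = 0.5, 0.01          # fast encoding vs. slow consolidation

for step in range(5000):
    x = rng.normal(size=dim)
    y = w_true @ x + rng.normal(scale=0.5)      # noisy supervision
    err = y - w_fast @ x
    w_fast += lr_fast * err * x / (x @ x)       # sprint: normalized online update
    w_slow = (1 - tau) * w_slow + tau * w_fast  # walk: averaged consolidation

print("fast error:", np.linalg.norm(w_fast - w_true).round(3))
print("slow error:", np.linalg.norm(w_slow - w_true).round(3))
```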

Then there’s embodiment. A cortex read in isolation is a misread. Bodies supply low-dimensional priors: geometry, friction, mortality. Try learning “throw and catch” from text and you get slogans; add tendon feedback and you get wisdom. World models grow tight when prediction is punished by physics. In robotics, closed-loop control with learned dynamics pushes beyond reward hacking toward earned competence. In language systems, grounding even in small sensorimotor loops—point, see, correct—changes what a token predicts. The map remembers the walk.
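A miniature of prediction punished by physics, assuming a linear damped-oscillator stand-in for the world and random motor babble for exploration. Nothing here is a real robot, but the loop is the loop.

```python
import numpy as np

# Closed-loop system identification in miniature: the model predicts the
# next state, physics answers, and the residual corrects the model.
# The oscillator "plant" and every gain are illustrative assumptions.
rng = np.random.default_rng(2)
A = np.array([[0.99, 0.10], [-0.10, 0.98]])    # stand-in physics
B = np.array([[0.00], [0.10]])
A_hat = np.zeros((2, 2)); B_hat = np.zeros((2, 1))
x = np.array([1.0, 0.0])
lr = 0.05

for t in range(5000):
    u = rng.normal(size=1)                     # exploratory motor babble
    x_pred = A_hat @ x + B_hat @ u             # the model's guess
    x_next = A @ x + B @ u                     # physics' verdict
    err = x_next - x_pred                      # punishment by physics
    A_hat += lr * np.outer(err, x)             # correct the map, keep walking
    B_hat += lr * np.outer(err, u)
    x = x_next

print("dynamics model error:", np.linalg.norm(A_hat - A).round(3))
```

Drop the babble and the model only learns the one groove it was driven along; exploration is what makes the map general.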

Credit assignment without backprop? Several paths. Synthetic gradients. Equilibrium propagation. Three-factor rules pairing pre/post activity with a scalar modulation—closer to how cortical and basal ganglia circuits might negotiate change. Reservoir and liquid-state machines hint that well-chosen dynamics plus sparse plasticity can do a lot of work. Neuromorphic chips—Loihi-class—play with spike timing and energy budgets that line up with biology’s harsher economics. None are turnkey. Yet each chips away at the myth that “general intelligence” rides on a single gradient and a single clock. The drift is visible: from monoliths to ecologies of learners, glued by imperfect signals, hardened by time.
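A three-factor rule in caricature, under loud assumptions: pre/post coincidence leaves an eligibility trace, and a delayed scalar verdict, dopamine-like, decides whether the trace gets written into the weights. The task, constants, and noise model are all invented for the sketch.

```python
import numpy as np

# Three-factor plasticity toy: a single unit must fire for pattern A and
# stay silent for pattern B. Coactivity (factors 1 and 2) charges an
# eligibility trace; a scalar reward broadcast (factor 3) gates the write.
rng = np.random.default_rng(3)
n_pre = 20
w = rng.normal(scale=0.1, size=n_pre)
trace = np.zeros_like(w)
decay, lr = 0.5, 0.02
pat_a = (rng.random(n_pre) > 0.5).astype(float)   # should drive firing
pat_b = (rng.random(n_pre) > 0.5).astype(float)   # should not

for step in range(4000):
    target = rng.integers(2)                      # 1 -> pattern A trial
    pre = pat_a if target else pat_b
    drive = w @ pre + rng.normal(scale=0.5)       # noisy integration doubles as exploration
    post = float(drive > 0)
    trace = decay * trace + (post - 0.5) * pre    # factors 1 & 2: signed coincidence
    modulator = 1.0 if post == target else -1.0   # factor 3: late scalar verdict
    w += lr * modulator * trace                   # write only when modulated

print("fires for A:", bool(w @ pat_a > 0), "| fires for B:", bool(w @ pat_b > 0))
```

No gradient, no global view: the trace buys time so that a verdict arriving after the fact can still credit the synapses that earned it.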

Moral Memory, Incentives, and the Hard Problem We Keep Dodging

We talk safety and governance as if policy patches could turn extractive incentives into steady virtues. Not likely. Brains do not hold values as modular software; they inherit them as slow, communal moral memory. Rituals, taboos, stories—compression schemes evolved to keep a tribe coherent across generations. You can read religion that way without sneering or kneeling: a ledger of costly experiences the living wouldn’t choose to relearn. Machines trained on fresh internet pulp and quarterly objectives lack that sediment. They perform alignment as style.

So yes, audits and evals matter. But evaluating a system that can generate any sentence with equal grace is a mirror trick. Metrics collapse context. “Harmlessness” in a vacuum is not a property—it’s an absence. The stronger route is to reduce the vacuum. Embed models in institutions with skin in the game; expose them to real constraints and consequences over time; let reputation and memory form. Not PR memory. Operational memory: logs, interpretability traces, reproducible origins of behavior. Interpretability then isn’t for dashboard theater; it is how a community decides who is responsible when the system surprises itself.

Here, open science is less ideology than survival. Black-box giants force a faith economy. Withholding weights, training data, and objective functions creates a governance theater where authorities bless what they cannot inspect. Biology doesn’t run on NDAs. Reproducibility—share the protocol, share the error bars, share the failure cases—has been our only defense against elegant nonsense. The same, or more, for learning systems that will mediate credit, blame, and belonging.

There’s a practical angle. If neural systems are assemblages of local learners, then community oversight beats centralized guardianship. Universities, civic labs, hospitals, small manufacturers—places with long memory—could fine-tune models under domain constraints, iteratively, with clear provenance. Call it federated moral memory: different clocks, different costs, shared audits. It’s less efficient in the quarter. It pays in decades. And it asks a technical question still open: how to encode norms as constraints that update slowly, without freezing innovation or laundering power. The answer might require a more faithful import from biology—plasticity with vetoes, forgetting that is not erasure but slack, attention that can be earned back. Or something we haven’t named yet.

Rohan Deshmukh

Pune-raised aerospace coder currently hacking satellites in Toulouse. Rohan blogs on CubeSat firmware, French pastry chemistry, and minimalist meditation routines. He brews single-origin chai for colleagues and photographs jet contrails at sunset.
