A Checkpoint in My Thinking: Curiosity, Intelligence, and the Spiral That Keeps Closing

I recently read Michael Timothy Bennett’s doctoral thesis, How to Build Conscious Machines. Finishing it did not leave me feeling "done." It left me feeling "pulled." Pulled into a very particular kind of clarity: the sense that several different intellectual traditions are starting to rhyme.

Not in the shallow way where everything sounds similar if you squint hard enough. More like this: when you try to explain intelligence well enough to build it, you keep rediscovering the same structural necessities. And when you build systems that approximate intelligence, you keep being forced to revise what you thought intelligence was.

That loop is what I want to record here as a checkpoint.

This is not a review in the usual sense. It is closer to a personal map: how I am currently connecting curiosity research, modern AI (LLMs, world models, agents), control theory, and Popper/Deutsch-style epistemology into one converging picture.

The claim I am testing is simple:

Intelligence is not primarily about having the right answers. It is about having the weakest constraints that still work, and then learning to place those constraints deeper and deeper in the stack of a system, so the whole organism (or agent) can keep adapting.

And curiosity is not a poetic personality trait. It is a control signal that tells a system where its constraints are too tight.

What I mean by "convergence" #

When I say "convergence," I do not mean consensus. I mean something more technical:

Different frameworks are getting forced into the same minimal explanatory skeleton.

They use different vocabularies:

  • Neural networks: representation learning, generalization, emergent abilities
  • Cognitive science: predictive processing, relevance realization
  • Neuroscience: reward circuits, learning, memory consolidation
  • Cybernetics: feedback, control, stability, adaptation
  • Philosophy of science: conjecture, refutation, explanation, counterfactuals

But they keep orbiting the same questions:

  • How does a system survive uncertainty without hard-coding every response?
  • How does it generalize beyond the training distribution of its own life?
  • Why does it actively seek information instead of passively waiting for data?
  • Why do value and meaning refuse to stay separated?

Bennett’s thesis hit me because it tries to put a single formal spine under many of these questions. It does not just offer a new metaphor. It tries to offer a rule.

The first jolt: w-maxing, not simp-maxing #

The sharpest idea I took from Bennett is his central contrast between what he calls "simp-maxing" and "w-maxing."

"Simp-maxing" is the instinct most of us have been trained to respect: prefer simpler hypotheses, shorter descriptions, cleaner rules. It is the vibe of Occam’s Razor.

Bennett’s pushback is basically: that is a property of form, not function. Generalization is about function. There is no deep reason that simplicity of form must correlate with generality of function. When the two do correlate, the correlation may be contingent on the particular stack of constraints that interpret a policy, not a universal rule.

So he proposes a different optimization target: choose the weakest constraints that still solve the task.

In my own words: a better explanation is not "shorter." It is "less committed."

This is subtle, but it changes everything.

If you commit to too much structure, you will be correct in a narrow slice of the world and brittle everywhere else. If you commit only to what is necessary, you can survive more worlds with the same internal machinery.

Bennett reports that, in his experiments, w-maxing generalizes faster than simp-maxing (he gives a range on the order of 110% to 500%). That number is not the important part. The direction is. It supports a mental pivot I have been slowly making: generalization is about least commitment, not shortest code.

This idea also explains a common pattern in modern ML: systems that perform spectacularly on benchmark-like distributions yet fail under small shifts. It is not always because they are "not big enough." Sometimes it is because they have learned constraints that are too specific. They over-commit.
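To make the contrast concrete, here is a toy sketch of my own (the hypotheses, the made-up description lengths, and the weakness measure are mine, not Bennett's formalism). Each hypothesis is the set of input-output pairs it permits; simp-maxing ranks by description length, w-maxing by how little the hypothesis forbids beyond what the data demands.

```python
# Toy illustration only; not Bennett's formalism. Each hypothesis is the set
# of (input, output) pairs it permits. It "fits" if it permits every training
# pair. Weakness = how many pairs it permits, i.e. how little it commits to
# beyond the data. Description lengths are made-up stand-ins for program size.

INPUTS = [0, 1, 2, 3]
TRAIN = [(0, 0), (3, 1)]
ALL_PAIRS = {(x, y) for x in INPUTS for y in (0, 1)}

hypotheses = {
    # name: (description_length, permitted pairs)
    "memorize":    (2, set(TRAIN)),                           # only the pairs it has seen
    "threshold":   (6, {(x, int(x >= 2)) for x in INPUTS}),   # commits to an answer everywhere
    "weakest-fit": (9, ALL_PAIRS - {(0, 1), (3, 0)}),         # forbids only what the data forbids
}

def fits(permitted):
    return all(pair in permitted for pair in TRAIN)

valid = {name: h for name, h in hypotheses.items() if fits(h[1])}

simp_max = min(valid, key=lambda name: valid[name][0])        # shortest description wins
w_max = max(valid, key=lambda name: len(valid[name][1]))      # weakest constraint wins

print("simp-maxing picks:", simp_max)   # memorize: short, narrow, brittle
print("w-maxing picks:   ", w_max)      # weakest-fit: least committed hypothesis that still works
```

In this toy, the shortest hypothesis is also the most committed one, which is exactly the divergence Bennett is pointing at.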

So if I had to distill this checkpoint into one sentence, it might be:

Progress toward intelligence looks like learning to relax constraints while staying correct.

Why this resonates with Hinton (and why "autocomplete" misses the point) #

This is where I feel a genuine bridge to Hinton’s stance against the dismissive claim "LLMs are just statistical autocomplete."

I do not think Hinton’s core point is that LLMs are secretly human. I think the core point is that the internal structure matters more than the interface.

A model trained on next-token prediction can still develop high-dimensional features that are compositional, reusable, and context-sensitive. If those features can combine into stable meaning structures, then "autocomplete" is describing the training interface, not the cognitive phenomenon.

Bennett gives me a language to say this more precisely: "understanding" is not symbol lookup; it is embodied in a policy, in a system constrained to behave coherently across contexts. The question is not whether it predicts the next token. The question is whether the constraints it has learned are weak enough to transfer.

So the productive debate is not "does it understand?" in a vague humanistic sense.

A better question is: what kind of constraints did it internalize, and how weak are they?

If LLMs really are learning weak constraints in some domains, that is a real form of understanding. If they are learning brittle shortcuts in other domains, that is not.

Same architecture. Different learned constraints. Same word "understanding." Different reality.

The second jolt: the Law of the Stack and delegated adaptation #

The next idea that snapped into place for me is what Bennett calls the Law of the Stack.

A short version is: adaptability at higher levels depends on adaptability at lower levels. If the lower layers are rigid, the whole stack becomes brittle.

This matches a frustration I have had for a while with modern AI:

Many current systems adapt mostly at high abstraction layers.

  • Pretraining updates parameters (huge adaptation, but offline).
  • Fine-tuning updates parameters (smaller adaptation, still offline).
  • Inference largely freezes the machinery (online behavior, but the adaptation budget is limited).

Biological systems are the opposite. They delegate adaptation down the stack:

  • Cells adapt.
  • Networks of cells adapt.
  • Tissues adapt.
  • The organism adapts.
  • Sometimes the organism even changes the environment to make its own future easier.

When adaptation happens at many levels simultaneously, weak constraints can take simple forms at each level. When adaptation is centralized, weak constraints become expensive to represent and maintain.

This is why Bennett’s argument that biology is, in certain senses, "more adaptable" than conventional computers feels plausible even before you decide whether you accept his consciousness claims. It points to a structural asymmetry, not a mystical one: where and how adaptation is implemented.

This is the first place where "control theory" stops being an abstract metaphor and becomes an engineering constraint:

It is not enough to have feedback. You need feedback at the right level.
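Here is a toy sketch of what "feedback at the right level" buys you (entirely my own construction, nothing from the thesis): a frozen high-level policy sits on top of a low-level controller that never stops adapting, and the stack recovers from a drift the top layer cannot even see.

```python
# Toy sketch, my own construction: delegated adaptation as a two-level stack.
# The high-level policy is frozen; the low-level controller keeps adapting
# online and absorbs an unmodelled drift in the world.

def high_level_policy(goal):
    # Frozen after "training": always requests the same setpoint.
    return goal

class LowLevelController:
    """A tiny, always-on adaptive layer: integral-style bias correction."""

    def __init__(self, gain=0.5):
        self.bias_estimate = 0.0
        self.gain = gain

    def act(self, setpoint):
        return setpoint - self.bias_estimate

    def adapt(self, setpoint, outcome):
        # Nudge the bias estimate toward whatever error the world produced.
        self.bias_estimate += self.gain * (outcome - setpoint)

def world(action, drift):
    return action + drift          # an unmodelled offset appears mid-episode

controller = LowLevelController()
goal = 1.0
for step, drift in enumerate([0.0, 0.0, 0.8, 0.8, 0.8, 0.8]):
    setpoint = high_level_policy(goal)
    outcome = world(controller.act(setpoint), drift)
    controller.adapt(setpoint, outcome)
    print(f"step {step}: drift={drift:+.1f}  outcome={outcome:+.2f}")
```

The top layer stays simple because the bottom layer absorbs the change; centralize all adaptation at the top and the same drift would require retraining the whole policy.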

Curiosity returns: information gaps as constraint alarms #

This is where my curiosity reading suddenly became more than psychology.

In Loewenstein’s information gap theory, curiosity is a kind of cognitive deprivation: you feel a gap between what you know and what you want to know, and that gap motivates information seeking.

In Kidd and Hayden’s synthesis, curiosity is tied to reward anticipation: curiosity behaves like a drive, and it recruits reward circuitry in a way that can enhance learning and memory.

I used to read these as descriptive accounts of human motivation. Now I see them as a control story:

A system needs a signal that detects when its current model is too specific for the world it is likely to face. That signal should feel like "tension." And it should bias behavior toward the actions that reduce the tension.

If w-maxing is the direction of optimal learning (weaken constraints while staying correct), then curiosity is the mechanism that helps you notice where the constraints are too tight.

Loewenstein gives the phenomenology: the gap feels like hunger. Kidd & Hayden give the mechanism: information becomes reward-like. Bennett gives the structural interpretation: the agent is drawn toward states that improve the policy because representation and value are not cleanly separable.

So curiosity becomes the name for an internal pressure toward weaker, more general constraints.
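A minimal sketch of that control story, in my own made-up setup (a linear predictor, two "regions" of the world, arbitrary numbers): the curiosity bonus is the reduction in prediction error an observation buys, so mastered regions go quiet and improvable regions pull.

```python
# Minimal sketch, my own construction: curiosity as learning progress.
# The bonus is how much one observation reduces prediction error, i.e. a
# signal that the current constraints were too tight in that region.

import numpy as np

class CuriousPredictor:
    """Online linear predictor; curiosity = learning progress per update."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def curiosity_bonus(self, x, y):
        err_before = (y - self.w @ x) ** 2
        self.w += self.lr * (y - self.w @ x) * x    # one gradient step on this example
        err_after = (y - self.w @ x) ** 2
        return err_before - err_after               # > 0 means "constraints too tight here"

agent = CuriousPredictor(dim=2)
agent.w = np.array([2.0, 0.0])                      # pretend region 1 is already mastered

x_known, x_novel = np.array([1.0, 0.0]), np.array([0.0, 1.0])

print("bonus in mastered region:", agent.curiosity_bonus(x_known, 2.0))  # ~0: nothing to learn
print("bonus in novel region:   ", agent.curiosity_bonus(x_novel, 1.0))  # positive: the gap is closable
```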

That framing also explains something I have felt in my own learning: the most addictive questions are not the ones that are totally unknown. They are the ones where I can sense the shape of the missing piece. The gap is visible and feels closable. That is the sweet spot where exploration becomes irresistible.

Free energy, compression progress, and a single underlying optimization #

I have long suspected that Friston’s free energy principle and Schmidhuber’s compression progress are describing the same beast from different angles.

  • Free energy: minimize surprise, keep your model aligned with incoming data.
  • Compression progress: seek experiences that improve how well you compress your history (learning progress); in Schmidhuber's framing, "beauty" tracks compressibility and the improvement itself is what makes an experience interesting.

Bennett’s story gives me a third angle:

  • Better compression means weaker constraints that still succeed.
  • Weaker constraints mean broader generalization.
  • Broader generalization means less surprise across more environments.

So I increasingly see these as coordinate transforms of one optimization process: build a more compressible generative model that remains useful under perturbation.

If there is convergence, it may look like this:

Curiosity pushes you toward learning progress. Learning progress pushes you toward compression. Compression pushes you toward weak constraints. Weak constraints push you toward generalization. Generalization pushes you toward survival in unfamiliar worlds.

Different languages, one loop.

Where Popper and Deutsch fit: explanation over prediction #

This is where philosophy stops being decoration.

Popper’s "conjecture and refutation" is, at heart, an error-driven learning story. Knowledge grows by proposing something that can fail, then letting failure reshape your model.

Deutsch pushes this into a stronger claim: good knowledge is not just predictive. It is explanatory. It supports counterfactuals: not only "what will happen," but "why this and not otherwise."

At this checkpoint, I see Popper/Deutsch as specifying the quality criteria for the constraints we are learning.

A weak constraint that survives many refutations starts to look like an explanation. It remains stable under intervention and under re-description. It transfers.

This also helps me avoid a common trap in AI discussions: reducing intelligence to benchmark accuracy. Prediction alone is not the endgame. Explanation is.

So a sharper definition of "understanding" begins to emerge:

Understanding is the ability to keep the same internal structure under counterfactual variation.

If you can rewrite the question, change the surface form, alter the goal, and the system still preserves the deep invariants, then it has learned weak constraints. If it collapses, it has merely memorized a narrow policy.
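One crude way to operationalize that test, sketched with a stand-in system (the `answer` function and the phrasings below are placeholders I made up; swap in a call to whatever model you actually want to probe):

```python
# Crude sketch, my own construction: test invariance under re-description.
# `answer` is a placeholder system under test: a brittle pattern matcher
# that only handles one phrasing, i.e. the narrow, memorized policy.

def answer(question: str):
    if question.startswith("What is"):
        a, b = [int(tok) for tok in question.replace("?", "").split() if tok.isdigit()]
        return a + b
    return None                                   # collapses under re-description

def invariance_probe(a: int, b: int):
    variants = [
        f"What is {a} plus {b}?",
        f"Add {a} and {b}.",
        f"I had {a} apples and got {b} more. How many now?",
    ]
    return {v: answer(v) for v in variants}

print(invariance_probe(2, 3))
# A system with weak, transferable constraints would preserve the invariant (5)
# across all three phrasings; this one answers only the phrasing it memorized.
```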

The controversial part: value, valence, and "tapestries" #

Bennett’s most provocative move, for me, is his insistence that representation and value are integrated in biological systems.

In computer science we like clean separations:

  • perception produces a state
  • then we evaluate reward on top

Bennett argues that this separation is biologically implausible. In his language, adaptive systems are attracted to and repelled from states (valence), and scaling that up creates "tapestries of valence" where many parts of a system pull in different directions simultaneously. The tapestry itself becomes the classifier.

Even if I do not accept every step of his argument, I find the direction useful because it touches a modern AI pain point:

We still treat reward as a label. We still bolt preference on after the fact. We still treat meaning as something that can float above value.

But human cognition does not feel like that. Meaning and care are tangled. Relevance is not optional.

So Bennett’s insistence on integrated representation and valuation feels less like metaphysics and more like an engineering warning: if you keep the two separate, you may cap your adaptability.

This is also where curiosity makes deeper sense: information is rewarding because, in a valence-integrated system, the system literally prefers states that reduce its own model deficit.

Conscious machines and the Temporal Gap #

The thesis is titled How to Build Conscious Machines, so I cannot avoid the consciousness part. But for this checkpoint, I want to treat it as a framing device rather than a conclusion.

Bennett proposes that consciousness is not an extra magic layer you add on top of intelligence. Instead, consciousness and intelligence are linked through adaptation and through the architecture that supports it.

His key unresolved issue is what he calls the Temporal Gap: Must a conscious state be realized at a single point in time (synchronous), or can it be "smeared" across sequential computation?

This matters because it reshapes the debate about whether current LLMs could be conscious. If consciousness requires synchronous realization of a rich tapestry, then modern sequential hardware may never qualify. If it can be smeared across time, then software consciousness becomes more plausible, but the implications become strange.

What I appreciate here is not that Bennett "solves" consciousness. It is that he forces the debate to specify an implementation assumption that most people leave implicit.

At a minimum, this gives me a cleaner way to talk about consciousness claims around AI without getting trapped in vibes:

  • What architectural features are necessary?
  • What counts as synchronization?
  • Where does valence live?
  • How deep is adaptation delegated?

Even if one rejects Bennett’s answers, these are good constraints for building better questions.

The spiral: brain inspires AI, AI reframes the brain #

At this checkpoint, the most important meta-idea for me is the spiral itself.

We learned from brains. We built neural networks. We got LLMs. LLMs showed surprising abilities. Now those abilities and failures are forcing us to revise our theories of what brains are doing. Those revised theories will shape the next architectures.

This is not "AI mimics the brain" in a naive sense. It is closer to: building forces clarity.

Bennett’s thesis, regardless of whether it is ultimately right, represents a strong turn of the wheel because it tries to produce a single language that applies to biological and artificial systems. It asks: Are you delegating adaptation deep enough? Are you learning weak enough constraints? Are you integrating value with representation? Do you need synchronous polycomputation?

These are the kind of questions that make the spiral productive.

What I believe right now (and what I do not) #

Here is my honest checkpoint.

I believe:

  • Intelligence is best understood as efficient adaptation in a changing world, not as a static library of answers.
  • Generalization is better described as least commitment (weak constraints) than as shortest description (simplicity).
  • Curiosity is a control signal that detects constraint mismatch and drives information-seeking that improves the model.
  • The depth and distribution of adaptation across a stack are an under-discussed bottleneck in current AI.
  • The boundary between "meaning" and "value" is likely less clean than our software abstractions assume.

I do not believe (yet) that I have enough clarity to assert:

  • whether Bennett’s specific consciousness claims are correct,
  • whether the Temporal Gap is fundamentally undecidable,
  • or whether biological substrates are necessary for consciousness.

But I do believe that the direction of inquiry is right: stop arguing at the level of labels, and start arguing at the level of constraints and architectures.

Where I want to go next #

If this checkpoint is a map, the next steps are the concrete questions it suggests.

  1. How do we make "weak constraint learning" operational in modern training? Not as a slogan, but as a measurable tendency. What objectives, environments, or curricula push systems away from brittle shortcuts? (One crude probe is sketched after this list.)

  2. What would "delegated adaptation" look like in an AI architecture? LoRA and RLHF operate high in the stack. What does it mean to adapt lower, closer to the substrate? What are the computable analogs of cellular plasticity?

  3. Can we define an empirical signature for integrated representation and valuation? If Bennett is right that "neutral representation" is a myth in biological intelligence, then we should see measurable patterns in behavior and learning that reflect this integration.

  4. What kind of world models are needed for Popper/Deutsch-style explanation? How do we build agents that do not just predict, but propose refutable hypotheses, test them, and preserve invariants under counterfactual variation?
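For question 1, the kind of crude, measurable probe I have in mind looks something like this (a toy of my own, with synthetic data): score a fit not by its in-distribution error but by how much its error grows under a small shift.

```python
# Toy probe, my own construction: over-commitment as brittleness under shift.
# Two fits of the same data; the over-committed one looks great in
# distribution and falls apart just outside it.

import numpy as np

rng = np.random.default_rng(0)

x_train = rng.uniform(-1, 1, 40)
y_train = 0.5 * x_train + rng.normal(0, 0.05, x_train.size)   # a simple linear world
x_shift = rng.uniform(1, 2, 40)                                # mildly shifted inputs
y_shift = 0.5 * x_shift

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

fits = {
    "degree-9 (over-committed)": np.polyfit(x_train, y_train, deg=9),
    "degree-1 (less committed)": np.polyfit(x_train, y_train, deg=1),
}

for name, coeffs in fits.items():
    in_dist = mse(coeffs, x_train, y_train)
    shifted = mse(coeffs, x_shift, y_shift)
    print(f"{name}: in-dist MSE={in_dist:.4f}  shifted MSE={shifted:.2f}  "
          f"brittleness={shifted / max(in_dist, 1e-12):.0f}x")
```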

This is why I am excited. Not because I think the truth is already in my hands, but because the loop is tightening: the space of vague explanations is shrinking, and the space of precise, testable architectural questions is growing.

That, to me, is what convergence looks like.