Thanks, this helped crystallize something for
me: the play the AI labs are making is
anti-fragile (in the Nassim Taleb sense):
> The very act of resisting feeds what
you resist and makes it less fragile to
future resistance.
At least along certain dimensions. I don't
think the labs themselves are antifragile.
Obviously we all know the labs are training
on everything (so write/act the way you want
future AIs to perceive you), but I hadn't
really focused on how they're absorbing the
innovation that they stimulate. There's
probably a biological analog...
Well there are many, and I quote this AI
response here for its chilling parallels:
> Parasitic castrators and host
manipulators do something related. Some
parasites redirect a host’s resources away
from reproduction and into body maintenance
or altered tissue states that benefit the
parasite. A classic example is parasites
that make hosts effectively become
growth/support machines for the parasite. It
is not always “stimulate more tissue, then
eat it,” but it is
“stimulate more usable host productivity,
then exploit it.”
(ChatGPT 5.4 Thinking. Emphasis mine.)
Rather than anti-fragility, I'd point you to
the law of requisite variety. You'll
notice that all AI improvements are insanely
good for a week or two after launch. Then
you'll see people stating that 'models got
worse'. What happened in fact is that people
adapted to the tool, but the tool didn't adapt
anymore. We're using AI as variety-resistant,
adaptable tools, but we miss the fact that
most deployments nowadays do not adapt back to
you as fast as you adapt to them.
New models literally do get worse after
launch, due to optimization. If you charted
performance over time, it'd look like a
sawtooth, with a regular performance drop
during each optimization period.
That's the dirty secret with all of this
stuff: "state of the art" models are
unprofitable due to high cost of inference
before optimization. After optimization they
still perform okay, but way below SOTA. It's
like a knife that's been sharpened until
razor sharp, then dulled shortly after.
> If you charted performance over time,
it'd look like a sawtooth
People have, though, and it doesn't show
that. I think it's more people getting hit
by the placebo effect, the novelty effect,
followed by the models' by-definition
non-determinism leading people to say things
like "the model got worse".
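To put a rough number on the non-determinism point, here is a minimal sketch that assumes each prompt is an independent pass/fail trial with a fixed underlying pass rate. The 0.7 pass rate and the prompt counts are made up for illustration, and `pass_rate_std_error` is a hypothetical helper, not something from any eval tool.

```python
import math


def pass_rate_std_error(true_pass_rate: float, num_prompts: int) -> float:
    """Standard error of an observed pass rate over independent prompts.

    Assumes every prompt is an independent Bernoulli trial with the same
    underlying pass probability (a simplification of real usage).
    """
    if not 0.0 <= true_pass_rate <= 1.0:
        raise ValueError("true_pass_rate must be between 0 and 1")
    if num_prompts <= 0:
        raise ValueError("num_prompts must be positive")
    return math.sqrt(true_pass_rate * (1.0 - true_pass_rate) / num_prompts)


# Illustrative numbers only: an unchanged model with a 70% pass rate.
for n in (20, 200, 2000):
    se = pass_rate_std_error(0.7, n)
    print(f"{n:>5} prompts: observed pass rate 0.70 +/- {se:.3f} (one standard error)")
```

Under those assumptions, someone judging a model from a couple dozen prompts sees swings of around ten points either way with no change to the model at all.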
Is this insider info? The 'charted
performance' caught my eye instantly. Couple
things I find odd tho: why sawtooth? it would
likely be square waves, as I'd imagine they
roll out the cost-saving version quite fast
per cohort. Also, aren't they unprofitable
either way? Why would they do it for
'profitability'?
It's rumors based on vibes. There are attempts
to track and quantify this with repeated model
evaluations multiple times per day, but
no sawtooth pattern has emerged as far as I
know.
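For anyone who wants to check their own logs, one simple way to compare a launch-week window against a later window is a two-proportion z-test. This is only a sketch under the assumption that each evaluation run is an independent pass/fail; `two_proportion_z` is a hypothetical helper and the counts in the usage lines are invented, not results from any real tracker.

```python
import math


def two_proportion_z(passes_early: int, n_early: int,
                     passes_late: int, n_late: int) -> float:
    """z-statistic for the difference between two pass rates (early - late).

    Uses the pooled-proportion standard error. A large positive value means
    the early window scored higher than chance alone would easily explain.
    """
    if n_early <= 0 or n_late <= 0:
        raise ValueError("both windows need at least one evaluation")
    p_early = passes_early / n_early
    p_late = passes_late / n_late
    pooled = (passes_early + passes_late) / (n_early + n_late)
    variance = pooled * (1.0 - pooled) * (1.0 / n_early + 1.0 / n_late)
    if variance == 0.0:
        # Both windows are all-pass or all-fail, so there is no difference.
        return 0.0
    return (p_early - p_late) / math.sqrt(variance)


# Invented counts for illustration: same pass rate vs. a visible drop.
print(two_proportion_z(70, 100, 70, 100))
print(two_proportion_z(80, 100, 60, 100))
```

A value near zero across many launch cycles would be consistent with "no sawtooth"; it says nothing about whether the tracked prompts are being served differently from everyone else's.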
I don't want to go too far down the conspiracy
rabbit hole, but the vendors know everyone's
prompts so it would be trivial for them to
track the trackers and spoof the results. We
already know that they substitute different
models as a cost-saving measure, so
substituting models to fool the repeated
evaluations would be trivial.
We also already know that they actively seek
out viral examples of poor performance on
certain prompts (e.g. counting Rs in
strawberry) and then monkey-patch them out
with targeted training. How can we be sure
they're not trying to spoof researchers who
are tracking model performance? Heck, they
might as well just call it "regression
testing."
If their whole gig is an "emperor's new
clothes" bubble situation, then we can
expect them to try to uphold the masquerade
as long as possible.
It's not insider info, it's common knowledge
in the industry (Google "model optimization"). I
think they are unprofitable either way, but
unoptimized models burn runway a lot faster
than optimized ones.
The reason it's not a square wave is that
new optimization techniques are always in
development, so you can't apply everything
immediately after training the new model. I
also think there's a marketing reason: if
the performance of a brand new model
declines rapidly after release then people
are going to notice much more readily than
with a gradual decline. The gradual decline
is thus engineered by applying different
optimizations gradually.
It also has the side benefit that the future
next-gen model may be compared favourably
with the current-gen optimized (degraded)
model, setting up a rigged benchmark. If no
one has access to the original pre-optimized
current-gen model, no one can perform the
"proper" comparison to be able to gauge the
actual performance improvement.
Lastly, I would point out that vendors like
OpenAI are already known to substitute
previous-gen models if they determine your
prompt is "simple." You should also count
this as a (rather crude) optimization
technique because it's going to degrade
performance any time your prompt is falsely
flagged as simple (false positive).