42 Comments

I think this is starting to quietly hit the mainstream. Researchers found that AI model collapse happens incredibly quickly once models start training on AI-generated content.

https://venturebeat.com/ai/the-ai-feedback-loop-researchers-warn-of-model-collapse-as-ai-trains-on-ai-generated-content/
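
The failure mode is easy to reproduce in miniature. Here is a minimal toy sketch (my own illustration, not the paper's actual experiment): fit a trivial "model" to data, sample the next generation's training set from the fitted model, and repeat. Finite sampling makes the fitted spread drift toward zero, so the tails of the original distribution vanish.

```python
# Toy model-collapse loop: each generation "trains" (fits a Gaussian) on
# samples produced by the previous generation's model. Assumes only numpy.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=10_000)  # generation 0: "human" data

for generation in range(1, 31):
    mu, sigma = data.mean(), data.std()    # fit the trivial model
    data = rng.normal(mu, sigma, size=20)  # next generation trains on model output
    if generation % 5 == 0:
        print(f"gen {generation:2d}: mu={mu:+.3f} sigma={sigma:.3f}")
```

On a typical run sigma shrinks generation over generation: the resampling loop is a multiplicative random walk with a downward drift, so diversity is lost even though no single step looks wrong.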

author

Nice find - I suspected my background research had missed the thing I was looking for.

Jun 20, 2023 · Liked by Brian Mowrey

Your analysis is much more comprehensive! I had the same realisation as you... that this problem is so fundamental and so catastrophic that it has to be known by industry insiders. LLMs are fatally flawed by design; it seems they will end up being a fun little novelty that won't really go anywhere.

Jun 20, 2023 · Liked by Brian Mowrey

Thanks, very interesting. The distortion by propagation and copying reminds me that this happened even when there was only print media. I recall a common statistical error repeated endlessly in a niche of the brittle fracture field. Each author, reporting results in a variety of journals, would cite the first paper in the series, which had an error in the equation for Weibull statistics of brittle fracture. It propagated through the literature. My group, working in that field, found the error.
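
For readers outside the field, this is roughly the kind of calculation involved (the comment doesn't say which equation carried the error, so this is only illustrative). The probability estimator F_i = (i - 0.5)/N below is one common convention; quoting a wrong form from an earlier paper shifts every fitted modulus downstream, which is exactly how such an error propagates. The strength values are made up for illustration.

```python
# Sketch of a standard Weibull fit for brittle fracture strength data.
import numpy as np

strengths = np.sort(np.array(
    [312., 348., 355., 371., 380., 392., 405., 421., 438., 460.]))  # MPa, illustrative
N = len(strengths)
rank = np.arange(1, N + 1)
F = (rank - 0.5) / N                   # failure-probability estimator (one convention)

x = np.log(strengths)
y = np.log(np.log(1.0 / (1.0 - F)))    # linearized Weibull CDF: y = m*x - m*ln(sigma0)
m, c = np.polyfit(x, y, 1)             # slope m is the Weibull modulus
sigma0 = np.exp(-c / m)                # characteristic strength

print(f"Weibull modulus m = {m:.2f}, characteristic strength = {sigma0:.0f} MPa")
```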

author

Or take "cytokine storm" - proposed based on nothing and then quoted and quoted and quoted. This is why like I said AI *might* improve medicine, it's hopeless in the hands of humans. It's other industries that are going to produce worse output.


Have you eaten your spinach yet, to get sailor-strength iron?

author

I was mid-bite and had to move continents due to climate change; at this rate I'll NEVER finish lunch


Pfff, what could possibly go wrong with stitching together feedback loops with noise amplifiers/mixers in series.

I don't know anything specific to these language models.

But generally, to me the both fascinating and frightening aspect is that machine learning was invented to _get around_ having to actually understand (aspects of) the problem you're trying to solve - and this seems to be embraced quite happily.

I'd take that bowl of salad over that disgusting grid-patterned slab of who-knows-what, which that woman presumably mistakes for chocolate (though no idea why the close-up sniffing), any day of the week.

Jun 21, 2023 · Liked by Brian Mowrey

Yeah I felt a little like puking watching her sniff the chocolate while stowing bites of salad. I could hear gears stripping in my mind.

author

I'm comfortable in the abstract with LLMs as a "crutch" for first-hand knowledge, just a tool to make the job of thinking easier. But it would take what Jon describes as a Terminator reality-scanning AI to perform that job without self-poisoning the training library. The version we have should quickly make source material for some funny post-apocalypse historical reconstructions of the present, at least.


A colleague was testing it by picking a new IC that had certain properties. Since one property he'd forgotten about was suboptimal, he asked "and one like that, but with better X?" and it worked. So... this "until 2021" dataset has... IC datasheets in it. Funky. There is really an obsolescence problem there - a chip we started using 2-3 years ago is already out of support. But in principle, that saved a bit of stupid research work. (Distributors' component-filtering features on their websites aren't always that great or comprehensive... and reading a lot of datasheets that partly follow different conventions is not fun.)


I especially appreciate part iii, "What’s the difference between human iterative inaccuracy and Iterative LLM-inaccuracy?" I was going to mention it if you didn't.

We have a need for absolute truth, and we have a war against absolute truth. These two appear to have coexisted all along.

Personally, I look to certain ancient sources to get my bearings, but I realize that even relying upon those sources can lead to uncomfortable questions about what is true, when examined very closely. There comes a point where I have to make choices about what to believe is true, and to stick by those choices in the face of uncertainty, but not in the face of strongly opposing evidence.

One of my beliefs is that we are designed to be able to do this.

I am continually astonished by the observation that modern thinking does not seriously consider the question of where we came from and why, preferring to stick with answers that don't make sense.

One comment on this post does touch upon that last matter: https://unglossed.substack.com/p/the-danger-of-ai-comes-down-to-an/comment/17539123

author
Jun 20, 2023 · edited Jun 20, 2023

When people ask me IRL what I research and write about, it's always alarming to remember that the idea of ingesting contemporary biology to look for any possible avenue of progress in understanding how life came about... is *weird.* Why wouldn't that be almost the only thing a secular society focused on (which would probably lead to a lot less secularity; it does for me)?


The big illusion that most people have (maybe not many here, but I did until I started researching) is that AI is actually kind of looking up data in a table, and that updating its response is as easy as giving it an updated table. Of course, that illusion is downstream of the illusion that there is an actual 'entity' of some sort producing the text, but anyway... It doesn't have a list of fireworks destinations, how they were rated by visitors, and how much traffic they got. More directly, it has no idea if fireworks will be shot off at a particular location this year. The 'dynamic data set' illusion is another one that we have: that it sort of looks up data from Google the same way that we would. But, as Jon pointed out much more knowledgeably than I can, the data has to be carefully groomed by real people over a years-long process, which assures obsolescence in many areas. I think that the obsolescence problem, while not as game-ending as the iteration problem, will be a big cramp in LLM style sooner than the iteration problem. The iteration problem is probably unsolvable.

Jun 20, 2023 · Liked by Brian Mowrey

Maybe the most lucrative use of LLM would be cranking out trash novels.

author

What would actually be fantastic, and seems plausible, is fully iterative image generation, which would finally democratize comic book creation. Writers could bring any idea to page. Granted, this will mean the end of self-referential carpal tunnel syndrome diary comics.


My daughters watch too much YouTube and my wife and I have been engaged in a debate over whether the videos are made by AI or just poorly translated from some other language/culture. The stories have this sort of flat emptiness that feels very familiar from ChatGPT. It feels like an LLM was told to produce a story about x, y, and z where this happens, but none of the connections quite connect up.


Reminds me of the time two friends and I decided the way to get rich was to write a porno novel. We were in high school so the best title we could come up with was "Lust Professor". We got a few sentences in and gave it up. We could really have used an LLM.

author

Those videos are crazy to watch. But they seem bot-generated, and pretty remedial products - reword a news story, block it up into sentences, voicify each sentence over a hit from a library of tagged stock footage. Result: surreal SEO parasite cinema.
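
That pipeline is simple enough to sketch. A rough, hypothetical version; the story, the tag-matching, and the clip library are all placeholders, and pyttsx3 is one real offline text-to-speech option:

```python
# Hypothetical SEO-video pipeline: split a (reworded) story into sentences,
# match each to a clip from a tagged stock-footage library, and synthesize
# a voiceover per sentence.
import re
import pyttsx3

story = "A heatwave hit the city. Officials urged residents to stay indoors."
stock_library = {                       # tag -> footage file (placeholder paths)
    "heatwave": "clips/sun_glare.mp4",
    "officials": "clips/press_conference.mp4",
}

engine = pyttsx3.init()
for i, sentence in enumerate(re.split(r"(?<=[.!?])\s+", story)):
    tag = next((t for t in stock_library if t in sentence.lower()), None)
    clip = stock_library.get(tag, "clips/generic_city.mp4")  # fallback footage
    print(f"scene {i}: {clip} <- {sentence!r}")
    engine.save_to_file(sentence, f"voiceover_{i}.wav")      # narration track
engine.runAndWait()
```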


Something like that. They watch these bizarre animated fairy tales where nothing quite matches. One day I will put in the effort to find out the truth. Maybe I am in a not quite parallel universe or maybe they weren't made by humans. I think they type in 'write a story with these characters and this moral and animate it in this style' to some industrial LLM or something like that.

Jun 20, 2023 · Liked by Brian Mowrey

My sons once had an animated VHS of a similar type, a present from someone. It was atrocious. And this was before AI. Always wondered how anything that bad got loose.

Jun 20, 2023 · Liked by Brian Mowrey

Check out what people are doing with Midjourney. Consistency between sequential images is a challenge, but not an insurmountable one.

author

Will give it a look, thanks

Jun 20, 2023 · Comment deleted

Of course I would! But I have dogs.

Jun 20, 2023 · Liked by Brian Mowrey

Interesting post. Again.

I don't use AI. At least not intentionally. I avoid it the same way I avoid Wikipedia and the MSM, and for that matter, mass-produced junk food at the supermarket. But if I understand your thesis, our sources of knowledge in every sphere are going to be so tainted by those who do use it that we won't escape its homogenizing effects. Your third paragraph from the end is interesting.

author

Which part of the third? I did not expand much on the rebuilding problem. What I think will happen is we will start to realize and agree that we wish we could have human-staffed knowledge networks in a lot of businesses but, the same way I can't understand what anyone on the other end of a Wells Fargo customer service call is saying, it won't happen.

Jun 21, 2023 · edited Jun 21, 2023

What's interesting is the suggestion that the Internet will be a pitfall in the acquisition of new knowledge, once contamination by AI has caused science to approach some trashy asymptote. How would we rebuild without an Internet? What would be the solution? Now that would make an interesting plot for a novel or movie.

Jun 20, 2023 · edited Jun 20, 2023 · Liked by Brian Mowrey

Can you train AI to spot differences between AI-generated content (party-goer pictures with too many teeth and fingers) and human-generated content?

Maybe AI is better at spotting similarities and humans are better at spotting differences. We're more prone to mistakes during normal driving, but can spot the difference between a harmless sheet of plastic blowing across the road and something dangerously falling off a truck, whereas an AI model will see them as similar.
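
In principle, yes; the naive version is just a supervised classifier over labeled examples. A minimal sketch with scikit-learn, where the two training texts are placeholders for a real labeled corpus (real detectors are far more elaborate, and notoriously unreliable):

```python
# Bag-of-words "AI detector" sketch: TF-IDF features + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "We walked the dog and argued about the bill the whole way home.",    # human
    "It is important to note that dogs require regular daily exercise.",  # AI-ish
]
labels = [0, 1]  # 0 = human-written, 1 = AI-generated

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)  # a real corpus would have thousands of examples

print(detector.predict_proba(["Regular exercise is essential for canine wellbeing."]))
```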

author

Seems feasible. The big difference is that humans aren't going to start having a different number of fingers tomorrow, whereas with economic and social institutions there is constant change, and the people in charge of knowing the material are going to be replaced with LLMs to some degree or completely, depending on the sphere. Right now companies are trying to find uses for AI to (hopefully) grow revenue, and figure out how to reduce staff later.


This seems like a neat story, a sort of transhumanism-hoist-on-its-own-petard, where AI rejects a picture of a human with extra fingers or a long neck or something as being AI-created, but it isn't that the image is AI-created: the person in it was actually misbegotten in a lab by an AI, and some sort of hijinks/serious consequence (depending on the writer) ensues.


This is how it seemed to me. To take your neat example, "It recommended two obvious answers and an off-the-beaten-path answer": the problem as I see it is that eventually all public fireworks displays would end up at those same three places. We would be forced into that; our world would shrink. The more complex the problem, the more difficult it would be to escape that self-reinforcing ChatBot answer. As it is, just think how the big-name media shape the framework of debate. The most absurd recent example is the British BBC with its “Verify” project, where in effect the BBC marks its own homework. I am not saying that there should not be cross-checks on information within an organization, but verification can only work by challenge from different views. It’s like Popper’s falsification.


Like Google GPS sending 1000 cars an hour down some two-lane backroad that goes through the family farm.

author

That's a good example - Waze was the innovator here, and I can actually remember the day in 2015 when I first saw a car come hauling down Calaveras St in the middle of the day while I was working FedEx. And immediately "neighborhoods" became "shortcuts"; it was a permanent, noticeable change.

Google imitated afterward, but seems to have become more conservative lately, so that now I have to go back to my own creative thinking when I'm stuck on a highway. I think there's been a lot of human curation on the shortcuts behind the scenes.

author

The current systems seem well-trained to interpret requests for creative thinking and, as long as the training material has scope and accuracy, produce answers that exceed normal human performance in non-obviousness. This is why, in political and taboo topics, the models are heavily pre-censored from "saying what they really think." So in this realm AI could liberate media from human self-censorship, except humans will try their best to prevent it from doing so. But that creativity is also volatility; later you will have the 1-in-5 nonsense answers being fed back into the training material unless humans screen every text, and the humans have to actually know the material.

Jun 20, 2023 · Comment deleted
author

But even in the vaulted version you can add data on the front end, i.e. "suppose there is a room with a table," and now the bot is running a simulation for you. So APIs can probably enable a certain amount of pro-user tools out of that (see the sketch below). The sandbox is still the sandbox, but it knows how to interpret and run an iterative system out of provided end-user values. That seems sufficient until the sandbox becomes out of date, and now you have a poisoned internet that can't update training safely, or requires ten times the curation work of the first set.

RE fizzling out like self-driving, the big question is: how easy is office work? A lot of small-business office work involves comprehension of incredibly complicated systems where a pricing structure etc. was created to fit customers that all have their own unique exceptions to rules. But LLMs might be able to handle this and train on it more easily than humans. Media is probably the simplest job for LLMs to replace; you're just making copy.
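
A minimal sketch of the front-end-data idea above, using the OpenAI chat API as it existed in 2023; the model name, room description, and prompts are all placeholder assumptions:

```python
# The frozen model never updates, but the caller can prepend current facts
# ("front-end data") and have it run an iterative simulation over them.
import openai

openai.api_key = "sk-..."  # your API key

front_end_data = (
    "Suppose there is a room with a table. On the table: a lamp (off), "
    "a key, and a locked box. The key opens the box."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Track the state of the room across turns."},
        {"role": "user", "content": front_end_data},
        {"role": "user", "content": "I turn on the lamp and unlock the box. Describe the room now."},
    ],
)
print(response["choices"][0]["message"]["content"])
```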

Jun 20, 2023 · Comment deleted
author

I think we are in agreement that it isn't going to actually have a sustainable business application. The problem is whether entropy rests with hiring the alcoholic niece back, or just letting the now-dysfunctional business keep displeasing edge-case customers while Bob and Mindy enjoy the higher margin they already sunk an up-front payment into. But really I mean more on the corporate scale there. You will have boards pushing companies to grow market share today with temporary low prices enabled by AI / firing of the marketing department, regardless of tomorrow's dysfunction.

So, ideally, either 1) the failure point is well before then (I mean, if they can't even fix hallucinations then we're pretty safe, but that doesn't sound too hard), or 2) this problem is totally centric to corporations influenced by board-think, and small local businesses don't have the same race to the bottom and in fact come to hold most of the market share. Option 2 is actually quite utopian, so maybe AI is great.

Jun 20, 2023 · Comment deleted
author

Agree with that. The Hinton / Ng doomporn from last week seems like a half-baked guerrilla marketing campaign; either that, or they are goofs.

Jun 20, 2023 · Comment deleted
Jun 20, 2023 · edited Jun 20, 2023 · Liked by Brian Mowrey

“To me, this doesn't actually get interesting until the open source versions really start to make … levels of trolling and shitposting possible at scale …”

Something you may find interesting: GPT-4chan, if you haven’t already seen it.

https://youtu.be/efPrtcLdcdM

Comment deleted

Inflamed cynic level: 4/5

Isn't the cost of operating that, at least in its current form, rather high, though? So, only for bigger joints.

author

I am on a lifelong quest to share this at every applicable prompt https://youtu.be/GrpS2KhiXAM?t=3

Jun 20, 2023 · Liked by Brian Mowrey

"I find the idea they have that they will be able to gatekeep something like this just utterly laughable"

------

I keep calling it "maybe we can just peek into Pandora's Box".
