4 Comments

Great article! There is a lot of emphasis these days on the probabilistic nature of trained systems, but that may be more a means of getting mechanisms to emerge than something fundamental to the mechanisms these systems end up developing.

If you reverse-engineer a trained neural (deep learning) network, you do see patterns that correspond to components carrying out specific operations on the data (looking for edges, shapes, etc.). That is to say, even though we don't plan and build the mechanism ourselves, training may be developing mechanisms that we could reverse-engineer to help us understand what is going on. Trained algorithms as a tool of science, even if not a product of science.
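As a concrete (if simplified) illustration of that kind of reverse engineering, here is a minimal sketch, assuming PyTorch and torchvision are available, that pulls out the first-layer filters of a pretrained CNN. Visualized as small images, many of these filters look like oriented edge and color detectors that nobody explicitly programmed:

```python
# Sketch: inspect the learned first-layer filters of a pretrained CNN.
# Many resemble edge and color detectors, even though the network was
# only trained to classify images and never told to detect edges.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
filters = model.conv1.weight.detach()          # shape: (64, 3, 7, 7)

# Normalize each 7x7 filter to [0, 1] so it could be shown as a tiny image.
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)

print(filters.shape)   # 64 filters over 3 input channels, 7x7 kernels each
```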

For a while in my career I worked with people in one branch of symbolic AI called Case-Based Reasoning (CBR). The basic idea is that humans store a collection of memories of situations and stories, and then have ways of combining and adapting those stories in new situations.
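To make the idea concrete, here is a minimal sketch of the retrieve-and-adapt loop at the heart of CBR. The names (`Case`, `retrieve`, `adapt`) and the similarity metric are illustrative assumptions, not taken from any particular CBR system:

```python
from dataclasses import dataclass

@dataclass
class Case:
    problem: dict     # features describing a past situation
    solution: str     # what was done about it

def similarity(a: dict, b: dict) -> float:
    """Crude metric: fraction of features the two situations share."""
    return len(set(a.items()) & set(b.items())) / max(len(a), len(b))

def retrieve(case_base: list[Case], new_problem: dict) -> Case:
    """Find the stored case most similar to the new situation."""
    return max(case_base, key=lambda c: similarity(c.problem, new_problem))

def adapt(case: Case, new_problem: dict) -> str:
    """Tweak the old solution for the new situation (domain-specific in real systems)."""
    return f"{case.solution} (adapted for {new_problem})"

case_base = [
    Case({"symptom": "no power", "device": "laptop"}, "replace the charger"),
    Case({"symptom": "overheating", "device": "laptop"}, "clean the fan"),
]
new = {"symptom": "no power", "device": "tablet"}
print(adapt(retrieve(case_base, new), new))
```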

I suspect that LLMs (ChatGPT etc.) are doing something similar, looking for patterns in past cases and adapting them to new situations. The fact that LLMs can get so far on pattern matching and adaptation supports many insights from the CBR community. Humans seem to do a lot of remembering and adapting past cases. But that alone does not seem to account for all of human intelligence.

LLMs also sometimes fail. They are not great at reasoning, and sometimes invent new stuff without testing their suppositions against available evidence and background knowledge. A little like a bright, enthusiastic high-school student who has new and creative ideas about science but has not yet developed the discipline to treat generated ideas as hypotheses rather than as knowledge.

author

Ah yeah, I wondered about the hidden layers identifying things such as edges or shapes: whether that was always planned, or simply something we discovered that the neural networks did.

If the latter, then working with neural networks becomes a bit like studying the human brain. You are also put in a situation where you have to reverse-engineer how they work.

Of course we know it is all just matrix multiplications, but we may not know what kind of higher-level constructs manifest inside these networks.
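For what it's worth, the "just matrix multiplications" part is literally true at the lowest level. A single layer amounts to something like the NumPy sketch below (with random stand-in weights); the open question is what higher-level structure emerges when many such layers are stacked and trained:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 784))       # one flattened 28x28 input image
W = rng.normal(size=(784, 128))     # weights (learned in a real network, random here)
b = np.zeros(128)                   # biases

hidden = np.maximum(0, x @ W + b)   # one layer: matrix multiply, add bias, ReLU
print(hidden.shape)                 # (1, 128)
```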

Speaking of inventing stuff: I asked ChatGPT about my mother and her career. It wasn't entirely wrong. E.g. it correctly identified her as a journalist for the newspaper Dagsavisen, but totally mischaracterized what sort of journalist she was.

I feel it is much the same when working with AI Art. It can get you into the ballpark quite well but still get many details spectacularly wrong in a way a human would not.


I recently asked Bing's version of ChatGPT, which I call Bing-AI (since it does current web searches), what variant of COVID-19 is currently causing the most breakthrough infections. It responded with information that was current a couple of years ago, which is useless.

I explained to Bing-AI that new variants arise often and that the mix of variants causing disease changes very quickly based on evolutionary competition. It immediately found the variant that had caused the most disease in the last two weeks.

In this case Bing-AI was clearly lacking background knowledge. It seems able to work with the idea of rates of change if it is pointed out; it just did not occur to it to bring rates of change into the search for an answer.

By the way, I think we are missing a great opportunity by not reverse-engineering what trained LLM networks are actually doing. I suspect we could learn a lot about human cognition that way.


There is a related idea in analyzing human intelligence. Given the long delay that can be demonstrated in conscious reaction, where we react before we are conscious of the input, one hypothesis is that we are running a predictive model which allows us to react in real time. Perhaps humans are a generative AI with after-the-fact corrections. This may explain those videos where we watch a basketball game and don't notice the guy in the gorilla suit in the background: how would we generate that? And if it does not interact with the events we are interested in, we just eliminate it as noise.
