aithrowawaycomm a day ago

AI researchers need to read more cognitive science. It is genuinely embarrassing how often you see "Thinking, Fast and Slow" + some 50-year-old paper as the only citations, because this statement:

  In human cognition theory, human thinking is governed by two systems: the fast and intuitive System 1 and the slower but more deliberative System 2.
is intuitive, psychologically seductive, and blatantly wrong.[1] There is no scientific distinction between System 1 and System 2; the very idea is internally incoherent and contradicts the evidence. Yet tons of ignorant people believe it. And apparently AI researchers sincerely believe "ANN inference = System 1 thinking." This is ridiculous: ANN inference = Pavlovian response, as found in nematodes and jellyfish. But System 1 thinking is related to the common sense found in all vertebrates, and it is absent from all existing AI. We don't have a clue how to make a computer capable of System 1 thinking.

This isn't just pedantry: the initial "System 1 = inference" error makes "System 2 = chain-of-thought" especially flawed. CoT in transformer LLMs helps solve O(n) problems but struggles with O(n^2) ones. The observation that an O(n^2) problem can be broken down into n separate O(n) problems is ultimately due to System 1 reasoning: it is obviously true. But it is only obviously true to smart things like humans and pigeons. Transformers do not seem smart enough to grasp it: System 2 thinking must be "glued together" by tautologies or axioms, and we can only recognize tautologies or discover axioms because of System 1. If the problem is more complex than O(n), these tautologies and axioms must be provided to the LLM, either with a careful prompt or exhaustive data.
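
To make that concrete, here is a toy sketch (my own example, nothing from the paper): an all-pairs comparison is O(n^2) overall, but each row against a fixed anchor is an independent O(n) pass. Recognizing that restructuring is the "obvious" step; without it, the decomposition has to be spelled out in the prompt or drilled in via data.

  # Toy O(n^2) task: all pairwise absolute differences, decomposed
  # into n independent O(n) subtasks (one row per anchor element).
  # My own illustration, not an example taken from the paper.

  def pairwise_diffs(xs):
      """Solve the whole O(n^2) problem directly."""
      return [[abs(a - b) for b in xs] for a in xs]

  def row_against(anchor, xs):
      """One O(n) subproblem: differences against a single anchor."""
      return [abs(anchor - b) for b in xs]

  def pairwise_diffs_decomposed(xs):
      """Same answer, assembled from n separate O(n) calls --
      the 'obvious' decomposition that has to be recognized first."""
      return [row_against(a, xs) for a in xs]

  xs = [3, 1, 4, 1, 5]
  assert pairwise_diffs(xs) == pairwise_diffs_decomposed(xs)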

Kahneman's book has been largely repudiated on the science. That doesn't mean it isn't a useful way to understand the kinds of errors humans make in decision-making. But it does make the book useless for AI researchers. I believe AGI is well over 200 years away, because, going all the way back to Alan Turing, AI has simply refused to engage with the challenges of cognitive science, preferring fairy tales which confirm intuitions and trivialize human minds.

[1] https://www.cell.com/trends/cognitive-sciences/abstract/S136... and https://www.psychologytoday.com/intl/blog/a-hovercraft-full-...

  • thorum 19 hours ago

    > We don't have a clue how to make a computer capable of System 1 thinking.

    I think you’re overthinking this. System 1 thinking, as the term is used by AI researchers, means making a fast decision based on reasoning processes that are wired into your brain by evolution. For any task that humans have faced for millions of years, this works well. It can also work well for experts in a domain who have practiced a task so many times that their brains have adapted to perform it unconsciously.

    System 2 thinking is consciously using explicit reasoning techniques to think through a problem, slowly and rigorously, often in ways that feel unnatural because of our cognitive biases, but it can solve problems that System 1 cannot.

    The analogy to LLMs is straightforward: LLMs learn to solve many kinds of complex problems during training and encode processes for those specific problems. They can then perform these tasks in a single forward pass through their weights. This is System 1 for LLMs, and again, it works well for any task that they were exposed to repeatedly during training.

    However, they don’t generalize to tasks that were not well represented in the training data. Training them to use explicit reasoning strategies instead (System 2) has been shown to improve performance and let them solve a broader range of problems.

    • grupthink 19 hours ago

      System 1 and 2 are a myth. There is only memory and computation. For a complex problem, retrieving from memory is fast, and performing computation is slow. Furthermore, when performing computation, there are different heuristics you can use to think about a problem. For example, if you want to predict the orbit of a satellite, you can use Kepler's laws, which give you the full sweeping elliptical motion. Alternatively, you can use Newton's laws, for which you need to calculate each time step. Alternatively, you can calculate all the quantum interactions between the satellite, earth, and sun (are we going to call this System 3 because it is more rigorous and is "closer to the metal"?).
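
      A quick sketch of that contrast in made-up units (mu = 1, an arbitrary ellipse; my own toy, not anyone's real code): the Kepler view answers each query in closed form, while the Newton view grinds through time steps, and both describe the same orbit.

        # Toy two-body problem in units with mu = 1; not a real ephemeris.
        import math

        MU = 1.0            # gravitational parameter (toy units)
        A, ECC = 1.0, 0.3   # semi-major axis and eccentricity of the toy orbit

        def position_kepler(t):
            """Sweeping-ellipse view: solve Kepler's equation E - e*sin(E) = M once per query."""
            M = math.sqrt(MU / A ** 3) * t             # mean anomaly at time t
            E = M
            for _ in range(20):                        # Newton-Raphson iterations
                E -= (E - ECC * math.sin(E) - M) / (1 - ECC * math.cos(E))
            return A * (math.cos(E) - ECC), A * math.sqrt(1 - ECC ** 2) * math.sin(E)

        def position_newton(t, dt=1e-4):
            """Time-stepping view: integrate F = ma from perihelion, step by step."""
            x, y = A * (1 - ECC), 0.0                  # start at perihelion
            vx, vy = 0.0, math.sqrt(MU / A * (1 + ECC) / (1 - ECC))
            for _ in range(int(t / dt)):               # semi-implicit Euler steps
                r3 = (x * x + y * y) ** 1.5
                vx -= MU * x / r3 * dt
                vy -= MU * y / r3 * dt
                x += vx * dt
                y += vy * dt
            return x, y

        print(position_kepler(2.0))
        print(position_newton(2.0))   # roughly agrees for small dt (first-order integrator)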

      • mewpmewp2 18 hours ago

        I don't get how you can conclude it is a myth. These are observations of how people think. What is the myth part? I can clearly observe myself making fast, intuitive decisions that I might not even know the logical reasoning behind, but I can also solve problems by thinking through them using my internal monologue. Are these myths?

        • nighthawk454 17 hours ago

          The error lies in thinking they're 'real' systems to be taken for granted and blindly reasoned forward from, instead of sometimes-helpful academic categorizations.

          You can always factor things into groups, e.g. 'thoughts about now vs. thoughts about the future'. Extending that to say there are therefore two modes of thinking, that the brain must handle your two groups differently at some fundamental or physiological level, and that there are only these two modes and all things are either one or the other... is perhaps quite misguided without more support.

          "It's only a model"

          • mewpmewp2 15 hours ago

            I don't think we have conclusive evidence, but from everything I see in myself, it makes sense to categorize it like that. It is either a quick, intuitive guess or feel, or alternatively I have to hash it out. It just fits perfectly for me.

            I do think they are systems in the sense that one is optimised to be quick and the other requires time, but can solve tougher problems and create something new.

            It seems fundamental to me that this is how things would get organized.

      • andrewchambers 18 hours ago

        OK, so if we define System 1 as retrieval and System 2 as computation, then we all agree.

  • viraptor 4 hours ago

    > CoT in transformer LLMs helps solve O(n) problems but struggles with O(n^2).

    What do you mean by `n` in this case?

  • lukev 20 hours ago

    So, in one sense I agree with you. There is zero evidence that the human brain runs separate systems for separate types of cognition.

    On the other hand, the reason this idea is sticky is because it matches our conscious experience. In some situations, we respond intuitively. In other situations, we choose to work analytically using tools like research, deliberation, note-taking, etc.

    I think it's this second sense in which people are using the term with respect to LLMs. And it's not a terrible analogy.

    However, comparing "neural networks" to actual neurons is almost never useful.

    • lukev 20 hours ago

      Oh, and to be pedantic:

      > The observation that an O(n^2) problem can be broken down into n separate O(n) problems is ultimately due to System 1 reasoning: it is obviously true.

      As the parent of a third grader just learning this stuff, I can assure you it isn't immediately obvious to everyone.

      • atq2119 17 hours ago

        It's also not really true.

        Consider a problem like edit distance, which is solved using an O(n^2) dynamic program. What are the n separate problems there?

        Sure, you're filling out a table with a nested loop, but that's a very mechanistic view. I don't believe that treating the n outer iterations as separate problems gives any real insight.
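
        For concreteness, here is the textbook Levenshtein DP (a generic sketch, nothing specific to the paper): each row is an O(n) pass, but it reads the row before it, so the n outer iterations are coupled rather than independent subproblems.

          def edit_distance(a: str, b: str) -> int:
              """Levenshtein distance via the usual O(len(a) * len(b)) table."""
              prev = list(range(len(b) + 1))          # row for the empty prefix of a
              for i, ca in enumerate(a, start=1):
                  curr = [i]                          # cost of deleting the first i chars of a
                  for j, cb in enumerate(b, start=1):
                      curr.append(min(
                          prev[j] + 1,                # delete ca
                          curr[j - 1] + 1,            # insert cb
                          prev[j - 1] + (ca != cb),   # substitute, or match for free
                      ))
                  prev = curr                         # each "O(n) subproblem" needs the last one
              return prev[-1]

          assert edit_distance("kitten", "sitting") == 3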

  • jhanschoo 8 hours ago

    Yeah, they could call it something like retrieval-dominant information generation or computation-dominant information generation, or at least not mention the cognitive sciences.

  • SubiculumCode 17 hours ago

    That line in the abstract made me chuckle too, as a cognitive psychologist of memory (but currently doing autism research). It's an analogy, not some well-validated law of cognition. I think there are a lot of concepts in cognitive psychology that may be useful to machine learning research, but they should actually, maybe, invite some cognitive scientists to work with them on their machine learning research (or at least help them craft their abstracts lol)

  • jonstewart 18 hours ago

    Thank you, I was hoping someone in the comments would point out that all this junk hasn't held up in replication.