I think that, by and large, the AI Futures Project has a very strong case for AI as an abnormal technology - but I think there are a couple of flawed claims, maybe even sleights of hand, in this post. I hope I can highlight these, purely for the sake of the quality of the debate, without it appearing as though I disagree with AIFP's overall argument (I don't!)
1) "[Paraphrasing] Human teenagers learn to drive within hours because they're incredibly data-efficient":
Having taught adults to ride bicycles and motorcycles, I'm constantly amazed by how natural and intuitive people seem to find these skills. To take a corner on a motorcycle, for instance, you have to move the handlebars in a VERY counterintuitive way (known as counter-steering), tilt your hips and shoulders, and do half a dozen other things - and yet you don't teach any of this. You teach "Look in the direction you want the bike to go, not the direction the bike is currently going, and it will go there", and the student's body naturally does almost all the right things (the role of the teacher is then to identify the one or two imperfect things and correct them). The student doesn't even realise - and is usually quite skeptical when you tell them - that their body is unconsciously pointing the handlebars left when they make the bike turn right and vice versa!
I don't think this is easy to explain in terms of data-efficiency alone - after all, the student isn't generalising from a very small number of examples, they're somehow obtaining the right answer despite no examples, no direct instruction, and a very counterintuitive mechanism whose workings they clearly can't reason out.
I think it's possible that, in some sense, people have *always* been able to ride bicycles and motorcycles, without instruction, even before these technologies existed:
Imagine an absurdly sci-fi car of the sort imagined in 1950s retrofuturism, with a bewildering array of knobs, dials, switches, and gauges, but no steering wheel, accelerator, clutch pedal, etc. etc. You would expect that a normal car driver wouldn't be able to drive this car - but if you can show them two buttons that effectively turn an "imaginary" steering-wheel clockwise and counter-clockwise, a knob that represents the angle of an imaginary accelerator pedal, a switch that effectively depresses an imaginary clutch pedal, etc. they might be able to learn to drive your technocar far quicker than they originally learned to drive the first time around - the "how-to-drive circuits" are already in their heads, they're just hooking them up to new inputs and outputs.
Similarly, I think it's possible that such circuits might exist for learning to ride bicycles and motorcycles. (I couldn't say *what* circuits - the constant microadjustments we make with our feet and ankles to enable us to stand upright on feet that would otherwise be too small a base to be stable? The way we automatically lean into the wind or into the upwards gradient of a slope? The target-fixation that once helped us chase prey?)
If such circuits do exist within us one way or another, if some large part of the training process is actually about hooking-up existing circuits to new I/O, and if any of these circuits are super-complicated biological-evolution-scale products that we can't just program into AI, it would seem that we have an advantage over the AI in learning to drive entirely separate from any superior data-efficiency?
(I think there are potentially data-efficiency explanations - for example perhaps the student is using their observations of *other people* riding bicycles and motorcycles as training data - all I claim is that "it's data efficiency" doesn't seem anywhere near as certain to me as AIFP present it!)
2) "[Paraphrasing] In every field there are exceptional humans who are hailed as geniuses that represent the limit of human potential, only to be dethroned by subsequent supergeniuses and so on": I don't think AIANT are claiming that there will never be minor, diminishing-returns improvement to AIs of the sort we see with human athletes and intellectuals, where (say) the latest-generation of athletes is able to run a mile 4 seconds faster than the previous one, the next generation is able to run a mile 2 seconds faster, etc. - rather, AIANT is claiming that this sort of convergent-series improvement is possible but unbounded exponential improvement is not, just as human athletes will continue to get faster by smaller and smaller margins but that will never become a runaway* process.
(* Sorry.)
I do think it might be possible, even likely, for recursive self-improvement to make AI intelligence growth exponential and self-sustaining - it's just that all the examples AIFP cites (human intelligence, athleticism, even machine size) actually *do* seem to have a limit somewhere around the current level, just as AIANT describe. USS Nimitz isn't *exponentially* bigger than the Titanic; Einstein wasn't *exponentially* smarter than Newton; etc.
I think a better argument against AIANT here would be to show (if possible!) that AI improvement works *differently* to athleticism, intelligence, machine size, etc.: that the former depends on things like "how many transistors can we fit inside a building" which have a theoretical bound much farther above the current level than things like muscle-density or the human connectome or the tensile strength of steel or whatever.
nb. For machine size, I don't deny that we may eventually have moon-sized space stations and solar-system-sized Dyson spheres and stuff - but I think they will be a discontinuous, entirely separate technology that doesn't depend on the scale of earlier machines. I don't think we'll continuously scale-up our lorries and locomotives and bucket-wheel excavators until we get to Dyson spheres. (But if we did it would be super-freakin'-cool and 8-year-old me would have very strongly approved of this direction for civilisation.)
3) (Very minor irrelevant side-point here...) "AI is the fastest-spreading technology" - maybe, but I don't think ChatGPT's "time from launch to X users" is evidence of this. Even if we entirely side-step the debate about whether the public launch of ChatGPT represents an entirely new technology or a particularly well-marketed release of a long-in-development technology, shouldn't "speed of spread" be measured as a proportion of the overall population rather than as an absolute number of people?
Otherwise A) some primitive technology like the wheel/language/sharpened stone, which maybe reached total population saturation very quickly and then just spread slowly with population growth, looks much less revolutionary* than it actually may have been, and B) AI may well be overtaken by some trivial future space-yo-yo or superintelligent tamagotchi that spreads through playgrounds of trillions of in-silico schoolkids overnight; this doesn't seem like a good way of framing the relative importance or future relevance of each technology!
(* Especially the wheel. Sorry again.)
(And anyway, where did we collectively get to on that COVID lab-leak theory? Or how fast do genome-edited bacteria multiply? Is it possible that the fastest-spreading "technology" ever is actually some engineered organism?)
1. I agree that people start out with many car-driving-relevant skills. I don't know if I'd say that this is just genetic - I think they're extending pre-existing knowledge about the world. I'm not really sure what knowledge - maybe there's some unexpectedly deep connection between walking and driving, or something [EDIT: also, driving simulation video games, like Mario Kart!]. But I think of this as something like "by the 1000th task, there's a lot of transfer learning from the previous 999 tasks". I expect this to be true of AI as well - at least of the sorts of future AIs that have good data efficiency. If there's some general reasoning AI that we've already taught to "walk" as a humanoid robot, it might have the same advantages learning to drive as a human teenager.
2. I think the issue isn't just that humans improve by small amounts, it's that they improve by exactly the small amount you'd predict from the population size and the shape of the distribution.
So for example, Einstein was *much* better at math/physics than the average person - not just a small amount. But if we look at the best mathematician in a group of N humans, as we gradually increase N, we'll go all the way from the average person to (when N reaches about the size of the world population) Einstein. To me that suggests that we're limited by some kind of process where we wait for the normal variation within humans to reach a certain level - not by any kind of cosmic speed limit. If we were near the cosmic speed limit, we'd expect clumping near the limit. (A toy sketch of this sampling picture follows after point 3 below.)
3. The chart only shows technologies within the past ~20 years, which hasn't been enough time for population growth to really matter, so I think you could trivially convert the vertical axis to "percent of world population". I don't think there's any chance earlier technologies spread faster - AFAIK the wheel took millennia to spread from one region to another.
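Here's a minimal toy sketch of that sampling picture from point 2, in Python, assuming purely for illustration that ability is a single normally-distributed trait (a huge simplification, obviously):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: "ability" is one normally-distributed trait, and each era's top
# genius is simply the best draw out of a population of size N.
for n in [1_000, 100_000, 10_000_000]:
    peaks = [rng.standard_normal(n).max() for _ in range(5)]  # a few trials each
    print(f"N = {n:>10,}: best of N is roughly {np.mean(peaks):.2f} SDs above the mean")
```

The observed peak keeps creeping upward as N grows - no hard ceiling is needed to explain why every era has its celebrated geniuses - but it creeps up only very slowly, roughly like the square root of the log of N.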
Thanks for the reply!
1. I agree that it isn't clear that the pathways and neural architecture and what-have-you that make us good at learning to drive (or to walk, or to talk) are genetic - I just think there's a decent argument for their being at least partially genetic in ways that, if true, would seem to give us a learning advantage over "architecturally-neutral" AIs separate from our respective data efficiencies.
I admit that if we do learn driving quickly mostly because of super-data-efficient skills transfer, and not because our architecture is (accidentally) highly optimised for tasks like driving, then once AIs reach some baseline level of motor skills (pun fully intended), the time it takes them to learn to drive should shrink towards ours as their data efficiency approaches ours.
2. I agree that if human intelligence had a cosmic limit we'd see clustering close to the limit - but, well, don't we? Doesn't it seem like e.g. Aristotle (world population 50 million), Eratosthenes (pop. 60 million), Huygens (pop. 500mn), Euler (pop. 1bn), Gauss (pop. 1.5bn), Einstein (pop. 6bn) and Alexander (pop. 8bn) are probably all in roughly the same sort of general league, intelligence-wise? It certainly doesn't feel like the later geniuses are *so much* smarter than the earlier ones despite a seriously colossal increase in their populations.
(I do think that e.g. Einstein probably was measurably smarter than Eratosthenes - he probably had a better education and less lead in his tea - I just don't think he was 1000-times-the-sample-size smarter; a rough back-of-envelope after point 3 below makes that concrete...)
If you drew an intelligence scale with a severely developmentally-impaired person at one end and Agent 5 at the other end, would you not expect to see humanity's top geniuses clustered together distinctly irrespective of the size of the populations they came from?
If humanity one day colonises the galaxy and there are quintillions of people, would you expect to see a few of them born naturally smarter than Agent 5?
3. I'm still a bit skeptical that Netflix and even ChatGPT count as "technologies" - but I admit you're likely right about earlier technologies spreading more slowly nevertheless. I certainly didn't realise the wheel had such a slow roll-out.
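That back-of-envelope, under the same toy normal model as the sketch above (so take it as illustrative only): the expected best of N independent draws sits roughly sqrt(2 ln N) standard deviations above the mean.

```python
from math import log, sqrt

# Expected best-of-N under the toy normal model, using two of the figures above.
for era, n in [("Eratosthenes' world", 60_000_000), ("today's world", 8_000_000_000)]:
    print(f"{era}: best of {n:,} draws is ~{sqrt(2 * log(n)):.1f} SDs above the mean")
```

That's about 6.0 vs 6.8 standard deviations: a ~130x increase in population buys well under one standard deviation at the very top. So roughly-level geniuses across eras is what the pure sampling story predicts too, and the observation doesn't by itself distinguish it from a hard ceiling.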
> humanity's top geniuses clustered together distinctly irrespective of the size of the populations they came from?
I don't think this is a good argument until we start bypassing the raw physical limit on brain matter that the pelvis imposes.
When uterine replicators are a thing, and von Neumann the IXth comes out with a 2x head - if *that* genius is still close to Einstein and Newton and Eratosthenes, THEN maybe you'd have an argument that there's a limit to intelligence.
As is, we know there's a raw physical limit imposed on all humans.
I think we're in complete agreement here!
I agree that "pelvis size" (or more generally "current biologically-possible structures for the human connectome"..) is a likely cause for the limitation on human intelligence, and I agree that engineering people to have bigger heads (that can still fit through a pelvis.. maybe make them torpedo-shaped..?) would raise the limit.
I don't think that torpedo-headed-Von-Neumann* would happen by chance given a big enough sample size, though; I think we're in a local maximum and would need to intervene directly in human physiology if we wanted to make THVN.
(* ...dot Tumblr dot com.)
Of course this isn't a true "cosmic" limit on intelligence - but I'm not claiming that it is, just that our current observations show intelligence (and athleticism and machine size and...) to have diminishing returns and an effective upper limit somewhere vaguely around the current level, regardless of sample size, which I think can be equally well explained by AIANT's theory as by AIFP's.
I do happen to think AIFP is entirely correct that AI won't be subject to the same physical restrictions and so could probably surpass the observed human intelligence limit (just as Torpedo-Headed Von Neumann could..) but I don't think a claim like "peak intelligence depends only on population size and increases without bound as population size increases, therefore there is no observable intelligence limit" is the right way to argue for this.
I’m only 5/7 of the way through this but have a comment on section 1, Scott’s comments on AIANT’s first point. I’ll start with the TL;DR: WTF is up with the AIANT people?
Scott is delightfully smart, but it seems to me his smarts aren’t even needed to see what’s wrong with AIANT point 1. All that’s required is for someone to think about the AI use they know of, and apply a little common sense. Seems to me the only way someone could think at this point that AI will be slow to diffuse would be if their picture of how it would diffuse is so rigid that whenever they think about the question of how widely used it is, they go down the same hinky little list: Is it currently in use for prediction of crime, or by insurance companies for prediction of health? No. Are AI-based devices getting waved past the FDA’s usual supervision of medical devices? No. Is the EU laid back about it? No.
Scott blows their Point 1 to pieces by pointing out a couple dramatic and well-publicized demonstrations of AI’s power to capture the attention of millions, & its use in seats of power. He adds some easy-to-find (and easy-to-guess) stats about the high level of AI use by various professionals. And then he shows the reader a coupla devastating graphs. Surely the AIANT people already knew all the facts he marshals and have seen many such graphs. So are their minds so profoundly bureaucratic that it never occurred to them to consider the info scattered all around them, which suggests that AI integration into life in the present era has been and will continue to be fast, not slow? Or are they treating this like a debate-club debate, where they come up with the best arguments they can even if they’re sure the point they’re assigned to defend is wrong? Or, um... I can’t even think of a third possible explanation.
I'm increasingly updating towards Eliezer Yudkowsky's position, that AI x-risk is so scary for people that they struggle to acknowledge that it's real, and prefer to tell themselves reassuring stories as "cope" instead.
Probably lots of people experienced such denial early on in COVID. Reminding people about this could help motivate them to avoid making the same mistake again. If you weren't sufficiently cognizant of COVID early on, that suggests that you may not be sufficiently cognizant of AI risk.
There appears to be some relevant psychological literature under the keyword "denial". Unfortunately, it seems to target individuals more than groups.
Your life has not materially changed because of AI.
Well, slightly (I don't know what you count as "materially"). There are questions I can ask of ChatGPT, Claude, and Gemini whose answers are _not_ easy to find by pre-AI web searches. E.g. when I asked a vague question like "Is there a correction to Coulomb's law once one gets close enough that QED corrections to screening of the "bare" charge matter?", I got pointed to https://en.wikipedia.org/wiki/Uehling_potential , which would _not_ have been easy to find by pre-AI web searches, not knowing the name of the potential.
EDIT: Now, I _don't_ mean to imply that the current LLMs are reliable. I've been asking the same set of seven chemistry and physics questions for a year or so (which I would expect a college senior in chemistry to get fully correct), and no "Ph.D.-level" LLM has gotten all of them right. Nonetheless, they are right often enough to be useful, and they _have_ improved a lot over the year.
Has Arvind or Sayash responded, or are they expected to respond to this?
"We think that sometime in the next 2 - 10 years, AI will enter a recursive self-improvement loop that ends with models capable enough to render all of their “well it can’t possibly do this” calculations moot."
How would an AI know what to self-improve? Is open-ended recursive self-improvement possible when agency is defined by task completion? Unless I missed something, the timeline and takeoff forecasts assume so but don't seem to justify it.
"Redditors are already telling each other to skip the doctor entirely and go straight to the source. “ChatGPT is a shockingly good doctor”, says one heavily-upvoted post. “Seriously, this is life changing”. "
And there are people who say you should go see the local witch doctor, Scott. So what?
When are you going to seriously confront the fact that most AI maximalists are sad and lonely men who have spent much of their lives reacting to their own low social value, particularly in the sexual marketplace? I keep being told by the increasingly unhinged rationalist set that this is an off limits observation, which is weird, because it's precisely the kind of provocation rationalists used to treat as inherently more serious.
Even granting that most AI maximalists are sad and lonely men with low sexual marketplace value, the author of *this* post is married with kids!
> "most AI maximalists are sad and lonely men who have spent much of their lives reacting to their own low social value, particularly in the sexual marketplace"
Possibly there's some truth in this (certainly is in my case!) but if anything this seems like a pretty good argument for AI being transformative?
The computer, the internet, the motorcar, the bicycle, the aeroplane, the factory, the telecommunications network, like half of modern medicine and science, &c. &c. - think how different civilisation would look if it weren't for weird obsessive socially-maladjusted engineers building freaky stuff nobody asked them for.
Yup, or, at least, "the proponent of this invention is sad and lonely, therefore the invention will not work" is a really lousy way to make a prediction about the invention.
It's always socially undesirable men who struggle in the sexual marketplace who are most intensely drawn to living in fantasy, if that's not clear.
Again, this is PRECISELY the kind of impolite but intuitively relevant argument that rationalists have been complimenting each other for making for a couple decades, but they don't like it because they are now the ones on the other end of the microscope.
> "It's always socially undesirable men who struggle in the sexual marketplace who are most intensely drawn to living in fantasy"
I'm pretty doubtful about this! In my personal experience, I know far more beautiful, charismatic, and sexually-successful women who believe that crystals have magic powers, plants talk to them, and distant celestial bodies essentially run their lives than I know ugly unsuccessful men with similar-level delusions.
...but! Just supposing for a moment that your claim were true:
Suppose the base rate of people drawn to living unproductively in fantasy is 0.1%, and the rate amongst unhappy lonely men is 15% - fine, this would explain your observation. But suppose the base rate of people productively highly-driven to change the world in practical realistic ways is also 0.1%, and the rate amongst unhappy lonely men is another 15%. (This doesn't seem unreasonable - unhappy lonely men do seem to have a particularly strong incentive to want to change the world!) In this way, your observation could be true (though I personally don't think it is) and yet entirely compatible with there being more-than-expected transformative technologies developed by unhappy lonely men.
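To put rough numbers on that (and to be clear, the 10% share of "unhappy lonely men" below is just another invented figure for illustration):

```python
population = 1_000_000
lonely_men = 100_000                  # invented: say 10% of people are "unhappy lonely men"
others = population - lonely_men

base_rate, lonely_rate = 0.001, 0.15  # 0.1% in general, 15% amongst unhappy lonely men

# Apply the same pair of rates to both hypothetical groups.
for group in ("unproductive fantasists", "productive world-changers"):
    from_lonely = lonely_men * lonely_rate
    from_others = others * base_rate
    print(f"{group}: {from_lonely / (from_lonely + from_others):.0%} are unhappy lonely men")
```

Under these made-up rates, "most of them are unhappy lonely men" comes out true for the fantasists and for the world-changers alike (about 94% each), so the observation can't tell you which group AI maximalists belong to.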
"The reasonable man adapts himself to the world around him; the unreasonable man persists in trying to adapt the entire world to himself. Hence, all progress depends upon the unreasonable man" --Bernard Shaw
This slop is basically a hallucination. Language models are not AI.
“You know that, and I know that, but does the bear know that?” If people in positions of power and influence are increasingly using it to make decisions and learn about the world, and it’s getting better and better at both of those tasks, with no end in sight to either of those claims, why does it matter what we call it?
https://open.substack.com/pub/notgoodenoughtospeak/p/my-chat-with-chatgpt-abridged?r=3zzc32&utm_medium=ios
A bit of a nitpick.
You say of AIANT, "They admit that by all metrics, AI research seems to be going very fast. They only object that perhaps it might one day get hidebound and stymied by conformity bias". I think this is misrepresentation of their position. They say,
"The production of AI research has been increasing exponentially, with the rate of publication of AI/ML papers on arXiv exhibiting a doubling time under two years. But it is not clear how this increase in volume translates to progress"
and
"Is the current era different? Although ideas incrementally accrue at increasing rates, are they turning over established ones? The transformer architecture has been the dominant paradigm for most of the last decade, despite its well-known limitations[...] This leads to an 'ossification of canon.' Perhaps this description applies to the current state of AI methods research"
I read this as them saying AI research is already hidebound and stymied by conformity bias, and that it's unclear whether it's actually going very fast, as opposed to there merely being more AI papers. You have a good objection on the "it's just AI papers" front by citing better metrics in the footnote, but I don't think you really addressed their point about the AI field being too conformist and hidebound to achieve progress.
Overall, though, I really liked this article. Maybe you and the AIANT people could arrange a longform debate or conversation or something.
I don't know if I would call the continued use of the transformer "hidebound"; it keeps working!
This is why we linked the Epoch page on algorithmic progress. It shows that, holding compute constant, AI efficiency doubles every ~year. That's an output, not an input! If AI technology is getting better that quickly, then it's no shame not to have changed one particular part of the paradigm - it suggests that changing that part isn't necessary for rapid gains. It's like pointing to wheeled vehicles - from primitive oxcarts to bullet trains - and saying that it must be "hidebound" because the wheels are still the same shape.
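(Just to spell out what a ~yearly doubling compounds to, taking that cited rate at face value and ignoring hardware gains entirely: the compute needed to reach a fixed capability falls as below.)

```latex
C(t) \approx C_0 \cdot 2^{-t} \quad\Rightarrow\quad C(8\,\text{years}) \approx C_0 / 256
```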
>It's like pointing to wheeled vehicles - from primitive oxcarts to bullet trains - and saying that it must be "hidebound" because the wheels are still the same shape.
That's fair. ( And, at the low level, if someone were to complain that the AI systems were still using field effect transistors, that would be ridiculous. )
Still, I'd be happier if the data efficiency of LLM training were also improving at a rapid clip. It is possible that, e.g., in addition to perceptron layers and attention layers, there might be one or more additional types of processing layer that would make data efficiency (and possibly incremental learning) much better. Multi-layer perceptrons are theoretically complete (if wide enough and deep enough), yet attention layers _did_ help.
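For concreteness, here's a minimal, purely illustrative PyTorch sketch of the two existing ingredients (an attention layer plus perceptron layers) and where a hypothetical third kind of layer would slot in:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Minimal, illustrative transformer block: attention + a two-layer perceptron.
    A hypothetical extra layer type (aimed at data efficiency / incremental learning)
    would be a third residual sub-block alongside these two."""
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(                 # the "perceptron layers"
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)          # the "attention layer"
        x = x + attn_out                          # residual connection
        x = x + self.mlp(self.norm2(x))           # residual connection
        # <- a hypothetical new sub-layer for data efficiency would go here
        return x

x = torch.randn(2, 16, 64)                        # (batch, sequence, features)
print(Block()(x).shape)                           # torch.Size([2, 16, 64])
```

Nothing in this standard block is specifically aimed at data efficiency or incremental learning; that's the gap a new layer type would be trying to fill.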