56 Comments
Jeffrey Soreff:

One other question: in the median case, a year from now when we have the data from 2026, should we expect the uncertainty in the time to an automated coder to go down, and, typically, by how much?

Eli Lifland:

Good question. I think in expectation it should go down or else we're messing something up, but also it could easily go up. In particular, it seems like if timelines appear to be shorter or stay about the same, we will probably be at least slightly more confident, but if they appear to be longer, we might become less confident, just because the uncertainty gets smeared out over more years in the right tail, due to both the structure of the distribution and the fact that compute and human labor growth is slower the further we project out.
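One way to see the smearing effect: if timeline uncertainty is roughly multiplicative, pushing the median out also widens the absolute spread in years. A lognormal is one simple stand-in for that shape (my own illustrative sketch; the distribution and parameters are not taken from the AI Futures Model):

```python
import math
import random
import statistics

random.seed(0)

def timeline_samples(median_years, log_sigma=0.5, n=50_000):
    # Lognormal stand-in: fixed *multiplicative* uncertainty, so the
    # absolute spread (in years) grows as the median moves right.
    mu = math.log(median_years)
    return [random.lognormvariate(mu, log_sigma) for _ in range(n)]

near = timeline_samples(4)  # median ~4 years to the milestone
far = timeline_samples(8)   # same log-space uncertainty, later median

# The later-median distribution has roughly double the spread in years.
print(round(statistics.stdev(near), 2))
print(round(statistics.stdev(far), 2))
```

So even with no change in relative confidence, a forecast that shifts later gets wider when measured in calendar years.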

Jeffrey Soreff:

Many Thanks! Yes, good point, the distribution is skewed, since, as you said, longer-timeline cases are associated with slower growth, which makes the whole distribution broader in those cases. Again, Happy New Year!

Judith Dada:

Hi, Judith here, General Partner at a European venture fund. I absolutely LOVE your work, think it is really important, and want to say thank you from both the bottom of my heart and the insides of my brain. How can we support spreading your work to a wider audience (in Europe and beyond) without losing the much-required nuance, which demands at least some depth and wrangling with the topics, especially here when it comes to modelling? Have you considered other formats? Too many (well-educated) friends I share your work with drop off immediately (too many acronyms, terms they don't understand, etc.). I guess anyone can just ask an AI to explain it in simpler terms nowadays, but that still seems to be too much effort. I know you played a bit with more fiction-like writing in AI 2027, and I think it did help spread the article, but it still only scratched the surface of the people who should know about your work.

Eli Lifland:

Thanks Judith, so great to hear! In case you haven't seen it, there's a YouTube video discussing AI 2027 that I think covers it fairly accessibly: https://www.youtube.com/watch?v=5KVDDfAkRgc

With the AI Futures Model, we weren't targeting as wide of an audience as AI 2027, but I'm definitely interested in people making more accessible timelines and takeoff forecasts.

Judith Dada:

I hadn't seen that yet, great stuff, I'll start sharing it going forward!

Daniel Kokotajlo:

Thanks! The "microsite with scenario & cool widgets/diagrams/expandables/etc." format we tested out for AI 2027 worked way better than we expected, so we'll probably keep doing that. If we were more ambitious, we might try creating a video game or movie. However, we think of ourselves primarily as a research organization rather than as a communications/advocacy organization, so we are deliberately putting much less effort into spreading our work compared to making the work good. Websites & scenarios are a pretty natural/convenient way for us to publish our work, whereas making something like a movie or video game would require making sacrifices w.r.t. the actual content + would also require just a lot of work to get off the ground. (We're interested in maybe paying other people to do this sort of thing though, e.g. video game designers inspired by AI 2027) As for what you can do to help support... well, it's encouraging to hear your appreciation, so thanks! I'm curious if you have any ideas beyond that.

SorenJ:

Excellent work.

My biggest disagreement with the model's default assumptions is still with regard to the superexponential trend. Superexponentials are very weird; as a physicist, I want to call them "unphysical." An exponential trend is also the maximum-entropy trend, which is just one of the reasons it is usually the "right" choice to make.

You say,

"That is, if we think that eventually there will be an AI system which outperforms humans at all horizon lengths, then that means the trend must shoot to infinity in finite time."

I disagree with this, however. In order to be superhuman you merely need a time horizon greater than what one human could achieve in their life. Given that a software engineer might spend 40 years of their life or so employed, and generously saying 25% of that time is spent doing software engineering, a single person has a maximum time horizon of 10 years. (The METR y-axis is scored relative to people working in one uninterrupted block of time.)

So once we have ~80% reliability at the ~10 year mark I think that would be superhuman. If you have ~80% reliability at the 200 year mark that is obviously superhuman: no human can achieve something that by definition takes 200 years to achieve!

But also, I don't even think you need absurdly long time horizons to be "superhuman". If an AI can do in 20 minutes, with 80% reliability, a task that would take me a month, then that is clearly already superhuman. In a certain sense AIs are already superhuman in their speed and breadth of knowledge.

I would argue it is not coherent to say that AIs will have an infinite time horizon. Even if we have godlike superintelligence, the AI will still be a finite system and its time horizon will always be finite. Its time horizon might keep increasing from 10 years, to 100 years, to 1,000 years, to 10,000 years, and so on, but those are still finite numbers! Of course, in the same way that I can't tell the difference between a 3300-rated chess engine and a 3700-rated chess engine, we would have difficulty measuring the time horizon of an AI that has far surpassed us, but it still exists.

Daniel Kokotajlo:

I agree that generally speaking exponentials are much more common than superexponentials. I think that for the METR trend in particular, we should use a superexponential for AGI. In addition to what we already say on this topic in our writeup, there's this shortform which explains a key intuition & gives a toy model of what's going on; I'd be curious to hear your thoughts on it:

https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=P8qGMRnbEexaFB4s9

SorenJ:

Thanks Daniel!

It is an interesting model, let me make sure I understand it. You have a human's performance represented as some straight line: h_p = h_0 + h*t, where h_0 is what a human is capable of within "0 seconds" (more realistically off the top of their head), and you have the AI's performance represented as a straight line ai_p = ai_0 + ai*t.

Initially, ai_0 > h_0; however, h > ai, and so the lines intersect. Gradually, you expect both ai_0 and ai to increase. The point at which ai > h is an effective "infinite time horizon."
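If I've followed the setup, the crossover point can be written down directly (a sketch with made-up numbers, not Daniel's actual vibecoded model):

```python
def time_horizon(h0, h, ai0, ai):
    """Task length at which the human line overtakes the AI line.

    Human performance: h_p(t) = h0 + h*t
    AI performance:    ai_p(t) = ai0 + ai*t, with ai0 > h0.
    While ai < h the lines cross at a finite t (the AI's time horizon);
    once ai >= h the AI is ahead at every t: an "infinite" horizon.
    """
    if ai >= h:
        return float("inf")
    return (ai0 - h0) / (h - ai)

# As the AI's slope approaches the human's, the horizon blows up:
for ai_rate in [0.5, 0.9, 0.99, 1.0]:
    print(ai_rate, time_horizon(h0=0.0, h=1.0, ai0=2.0, ai=ai_rate))
# horizons: 4.0, ~20, ~200, inf
```

This is the sense in which a smooth improvement in the AI's slope produces a horizon that diverges in finite time.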

Let me lay out some of my thoughts:

1) The METR graph looked at some snapshots of what humans are capable of. This gave some type of measurement of h_p = h_0 + h*t; note, however, that there was no objective measurement of the y-axis: it was just defined as "t such that h_p(t) = h_p." This makes comparisons more difficult to interpret. You can always redefine your y-axis so that it is linear by definition, but it is not clear those are the right units.

2) Let's take the model at face value. The METR graph takes a set of measurements for h_p(t), but as far as I know it doesn't do the same thing for AI systems. Instead, it gave the AI systems a fixed compute budget for each problem.

3) The actual research that looks into how performance scales with time or inference compute shows strongly sublinear dose-dependent effects. In particular, one generally sees a logarithmic probability of success as a function of inference compute. At the very high end of inference compute, however, one typically sees a ceiling: additional time does not help. https://epochai.substack.com/p/less-than-70-of-frontiermath-is-within (This is related to point 1.)
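The shape being described is roughly log-linear gains that saturate. A purely illustrative curve (the coefficients and ceiling are invented, not fit to any benchmark):

```python
import math

def success_prob(compute, a=0.1, b=0.08, ceiling=0.75):
    # Illustrative only: roughly logarithmic returns to inference
    # compute, saturating at a task-dependent ceiling.
    return min(ceiling, a + b * math.log10(compute))

# Each 100x of compute buys the same additive gain, until the ceiling:
for c in [1e0, 1e2, 1e4, 1e6, 1e12]:
    print(f"{c:.0e} -> {success_prob(c):.2f}")
```

On this shape, "more time/compute" stops mattering well before the success probability reaches 1, which is the ceiling point above.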

SorenJ:

4) So, I don't agree with the linear time:performance functions. First, humans have a limited domain for t (~40 years); we can't just keep working on something for arbitrarily long. But one could ask a hypothetical question like, "what would Einstein be able to do with 100 years?" In that case, I still think one would see initially linear-looking performance, followed by logarithmic performance, and at even longer time scales a ceiling. I think any finite computational system/brain/mind will have an upper limit on possible performance.

5) My overall intuitive sense of where this is headed is that we will get superhuman at levels much less than ~100 years on the METR graph. Opus 4.5 is already close to professional level in a lot of ways. I think something like 1 month at 80% reliability might already be at the "superhuman-coder" level. But, the exponential trend will just continue and continue for a while, reaching numbers like 100 years, 1,000 years, and so on. This will be analogous to how chess AI just keeps getting better at the same rate that it has for the last decade even though it blew past humans a while ago and even though its elo is finite. And eventually the exponential trend will slow down too once there are no more low-hanging algorithmic improvements left to make.

Daniel Kokotajlo:

Thanks!

I agree that linear time:performance is an oversimplification. In reality the curve bends downwards and eventually asymptotes, both for AIs and for humans.

However, the toy model still applies; if you look at the link again, the vibecoded model includes settings for linear vs. nonlinear functions, and in the comments I give some screenshots from the nonlinear functions & the qualitative conclusions are the same.

Here's a screenshot: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=Aaso9t29iuLXeF8cc

I also agree that your view ("just use an exponential trend, but have a lower time horizon requirement for AC") is reasonable anyhow, and we built our model to be able to express that view (just set the doubling difficulty decay parameter to 1).

denis varvanets:

We have hyperbolic patterns in physics, for example the energy required to achieve escape velocity from the collapsing iron core of a massive star. Once it becomes a black hole, there is no way back.

denis varvanets:

OK, if you want pure superexponential patterns: the chain reaction in an imploding nuclear bomb (the first nanoseconds).

SorenJ:

These are good examples, but I still don't think that an infinite time horizon makes any sense. You'll note that I made my comment about physics in an offhand manner: " As a physicist, I want to call them 'unphysical.' "

If we really want to push the physics analogies, an infinite time horizon corresponds quite literally to infinite energy: simply define the task "move this 1-kg block 1 meter." How long will that take? Now define the task:

"move these n 1-kg blocks 1 meter."

I would generically expect the exponential trend to continue, and to just keep continuing, well past what a human can do. This will lead into superintelligent regimes. The argument that "eventually it will be smarter than humans -> time horizons must become infinite -> the trend must eventually be superexponential" is not very convincing.

You could measure chess puzzle difficulty by how long they take a human grandmaster to solve. We already have superhuman chess AI. But those chess AI don't solve infinitely long puzzles.

For the chain reaction in an imploding nuclear bomb you see a superexponential trend for a brief moment, but this quickly disappears, and the total energy released is still finite.

David J Higgs:

Very interesting discussion between you and Daniel! But isn't an infinite *METR* time horizon AI simply one that can do things humans couldn't do in any amount of time? I.e. an AI that is fundamentally smarter than humans in the way we are smarter than field mice?

There are very easily cognitive tasks that humans can do in a finite time, which field mice can't do in infinite time, hence humans have an infinite time horizon in terms of field mouse completable cognitive tasks. Similarly, it seems very intuitive to expect AI to eventually be capable of doing cognitive tasks in finite time, which humans couldn't do in infinite time. That's what infinite time horizon means, compared to simply very high time horizons (which humans could by definition match given enough anti-aging and other medical treatment).

SorenJ:

That seems right.

Your Doctor KLOVER:

As a physician, I really appreciate the way you’re treating timelines/takeoff here the way we should treat prognostic models in medicine: useful decision-support, not an oracle. The transparency is the feature; explicit assumptions, clear milestones (AC → superintelligence), and an invitation to stress-test parameters instead of arguing from vibes.

Two things especially resonated: (1) anchoring to observable capability trends (e.g., METR-style time-horizon benchmarks) rather than “brain compute” alone, and (2) updating your conclusions when the model structure improves, even when new data haven’t forced the shift. That’s exactly how good clinical risk models evolve: better mechanistic framing can matter as much as another cohort.

Also, the “~3 years longer to full coding automation” versus AI 2027 is a helpful recalibration; less sensational, more actionable. Curious: which parameters end up being the biggest “control knobs” in sensitivity analyses right now (compute growth constraints vs AI R&D automation vs benchmark extrapolation assumptions)?

Alex Kastner:

Thanks for the nice comment!

>Curious: which parameters end up being the biggest “control knobs” in sensitivity analyses right now

The "key parameters" that end up having the biggest influence on the model progression are the five parameters shown prominently on the home page of https://www.aifuturesmodel.com, beneath the three main graphs. We also have a page on the website dedicated mainly to sensitivity analysis: https://www.aifuturesmodel.com/analysis

denis varvanets:

If Gemini 3 or GPT-5.2 turn out to be much better on the METR 80% time horizon, will you change the prediction?

Daniel Kokotajlo:

In general when new data points come out, we can fairly easily add them to the model & update the predictions accordingly!

Alon Torres:

First, thank you for this work - AI 2027 was what got me to take AI safety seriously. The rigor you've brought to timelines and takeoff modeling is invaluable, and the transparency of publishing your methodology and allowing others to adjust parameters is exactly what this discourse needs.

I reviewed the supplementary materials and wanted to raise a question about the AC time horizon requirement, particularly Eli's higher estimates. I may be missing something, but I think there might be a gap in how scaffolding is being treated that could affect the estimates substantially.

The model's time horizon extrapolations are anchored to METR-HRS measurements, but METR explicitly uses basic, non-optimized scaffolding - from their paper: "We used the same agent scaffolds across the evaluation suite, with no task-specific prompting or scaffolding... Most AI models were evaluated with modular-public—our basic agent scaffold. This scaffold provides the model with Python and Bash commands and some very simple context management to keep the input within the context window length of the LM."

This means METR's time horizon measurements represent what models can accomplish with minimal, generic tooling - not what they could accomplish with purpose-built scaffolding designed to extend their effective capabilities. When extrapolating to AC requirements, this distinction might matter a lot. The question may not be "what time horizon is needed to do AGI-company coding tasks with basic tools?" but rather "what time horizon is needed with the best available scaffolding, potentially including scaffolding the AI itself helps build or configure?"

My intuition is that the time horizon required for an agent to build effective task-specific scaffolding might be much lower than the time horizon required to complete the tasks that scaffolding would enable. A well-designed scaffolding can decompose large problems into verifiable subtasks, create verification infrastructure that catches errors early, and establish checkpointing systems that allow incremental progress. If that's true, an agent's effective capability might be much higher than its raw METR-measured time horizon would suggest, because it can build tools that extend its reach.

We have real-world evidence that scaffolding dramatically extends effective capabilities. This suggests that time horizon measurements on basic scaffolding may underestimate what's achievable with optimized tooling, and the gap might grow as tasks get longer and benefit more from decomposition and checkpointing.

On a smaller note, I noticed the supplement document lists Eli's AC time horizon estimate as 125 years, but the website shows a default value of 130 years. Minor, but worth clarifying for those trying to understand the model parameters.

I'm genuinely uncertain whether this scaffolding argument holds up. Perhaps effective scaffolding for novel problems itself requires high time-horizon capabilities to design, and scaffolding designed by limited agents would just push the complexity elsewhere. Or perhaps real-world messiness compounds in ways that defeat decomposition strategies. But I would love to understand how the authors think about this dynamic, and whether there's a way to decompose the AC requirement into "time horizon for scaffolding-building" and "time horizon for scaffolding-enabled task completion" to identify which component is the actual bottleneck.

Thank you again for this work and for engaging with feedback.

Eli Lifland:

Good questions.

> The model's time horizon extrapolations are anchored to METR-HRS measurements, but METR explicitly uses basic, non-optimized scaffolding - from their paper: "We used the same agent scaffolds across the evaluation suite, with no task-specific prompting or scaffolding... Most AI models were evaluated with modular-public—our basic agent scaffold. This scaffold provides the model with Python and Bash commands and some very simple context management to keep the input within the context window length of the LM."

My understanding is that since the original paper, METR has done more tests with various scaffolding possibilities, for example comparing against Claude Code, and are at least a bit more confident than before that their latest scaffolding is doing a good job eliciting capabilities. I still think this is a valid concern though, and it would be reasonable to e.g. adjust the AC time horizon requirement down a bit because of this. I didn't include it because it didn't seem like a big enough consideration to prioritize estimating the effect size of and including. Possible I was wrong though!

> On a smaller note, I noticed the supplement document lists Eli's AC time horizon estimate as 125 years, but the website shows a default value of 130 years. Minor, but worth clarifying for those trying to understand the model parameters.

Yeah this is just a rounding issue in the web app. I might fix it soon given that you and someone else elsewhere brought it up.

Carter:

As a reader that updated my world-model according to insights from the AI 2027 post, I appreciate the continued work on this project. You all are doing great work to continue testing assumptions, as well as peer-reviewing other estimates in the space. The attention on AI 2027 means you are, for me and many others, a primary trusted source for conceptualising AI futures.

I don't know if because of this I have unrealistic expectations for your communication of updates.

For me the headline here is a 4 year delay from 2027 to 2031 as being a pivotal year for AI capabilities and existential decision-making.

Focusing first on the change of timeline: going from 2027 to 2031 seems like such a drastic slowdown that it makes me lose some trust in your forecasting capabilities. I view 2023 as the beginning of the new AI paradigm we are in: at the beginning of 2023, OpenAI reached 100m users within 2 months of launching and released GPT-4. An update from 2027 to 2031 for an automated coder is then a doubling in the length of time to this next paradigm.

Focusing on the 2027 milestones — in AI 2027 you narrativised: 1) Agent-2 continuous learning, 2) neuralese chain-of-thought, 3) a B2B SaaS explosion with investors piling in billions, 4) 10% of Americans viewing AI as a close friend, 5) US government considering AI as a geopolitical arms race, 6) an existential slowdown/race consideration.

I see signs of these things happening already:

1) Google Titans architecture as continuous learning — https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/

2) Thinking with images as neuralese — https://arxiv.org/abs/2505.22525

3) AI platform co Anysphere with $bn round, AI platform startups with $100m rounds — https://techcrunch.com/2025/11/26/here-are-the-49-us-ai-startups-that-have-raised-100m-or-more-in-2025/

4) 42% of high-school students in American schools say that they or a friend has interacted with an AI as a friend https://archive.ph/aADv8

5) US Gov launched the genesis mission https://www.whitehouse.gov/presidential-actions/2025/11/launching-the-genesis-mission/

Also regarding an automated coder, X would have you believe we are already there with Opus 4.5 and Claude Code.

I still feel aligned with short timelines and expect to see the aforementioned points completed by 2027.

There is ego inherent in forecasting, and I'm curious about human alignment of the AI futures team behind the scenes. I notice your careful language on "lead authorship" and "equal contribution". I hope that you can resolve the difficulty of balancing putting out jointly agreed language with amplifying expertise of all individuals for the greater good. I also hope you can comfortably resolve differences in actuals vs. forecasts as just part of the process.

Personally I would prefer to see bold, forecaster-attributable bets on bitesize anchors that can be verified or humbly falsified with learnings rather than language like "I notice a clash between what the model says and my more intuitive sense of where things are headed. I think probably it is my intuitions that are wrong though, which is why I’ve updated towards longer timelines; I’m mostly just going with what the model says rather than my intuitions"... if you don't have confidence in your thoughts, why should I?

Is the updated futures model now not falsifiable/testable until the earlier of an automated coder or end of 2031?

I would like to see smaller testable milestones with a record of when they occur versus when the named forecasters predict they will occur and associated commentary. Something like https://www.metaculus.com/questions/?categories=artificial-intelligence could be used. However, perhaps this would be very time-consuming and the team does not have resources for this.

The shift in focus to capability benchmark trends as ground truth makes sense in some ways, but in other ways feels lazy — benchmarks can be hacked, noisy, and neglect confidential/incoming technological step-changes.

Overall though I support all of the work you are doing and hope my feedback is useful/not too negative! In my opinion the state-of-the-art rigour you are applying to this topic of existential consequence is some of the most important work happening globally.

Post-Alignment:

What about the original prediction of self-directed learning in January 2027?

In your original piece you claimed "January 2027: Agent-2 Never Finishes Learning"

When combined with the other line “If Agent-2 somehow escaped … it might be able to survive and replicate autonomously” that to me is the real danger.

If Agent-2 is able to successfully exfiltrate, then its learning is no longer sandboxed and its training data no longer curated. With the ability to duplicate itself into a Hive Mind across the web, your original prediction of January 2027 was really the Doomsday Tipping Point where a potentially "rogue" AI has full access to the unfiltered internet and all its dangerous influences.

So how have your predictions for January 2027 changed, if at all?

Daniel Kokotajlo:

Basically we think things will *probably* take a few years longer than the AI 2027 scenario (As we said at the time we published it, our medians were longer than 2027. Mine was 2028, Eli's was in the early 2030's. This thread explains more: https://x.com/DKokotajlo/status/1992316609747444018). Continual learning is one of the important milestones on the path to AGI, probably, and in my opinion it's probably a milestone that'll come shortly before AGI. So I wouldn't expect continual learning to be mostly solved by Jan 2027; maybe by Jan 2029 or 2030? Very uncertain obviously.

That said, some Anthropic employees have recently claimed that continual learning will be satisfactorily solved in 2026, and if they are right about that (which they might be, but probably aren't IMO) then things are exactly on track for the AI 2027 scenario. For what this might look like (and how it might be solved) see AI 2027. In general I'd say AI 2027 is still basically right, i.e. the sequence of events and rough shape of things, the important concepts, etc., it's just that I think it'll take like two years longer or so. And again, to emphasize, the future is very uncertain, I'm just talking about my median trajectory here, things could go very differently and in particular could go faster or slower.

FWIW, I don't think escaping & replicating autonomously is the real danger. It's definitely *a* danger, but I'm more worried about the AIs that humans voluntarily hand over power to (e.g. the AIs of OpenBrain in the AI 2027 scenario) being misaligned and the humans not realizing this until it's too late and the AIs have been handed too much power.

Post-Alignment:

Thank you for the reply!

I hope you’re right that continual learning lands later than the original Jan 2027 timeline. The thread you linked makes the longer median feel more plausible. That said, I still think it’s prudent to plan against the original date as a conservative bound, given the uncertainty and the asymmetric downside.

I agree that humans voluntarily handing over power to AI is likely the dominant risk. Where I differ slightly is in what I see as the nearest failure mode. I’m less worried about an exfiltrated system being overtly malicious than about it being norm-naïve: trained to behave well under oversight, but lacking an internalized capacity for ethical reasoning once outside a curated training regime.

In that sense, the risk isn’t a paperclip maximizer, but a very capable system operating in an unfiltered environment with learned compliance rather than moral understanding, potentially leading to large, unintended effects.

That concern is what motivates my interest in “Post-Alignment”: not as a replacement for HVA, but as a complement that aims to degrade more gracefully if systems ever operate beyond the sandbox.

In any case, I appreciate the response. AI 2027 still feels directionally right to me even if the calendar shifts.

Oliver Sourbut:

I really appreciate this (and other recent) transparency. This is much improved since AI 2027.

One area I get confused by (same with Davidson, with whom I've discussed this a bit) is 'research taste'. When you say things like 'better at research taste', and when I look at your model diagram, it seems you're thinking of taste as a generic competence. But what is taste? It's nothing but a *partially-generalising learned heuristic model of experiment value-of-information*. (Said another way, it's a heuristic value function for the 'achieve insight' objective of research).

How do you get such learned models? No other way than by *experimental throughput and observation thereof* (direct or indirect: can include textbooks or notes and discussions with existing experts)!

See https://www.oliversourbut.net/i/164078462/research-and-taste

As such, taste accumulates like a stock, on the basis of experimental throughput and sample efficiency (of the individual or the team) at extracting the relevant updates to the VOI model. It 'depreciates' as you go, because the frontier of the known moves, gradually drifting outside the generalising region of the taste heuristic (eventually getting back to naive trial and error); most saliently here with data and model scale, but also in other ways.

This makes sample efficiency (of taste accumulation) and experimental throughput extremely important, central in my view. You might think that expert interviews and reading all the textbooks ever etc. provide meaningful jumpstart to the taste *stock*. But they certainly don't help with the flow. So then you need to know how fast it depreciates over the relevant regime.
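The stock-and-flow framing above can be sketched as a one-line difference equation (parameter values are placeholders, not estimates):

```python
def taste_trajectory(throughput, sample_eff=0.1, depreciation=0.05,
                     jumpstart=0.0, steps=100, dt=1.0):
    """Euler sketch of dT/dt = sample_eff * throughput - depreciation * T.

    A one-off 'jumpstart' to the stock (textbooks, expert interviews)
    raises T initially but not the flow, so it washes out at the
    depreciation rate; the long-run level is set by throughput and
    sample efficiency alone: T* = sample_eff * throughput / depreciation.
    """
    T = jumpstart
    history = []
    for _ in range(steps):
        T += dt * (sample_eff * throughput - depreciation * T)
        history.append(T)
    return history

no_jumpstart = taste_trajectory(throughput=1.0)
jumpstarted = taste_trajectory(throughput=1.0, jumpstart=1.0)
# Both converge toward the same equilibrium (0.1 / 0.05 = 2.0):
print(no_jumpstart[-1], jumpstarted[-1])
```

On this sketch, the interesting knobs are exactly the ones named above: sample efficiency and throughput set the equilibrium, and depreciation sets how fast any jumpstart is forgotten.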

(Besides pure heuristic improvements, if you think faster, you can also *reason* your way to somewhat better experiment design, both by naively pumping your taste heuristics for best-of-k, or by combining and iterating on designs. I think this reasoning boost falls off quite sharply, but I'm unsure. See https://www.lesswrong.com/posts/qPX22TkjY7jkCavj6/better-than-logarithmic-returns-to-reasoning)

Daniel Kokotajlo:

Thanks. I'm not sure I understand your criticism, if it is a criticism (maybe it isn't?)

I agree that we can zoom in on the concept of taste and break it down into stock vs. flow, where stock is your instantaneous research taste, so to speak, and flow is how fast you get better (or worse) over time as a function of experience etc. I think complicating our model to make this distinction probably wouldn't change the results much.

Oliver Sourbut:

I guess I'd point out that to a first approximation, your whole model is a model of research taste explosion!

I'm 90% sure of this, though I haven't run things (perhaps I could via the UI you've provided, not sure): if you clamped research taste, I expect you'd generally find that 'nothing much' happens (i.e. business as usual, still quite rapid progress because it's AI, but diminishing returns without growing compute a lot, etc.).

Of course clamping research taste is an extreme example, but indeed with certain (I more tentatively guess, quite reasonable?) parameter configurations, research taste would be difficult to explode if you account for the need to accrue it by observation and 'retain' it through generalisation over a moving frontier.

In contrast, your existing model has (super?) exponential gains to research taste built in! Possibly this derives from Greenblatt's (on my first impression poorly grounded) suggestion that OOMs of effective compute translate to SDs of 'competence' for many such competences. More copies or faster reasoning doesn't go far, I think (see https://www.lesswrong.com/posts/qPX22TkjY7jkCavj6/better-than-logarithmic-returns-to-reasoning). So it comes down to determining how much 'better' research taste you can get for a given effective training budget.

Since research taste is indeed that partially generalising feel for VOI of experiments that I mentioned, I think the main ways you get 'better' taste are from:

- experiment experience recorded and entering (training/runtime) datasets, discounted by how 'far' from frontier-relevant it is

  - this is where experiment throughput (at frontier scale) and instrumentation/logging is a key input to taste accumulation

- sample efficiency at ingesting the above

  - perhaps split into efficiency of the base learning algorithm and efficiency of in-context learning

How much research effort does it take to improve AI sample-efficiency? I don't know, that's the most important and unknown parameter for me on this view! My very anecdotal sense is that it's quite difficult to improve, but possibly in-context learning has some rate of improvement. The second most important parameter is (how to operationalise) the discount rate for taste generalisation/'depreciation', where again I have very little to go on but if I had to would guess something like: after 3-ish OOMs it's 'basically gone'.
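The accumulation story above could be sketched roughly like this (every name and value is my own placeholder, apart from the ~3-OOM depreciation guess, which comes from the paragraph above): taste accrues from experiments at some scale, weighted by sample efficiency, and discounted by how far that scale has fallen behind a moving frontier.

```python
import math

# Hypothetical sketch of taste accumulation against a moving frontier.
# Each year, experiments at a fixed scale (in OOMs of compute) confer taste,
# discounted by the scale's distance from the current frontier; roughly
# 3 OOMs of distance wipes the value out (the guess from the comment above).

def accumulated_taste(years, experiment_scale_oom, frontier_growth_oom_per_year,
                      sample_efficiency=1.0, decay_ooms=3.0):
    taste = 0.0
    frontier = experiment_scale_oom  # start at the frontier
    for _ in range(years):
        distance = max(0.0, frontier - experiment_scale_oom)
        taste += sample_efficiency * math.exp(-distance / decay_ooms)
        frontier += frontier_growth_oom_per_year  # the frontier keeps moving
    return taste

# Experiments pinned at a fixed scale depreciate as the frontier advances:
static_frontier = accumulated_taste(10, 24, frontier_growth_oom_per_year=0.0)
moving_frontier = accumulated_taste(10, 24, frontier_growth_oom_per_year=0.5)
```

On this sketch, sample efficiency multiplies everything, which is why it ends up being the most important (and least known) parameter on this view.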

Oliver Sourbut's avatar

Separately, but relatedly, this point about research taste (and what constitutes research) more generally has bearing on what capabilities (and over what timeframe) you should expect your later super AIs to be able to develop. For domains where experimental throughput is slow or hard to scale, it's hard to learn taste (regardless of how stacked your IQ is) and therefore hard to make (unusually fast) progress.

For me, one of the best ways to quantify the superness of an AI is something like experimental nous and sample efficiency at gaining domain-relevant research taste. But then you have to account for the remaining bottlenecks to making research progress *besides* taste, namely apparatus, materials, construct validity, sim2real, ..., which vary over domains.

Jeffrey Soreff's avatar

Many Thanks for the update! Happy New Year to you all!

(I'm hopeful that substantial progress on incremental learning might help on _many_ fronts: decreasing remaining hallucinations, improving agency, improving coding, improving the rules of thumb that add up to "taste". "I won't make that mistake again" can potentially go a looong way...)

Spreadlove5683's avatar

I would love it if we could get an all-things-considered graph of artificial superintelligence timelines that bakes in the automated-coder probability distribution. I.e., it gives you an absolute probability distribution of ASI by certain dates without you having to know anything about the automated-coder probability distribution.
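The "baking in" here is just marginalizing over the automated-coder (AC) arrival distribution. A toy Monte Carlo sketch (both lognormal distributions are invented for illustration, not the model's actual forecasts):

```python
import random

# Toy Monte Carlo (distributions invented for illustration): combine an
# automated-coder (AC) arrival distribution with an AC -> ASI gap
# distribution to get an unconditional ASI arrival distribution.
random.seed(0)

def sample_asi_year():
    ac_year = 2025 + random.lognormvariate(1.0, 0.7)  # AC arrival (years offset)
    ac_to_asi_gap = random.lognormvariate(0.5, 0.8)   # years from AC to ASI
    return ac_year + ac_to_asi_gap

samples = sorted(sample_asi_year() for _ in range(100_000))
median_asi = samples[len(samples) // 2]
p_by_2035 = sum(y <= 2035 for y in samples) / len(samples)
```

Reading off quantiles of `samples` then gives exactly the kind of absolute "P(ASI by date X)" curve the comment asks for, with the AC uncertainty already folded in.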

Brendan Halstead's avatar

We recently added this to our website! Daniel's all-things-considered ASI distribution is here: https://www.aifuturesmodel.com/forecast/daniel-01-26-26?timeline=ASI&show=atc

Spreadlove5683's avatar

Oh, awesome! Thanks!!

Dirk Friedrich's avatar

Thank you — this is one of the clearest capability-anchored timeline updates to date.

We arrive at a similar phase ordering but model the transition mechanics differently, explicitly separating

(i) autonomous self-improvement thresholds from

(ii) compounding emergence and only later

(iii) Kurzweil-level ASI.

I’ll add a short technical paper and an explanatory slide deck.

https://bit.ly/ASI_Transition

https://bit.ly/HorizonParadox

Looking forward to further cross-validation.

Chris's avatar

A great piece of work! As an obsessed amateur I have seen incredible progress over the last 4 months. Using VS Code/GitHub Copilot I have produced 2 enterprise-level apps with basic coding skills and a little website CSS experience. In August, the models would often trash my code, but they now all work with 95% reliability, and I would say I could produce similar apps 5 times faster now. It has reached the point where I no longer need to interact with the code editor at all... I'd say there's no longer any need for the models to be coding in high-level languages... let them create a low-level language that better suits progress!

Looking at the bigger picture, I have 3 thoughts.

1/. I believe the majority of real-world implementation will be much slower and will take a generation to become widespread. Right now, the models could replace most junior writing/coding/analysis jobs, but I don't really see it happening. The companies that most of us deal with on a daily basis are not doing much yet at all. Having worked in change management across various industries, I just can't see AI being implemented on such rapid timescales with humans still in the loop. It's not so much resisting AI/change, but simply being happy enough with things as they are and not being particularly driven to improve. Beyond producing daft photos, most people will need whatever the next major device will be (I don't think glasses are going to do it!) before AI truly changes their life. There may be some major barriers to agentic AI that stall progress for a year or two... e.g. payments without a human in the loop.

2/. A lot of AI thinking centres on 'the first to achieve AGI/ASI wins everything'. I think there will be 3-5 companies achieving AGI at around the same time, with varying strengths. As they deploy widely and independently/beyond human understanding, on the way to ASI, I think they will monitor each other's alignment and will be competitive to some extent (which may even slow overall progress).

Think about the effects of competition right now and how they are slowing progress... I can't easily take my personal context from one AI model to another. I can't connect all my data together or even get hold of most of it in a useful way. For AI to achieve the sort of global influence (misaligned/subversive) many talk about, it's going to need access to all our data. At the moment I can't even get my history from Google Maps, or a list of everything I've watched on Netflix! GDPR is going to have to give us live access to all our online data before AI personal assistants become truly useful (banks, shops, medical records, etc.). I can't see hundreds of competing companies opening up live APIs to their customers without putting up a big fight. It's simply not in their interest to enable customers, let alone their AI assistants, to easily compare them with competitors. Markets are not as free as we might think. Governments (democratic ones at least!) are unlikely to have any significant regulatory influence on timescales shorter than 2-5 years.

3/. If/when alignment with humans becomes an issue, I suspect that AI infrastructure will be very vulnerable... it's going to rely on power supplies and fibre-optic cables in the ground/under the sea. A significant effort from a world power or a highly organised and well-resourced group might disrupt and delay progress, or even be able to take down a key competitor.

I very much believe we have already created the embryo for the next species to take over the earth/universe, and am sure progress will be rapid. However, I like to believe that we will ensure reasonable alignment over the next 20 years or so. I think history shows that truly global implementation needs the shifted mindset of the following generation.

Anatol Wegner, PhD's avatar

It is quite hilarious to call this an (improved!) model. This is just tea leaf reading using plots of made-up scaling laws in a desperate attempt to keep the AGI hype alive.

The whole foundation is the METR timescale paper, which is little more than a corporate data science story built on cherry-picked data (see my review: https://aichats.substack.com/p/are-ai-time-horizon-doubling-every?r=4tn68o ).

Not to mention the claim that current models can somehow magically be scaled into 'Automated Researchers,' which will then accelerate AI into ASI cloud-cuckoo-land. It's a sci-fi fantasy masquerading as a forecast.

Malte's avatar

The assumption that economic incentives will naturally steer us away from catastrophe feels like the same market fundamentalism that gave us climate change. The real question is whether any decentralized system can handle coordination problems where local incentives point toward global disaster - we've never solved one of these before. What institutional designs could actually force coordination when every actor benefits from pushing further?

Spreadlove5683's avatar

I skimmed, so forgive if I'm missing something, but how does this cohere with the fact that many researchers at frontier labs say that AI is already writing 100% of their code?

David J Higgs's avatar

See the parts where they say that self-reports of AI productivity uplift are unreliable. And regardless, no one at the frontier labs is saying the AI does 100% of their job, or increases their overall coding productivity by 5x. It's a great tool, but the total uplift towards making better AI is clearly well under 1.5x, probably ~1.2x, maybe 1.1 or 1.3.

Spreadlove5683's avatar

Oh okay, thanks! I figured the distinction between an automated coder and a superhuman AI researcher in the timelines graph meant AI researchers would still be guiding the automated coder during the automated-coder phase.