
By Chase
September 25, 2025

# Beyond the Hype: Rethinking Emergent Abilities in LLMs


One of the most captivating narratives in the world of Large Language Models (LLMs) has been the concept of **emergent abilities**. These are the seemingly magical capabilities—like multi-step arithmetic, chain-of-thought reasoning, or code generation—that appear to suddenly switch on when a model crosses a certain size threshold. For years, the prevailing wisdom was that scaling up models didn’t just lead to incremental improvements; it triggered a **phase transition**, unlocking entirely new, unpredictable skills.

This idea has been a powerful driver of the race to build ever-larger models. The logic was simple: if we just add more parameters and more data, who knows what new abilities might emerge next? However, a growing body of research is now challenging this foundational belief, suggesting that what we’ve been calling “emergence” might be less about a true leap in a model’s latent capabilities and more about a mirage created by our own evaluation methods.

---

### The Alluring Idea of Emergence

First, let’s be clear about what made the emergence hypothesis so compelling. When researchers plotted model performance against scale (e.g., number of parameters) on specific complex tasks, the graphs often showed a striking pattern. Performance would hover near zero for smaller models, and then, at a critical scale, it would shoot up dramatically, far exceeding random chance.

This wasn’t a smooth, linear progression. It looked like a switch being flipped. This observation led to the exciting conclusion that quantitative increases in scale could produce qualitative leaps in intelligence. It painted a picture of AI development as a process of discovery, where we build larger models to see what new, surprising competencies they possess.


### The Mirage in the Metrics

The new perspective argues that this sharp “emergence” curve is an artifact of **non-linear metrics**. Many of our benchmarks, particularly those designed to test complex reasoning, rely on a binary “correct” or “incorrect” evaluation. A model either gets the final answer to a multi-step math problem right, or it gets it wrong. There’s no partial credit.

Consider this analogy: imagine testing students on a complex physics problem. A student’s underlying understanding of physics might be improving gradually and linearly as they study. However, on a pass/fail exam, their score remains at 0% until their understanding crosses the specific threshold needed to solve that one problem correctly, at which point their score jumps to 100%. From the metric’s perspective, the ability “emerged” suddenly. But in reality, the student’s competence was growing all along.

This is what researchers now believe is happening with LLMs. As a model scales, its ability to assign a higher probability to the correct sequence of tokens (the “thought process”) improves smoothly and predictably. For a long time, this improvement isn’t enough to get the final answer right consistently. But once the model’s internal probability for the correct answer crosses a critical threshold, our non-linear, all-or-nothing benchmark suddenly registers a sharp spike in performance. The ability was always developing; our tools were just not sensitive enough to measure it until it became overwhelmingly obvious.
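This thresholding effect is easy to reproduce with a toy simulation. In the sketch below, the growth curve and answer length are invented for illustration (not real measurements): per-token accuracy improves smoothly with scale, yet an all-or-nothing exact-match metric over a 20-token answer sits near zero and then shoots up.

```python
import math

def per_token_accuracy(scale: float) -> float:
    # Hypothetical smooth improvement: per-token correctness rises
    # steadily with log-scale. A toy assumption, not fitted to data.
    return min(1.0, 0.5 + 0.05 * math.log10(scale))

ANSWER_LEN = 20  # tokens in a multi-step answer; all must be right

for scale in [1e6, 1e7, 1e8, 1e9, 1e10]:
    p = per_token_accuracy(scale)
    exact_match = p ** ANSWER_LEN  # all-or-nothing metric
    print(f"scale={scale:8.0e}  per-token={p:.2f}  exact-match={exact_match:.4f}")
```

The per-token column climbs linearly while the exact-match column barely moves until the underlying skill is nearly perfect, at which point it spikes: the "emergence" curve, generated from a perfectly smooth underlying trend.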

By switching to metrics that grant partial credit or measure the model’s per-token probability of generating the correct answer, recent studies show a much smoother, more predictable improvement curve as models scale. The sharp, “magical” jump disappears, replaced by a steady, linear progression.
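As a minimal sketch of what such a metric can look like, here is a position-wise partial-credit scorer (my own stand-in for illustration; published studies typically use per-token log-likelihood or edit distance). It gives credit for a mostly-correct reasoning chain that exact match would score as zero.

```python
def partial_credit(pred: str, gold: str) -> float:
    """Fraction of whitespace-separated tokens that match position-wise."""
    p, g = pred.split(), gold.split()
    matches = sum(a == b for a, b in zip(p, g))
    return matches / max(len(p), len(g))

gold = "17 + 25 = 42"
pred = "17 + 25 = 41"  # reasoning right, final token wrong

exact = float(pred == gold)           # all-or-nothing: 0.0
partial = partial_credit(pred, gold)  # 4 of 5 tokens match: 0.8
```

Scored this way across model sizes, a model whose reasoning is steadily improving registers steady gains instead of a flat zero followed by a cliff.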

### From Magic to Methodical Engineering

So, does this mean scaling is a dead end? Absolutely not. In fact, this new understanding is arguably better news for the field of AI engineering.

If emergent abilities were truly unpredictable, then building next-generation models would be a high-stakes gamble. We would be pouring billions of dollars into scaling efforts with no real guarantee of what capabilities might—or might not—materialize.

The “metric mirage” hypothesis replaces this alchemy with a more robust science. It suggests that the benefits of scale are **predictable and reliable**. We can be more confident that a larger, better-trained model will be incrementally better at a wide range of tasks. This shifts our focus from “hoping for magic” to methodical engineering. The challenge is no longer about blindly scaling and hoping for a breakthrough. Instead, it becomes about:

1. **Developing better evaluation frameworks:** We need more nuanced, continuous metrics that can accurately track a model’s latent capabilities as they develop.
2. **Improving architectural and training efficiency:** If progress is predictable, then every gain in efficiency directly translates to more capability for a given amount of compute.

---

### Conclusion

The narrative of emergent abilities was a powerful and inspiring chapter in the story of AI. While the “magic” of sudden, unpredictable leaps may have been an illusion, the reality is far more empowering for building reliable AI systems. The progress we’re seeing is not a series of happy accidents but the result of predictable improvements driven by scale. By understanding the mirage in our metrics, we can move forward with a clearer, more engineering-driven discipline, focusing on building the rigorous tools and techniques necessary to measure and guide the steady, remarkable ascent of AI capability.

This post is based on the original article at https://techcrunch.com/2025/09/21/vcs-are-still-hiring-mbas-but-firms-are-starting-to-need-other-experience-more/.
