Mapping the $85B AI Processor Landscape: Global Startup Surge, Market Consolidation Coming?

By Dr. Jon Peddie on
November 3, 2025

AI processor (AIP) startups have more than doubled since 2018: 138 companies now build dedicated AI silicon across 18 countries. The market is worth $85B today, driven mainly by inference (cloud and local) and edge deployments (from wearables to PCs).

Of the 138 AIP suppliers, 64% are privately held, and most were founded within the last seven years. As might be expected, the companies focus on cloud/local inference and the edge; training remains capital-intensive.

However, the startup wave has crested: the peak formation year was 2018, by which point 54% of the startups had already appeared. Interestingly, the rise in startups began before Nvidia stunned the industry with its explosion in sales. One might assume that Nvidia's success was the honey that attracted the ants, but a good number (54%) of the startups got going before that. Since 2022, the sector has averaged three acquisitions per year.

An AIP is a chip optimized to run neural-network workloads quickly and efficiently by performing huge amounts of tensor math while moving data as little as possible. Offerings span GPUs, NPUs, CIM/PIM, neuromorphic processors, and matrix/tensor engines. (CPUs and FPGAs are excluded from market totals due to their functional generality.) Common patterns include tensor/matrix engines, near-compute SRAM plus HBM/DDR, NoC fabric, and PCIe/CXL/NVLink/Ethernet off-chip links.

The combination of LLM inference at scale, edge AI proliferation, and memory-bound workloads is reshaping silicon roadmaps.

Figure 1. Population of AIP suppliers. Source: Jon Peddie Research

Of the five major market segments, Inference (cloud and local) and Edge (wearables to PCs) are the areas where the companies are putting most of their efforts.

Figure 2. Market segments by companies. Source: Jon Peddie Research

For the privately held companies, has the window closed? The peak in startup formation came in 2018, five years before Nvidia's sales exploded; by then, 54% of the startups had already appeared.

Figure 3. The rate at which AI processor startups were founded. Source: Jon Peddie Research

Since 2022, there has been an average of three acquisitions every year.

AIPs are offered as GPUs, NPUs, CIMs, neuromorphic processors, CPUs, and even FPGAs. Our report does not include CPUs and FPGAs in its evaluation of the market because their generality makes them impossible to differentiate by function.

Figure 4. AIP distribution. Source: Jon Peddie Research

What an AIP looks like inside

  • Compute blocks: wide SIMD/SIMT cores (GPU style), tensor/matrix engines (systolic arrays), vector units, activation units.
  • Memory hierarchy: small, fast SRAM near compute; larger HBM/DDR off-chip; caches or scratchpads; prefetchers/DMA.
  • Interconnects: on-chip NoC; off-chip links (PCIe/CXL/NVLink/Ethernet).
  • Control: command processors, schedulers, and microcode for kernels/collectives.
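
To make the memory-hierarchy point above concrete, here is a minimal Python sketch, not taken from the report, of how a tensor engine might tile a matrix multiply so that each operand block fits in near-compute SRAM. The `tiled_matmul` helper and the tile size are illustrative assumptions, not any vendor's API:

```python
import numpy as np

def tiled_matmul(a, b, tile=4):
    """Multiply a @ b one block at a time, the way an AIP's tensor engine
    streams small tiles through near-compute SRAM instead of touching
    off-chip memory for every element."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n), dtype=a.dtype)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            # 'acc' stands in for an on-chip accumulator register file.
            acc = np.zeros((min(tile, m - i), min(tile, n - j)), dtype=a.dtype)
            for p in range(0, k, tile):
                # Load one tile of each operand into "SRAM" and multiply.
                acc += a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
            out[i:i + tile, j:j + tile] = acc
    return out
```

Real designs overlap the tile loads with compute via DMA engines and double buffering; the loop nest above only shows the data-reuse pattern that makes the small SRAM worthwhile.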

The world of AI processors spans cloud services, data center chips, embedded IP, and neuromorphic hardware. Founders and engineers address the gaps that CPUs and GPUs can't fill: managing memory, maintaining high utilization with small batches, meeting latency goals on strict power budgets, and providing consistent throughput at scale. Companies develop products along two main dimensions: the type of workload—training, inference, or sensor-level signal processing—and the deployment tier, from hyperscale data centers to battery-powered devices.

Most technical work centers on memory and execution control. Compute-in-memory and analog techniques reduce data transfers by performing calculations within memory arrays and keeping partial sums nearby. Wafer-scale chips store activations in local SRAM and stream weights for long sequences. Reconfigurable fabrics alter data flow and tiling at compile time to optimize utilization across multiple layers. Training chips emphasize interconnect bandwidth and collective communication, while inference chips prioritize batch-one latency, key-value caching for transformers, and power efficiency at the edge.
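
The key-value caching mentioned above can be sketched in a few lines. This single-head, NumPy-only toy (the `attend` and `generate_step` names are ours, not any library's API) shows why batch-one decoding is memory-bound: each new token reuses every previously cached key/value vector rather than recomputing them:

```python
import numpy as np

def attend(q, k_cache, v_cache):
    # Attend over every cached position with a numerically stable softmax.
    keys = np.stack(k_cache)                 # (t, d)
    vals = np.stack(v_cache)                 # (t, d)
    scores = keys @ q / np.sqrt(len(q))      # (t,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ vals                    # (d,)

def generate_step(q, k, v, k_cache, v_cache):
    # Append this step's key/value, then attend; earlier K/V come
    # straight from the cache instead of being recomputed.
    k_cache.append(k)
    v_cache.append(v)
    return attend(q, k_cache, v_cache)

# Toy decode loop: three steps, model dimension 4.
rng = np.random.default_rng(0)
k_cache, v_cache = [], []
for _ in range(3):
    q, k, v = rng.standard_normal((3, 4))
    out = generate_step(q, k, v, k_cache, v_cache)
```

The cache grows linearly with sequence length, which is why inference silicon budgets so much SRAM/HBM bandwidth for it.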

Figure 5. A typical AIP. Source: Jon Peddie Research

Adoption depends on go-to-market strategies and ecosystem backing. Cloud providers incorporate accelerators into managed services and model-serving frameworks. IP vendors collaborate with handset, auto, and industrial SoC teams, offering toolchains, models, and density roadmaps. Edge specialists release SDKs that compress models, quantize to INT8 or lower, and map operators onto sparse or analog units while achieving accuracy targets. Neuromorphic groups publish compilers for spiking networks and emphasize energy efficiency and latency on event streams. Refinements in compilers, kernel sets, and observability tools often outweigh peak TOPS.
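
As a rough illustration of the INT8 quantization step such SDKs perform, here is a sketch of symmetric per-tensor quantization in NumPy. Production toolchains add per-channel scales, calibration data, and zero-points; this is the simplest possible variant, and the function names are our own:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale maps floats onto [-127, 127].
    scale = np.abs(w).max() / 127.0          # assumes w is not all zeros
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover an approximation of the original floats.
    return q.astype(np.float32) * scale
```

The round-trip error is bounded by half a quantization step, which is why the "achieving accuracy targets" part of the job is mostly about choosing scales well, not about the arithmetic itself.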

Competition varies by tier. Training silicon focuses on cost per model trained considering network, memory, and compiler constraints. Inference silicon targets cost per token or frame within latency limits, using cache management and quantization as tools. Edge devices compete on milliwatts per inference and toolchain portability. IP vendors compete on tape-out time, PPA goals, and verification support. Research projects balance speed to market against experiments that may alter the trade-offs between memory, compute, and communication.

Throughout this process, teams customize designs to meet specific needs, such as attention depth, parameter count, activation size, sparsity, and precision policies. When companies synchronize silicon, compiler, and deployment tools, they reduce integration costs and speed up the transition from models to high throughput. Customers then have multiple options: expand in the cloud, scale up with wafer-scale systems, embed NPUs in SoCs, or accelerate compute at sensors using analog and neuromorphic chips.

Disclaimer: The author is solely responsible for the content of this article. The opinions expressed are their own and do not represent the position of IEEE, the IEEE Computer Society, or its leadership.
