Why Cloud Providers and LLM Providers Are Developing Custom Chips for Generative AI Workloads
As demand for generative AI grows, cloud service providers such as Microsoft, Google, and AWS, along with large language model (LLM) providers such as OpenAI, have all either built or reportedly considered building their own custom chips for AI workloads.
Recent headlines have been dominated by speculation that some of these companies, specifically OpenAI and Microsoft, are working to develop their own custom chips, partly in response to chip shortages. OpenAI is said to be considering acquiring a firm to advance its chip-design plans, while Microsoft is reportedly collaborating with AMD on a custom chip named Athena.
Google and AWS have already developed their own chips for AI workloads. Google has its Tensor Processing Units (TPUs), while AWS has its Trainium and Inferentia chips.
The Cost and Efficiency Factors
So, why are these companies investing in custom chips? According to experts, the driving factors are the cost of processing generative AI queries and the efficiency of currently available chips, particularly graphics processing units (GPUs). Nvidia’s A100 and H100 GPUs currently dominate the AI chip market.
“GPUs are probably not the most efficient processor for generative AI workloads, and custom silicon might help their cause,” said Nina Turner, research manager at IDC.
Custom silicon can potentially reduce power consumption, improve compute interconnect or memory access, and ultimately lower the cost of queries. Turner suggests that companies like Microsoft and OpenAI could benefit from using custom silicon.
According to a report, OpenAI spends approximately $694,444 per day or 36 cents per query to operate ChatGPT. This high cost further incentivizes companies to find more efficient solutions for AI workloads.
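As a quick sanity check on those reported figures (treating both numbers as given, since the report's methodology is not public), the daily cost and per-query cost together imply a query volume and an annualized spend:

```python
# Back-of-the-envelope check of the reported ChatGPT operating costs.
# Both inputs are the figures cited in the report, not independently verified.
DAILY_COST_USD = 694_444       # reported daily operating cost
COST_PER_QUERY_USD = 0.36      # reported cost per query

queries_per_day = DAILY_COST_USD / COST_PER_QUERY_USD
annual_cost_usd = DAILY_COST_USD * 365

print(f"Implied queries/day: {queries_per_day:,.0f}")   # ~1.9 million
print(f"Implied annual cost: ${annual_cost_usd:,.0f}")  # ~$253 million
```

At roughly 1.9 million queries a day, even a modest per-query saving from more efficient silicon would compound into tens of millions of dollars a year, which is the economic logic behind the custom-chip push.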
Advantages of Custom Silicon
In addition to cost reduction, custom silicon offers other advantages. It provides control over access to chips and enables the design of elements specifically for LLMs to improve query speed.
Turner emphasizes that “AI workloads don’t exclusively require GPUs.” While GPUs excel in parallel processing, there are other architectures and accelerators better suited for AI-based operations.
The Apple Analogy
Experts draw parallels between the move to design custom silicon and Apple’s strategy of producing chips for its devices. Just as Apple switched from general-purpose processors to custom silicon to enhance device performance, generative AI service providers aim to specialize their chip architecture.
“If you really want to make things scream, you need a chip optimized for that particular function such as image processing or specialized generative AI,” explained Glenn O’Donnell, research director at Forrester. Custom chips could be the answer to achieving optimal performance in specific areas.
The Challenges of Developing Custom Chips
While the potential benefits of custom chips are clear, the development process poses significant challenges for companies:
- High investment
- Long design and development lifecycle
- Complex supply chain issues
- Talent scarcity
- Lack of understanding of the process
Gaurav Gupta, vice president and analyst at Gartner, highlights these impediments. According to him, companies starting from scratch may take at least two to two and a half years to develop custom chips. The scarcity of chip designing talent further contributes to delays.
Large technology companies often acquire startups or partner with companies that have expertise in chip design. For example, AWS acquired Israeli startup Annapurna Labs to develop custom chips, while Google collaborates with Broadcom for its AI chips.
Chip Shortage and the Motivation Behind Custom Chip Development
Regarding OpenAI’s rumored desire to acquire a startup for custom chip development, experts believe chip shortages might not be the primary concern. Instead, the focus could be on supporting inference workloads for LLMs.
“They have some requirement nobody is serving,” suggests Omdia principal analyst Alexander Harrowell, pointing to CEO Sam Altman’s comments about improving GPT-4 rather than simply scaling it further. Serving inference for an LLM at production scale demands far more compute than running a conventional predictive model.
Acquiring a large chip designer may not be financially viable for OpenAI. The cost of designing and producing chips could amount to approximately $100 million.
Nina Turner proposes an alternative approach. OpenAI could consider acquiring startups specializing in AI accelerators, such as Groq, Esperanto Technologies, Tenstorrent, and Neureality. SambaNova could also be a potential acquisition target if OpenAI decides to move away from Nvidia GPUs and adopt an on-premises approach.
Editor Notes – GPT News Room
The growing demand for generative AI has prompted cloud providers and LLM providers to explore developing custom chips. This strategic move aims to optimize AI workloads, reduce costs, and improve performance in specific areas. While developing custom chips poses significant challenges, the potential benefits make it a worthwhile endeavor. OpenAI’s rumored plans to acquire a startup for chip development reflect its focus on improving inference workloads and supporting the future growth of generative AI. To stay updated on the latest AI developments, visit GPT News Room.
Copyright © 2023 IDG Communications, Inc.