Amazon and Cerebras Systems have announced a partnership to combine their AI chips within Amazon Web Services data centers to accelerate AI services like chatbots. The deal will see Amazon's Trainium3 chips handle the initial "prefill" stage of AI inference, while Cerebras chips manage the final "decode" stage, a strategy described as "divide and conquer."
This collaboration is a direct competitive move against Nvidia, with Amazon positioning its upcoming service as a better value. The service is expected to become available in the second half of this year.
Amazon.com and Cerebras Systems on Friday said they have reached a deal to combine the two companies' computing chips in a new service aimed at speeding up chatbots, coding tools and other artificial intelligence services.
Valued at $23.1 billion, Cerebras is a chip startup aiming to take on Nvidia by building a fundamentally different kind of AI chip that does not rely on expensive high-bandwidth memory as Nvidia's flagship chips do. Earlier this year, Cerebras signed a $10 billion deal to supply chips to ChatGPT creator OpenAI.
Under the deal announced Friday, Cerebras chips will sit inside Amazon Web Services (AWS) data centers and be linked to Amazon's own Trainium3 custom AI chips, connected with custom networking technology from Amazon.
"Every customer large or small is on AWS, from individual developers to the largest banks in the world," Cerebras CEO Andrew Feldman told Reuters, saying the deal will "make it as easy as a click to get on Cerebras."
Both companies declined to disclose the size of the deal.
Amazon and Cerebras will team up to tackle what is known as "inference," where previously trained AI systems take requests from users and spit out answers. The two companies will split that task into two steps: a "prefill" stage, where the user's request is transformed from human words into the "tokens" that AI computers use, and a "decode" stage, where the AI computer produces the answer the user is looking for.
Amazon said its Trainium3 chips will handle prefill, while Cerebras chips handle decoding, in what Feldman told Reuters is a "divide and conquer" strategy.
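The division of labor described above can be illustrated with a toy sketch. This is purely conceptual: the function names, the toy tokenizer, and the stand-in next-token step below are all hypothetical, and do not reflect any actual AWS, Trainium, or Cerebras API.

```python
# Toy illustration of a two-stage "prefill then decode" inference pipeline.
# All names here are invented for illustration; the stage split mirrors the
# one described in the article (prefill on one chip, decode on another).

def prefill(prompt: str) -> list[int]:
    """Prefill: turn the user's words into the tokens an AI model consumes.
    (In the announced service, this stage would run on Trainium3 chips.)"""
    vocab = {}  # toy tokenizer: assign one integer id per unique word
    return [vocab.setdefault(word, len(vocab)) for word in prompt.split()]

def decode(tokens: list[int], max_new: int = 3) -> list[int]:
    """Decode: generate answer tokens one at a time from the prefill output.
    (In the announced service, this stage would run on Cerebras chips.)"""
    out = list(tokens)
    for _ in range(max_new):
        out.append(sum(out) % 100)  # stand-in for a real model's next-token step
    return out

def run_inference(prompt: str) -> list[int]:
    # "Divide and conquer": hand the prefill result off to the decode stage.
    return decode(prefill(prompt))

result = run_inference("hello world hello")
```

The appeal of the split is that the two stages stress hardware differently: prefill processes the whole prompt at once, while decode emits tokens sequentially, so each stage can be assigned to the chip better suited to it.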
It is a strategy similar to the one analysts expect Nvidia to unveil next week, when it details how it plans to combine its own graphics processing unit (GPU) chips with those from Groq, a startup it spent $17 billion on in late December. In a statement, Amazon said it could not yet make a detailed comparison between its offering, which will come online in the second half of this year, and Nvidia's as-yet-unrevealed offering, but Amazon expects its service to be a better value.
"The timeline for that (Nvidia-Groq) pairing remains unclear while our Trainium3 program is just months away from running production workloads," Amazon said in response to Reuters questions. "What we can say is that we believe (Trainium3), and future (Trainium4), will continue to lead in price-performance versus merchant GPUs."