xAI Announces Grok-1.5
· 2024-04-01

xAI says its new Grok-1.5 model holds its own on many benchmarks against Claude, the model family built by Anthropic, the AI startup seen as OpenAI's chief rival.

xAI

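After open-sourcing its large language model Grok-1 on March 17, xAI, the company founded by Elon Musk, announced Grok-1.5 on March 28. The new version supports a context length 16 times longer, is expected to be released this week, and will become the underlying model for the Grok chatbot on X.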

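Although less than two weeks have passed since the release of Grok-1, xAI claims that Grok-1.5 makes significant progress, closing in on or surpassing Claude 3 Sonnet and Claude 2 on many benchmarks. For example, Grok-1.5 scores 81.3% on MMLU (Massive Multitask Language Understanding), ahead of Claude 2 at 75% and Claude 3 Sonnet at 79%. On the MATH benchmark it scores 50.6%, also ahead of Claude 3 Sonnet at 40.5%. On GSM8K, which covers grade-school math, it scores 90%, ahead of Claude 2 at 88% and close to Claude 3 Sonnet at 92.3%. On the HumanEval coding benchmark it scores 74.1%, above Claude 2 at 70% and Claude 3 Sonnet at 73%.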

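Interestingly, while most large language model benchmark comparisons use OpenAI's GPT as the reference point, Musk, who recently sued OpenAI, may have deliberately skipped GPT and chosen instead Anthropic's Claude, lately regarded as the model with the best chance of challenging GPT. However, the versions xAI compared against are Claude 2.0, which Anthropic released in July 2023, and Claude 3 Sonnet, the mid-tier model of the Claude 3 family, rather than the top-tier Claude 3 Opus.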

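Beyond clearly outperforming Grok-1 on the benchmarks above, Grok-1.5 also supports a context of 128K tokens, giving it 16 times the memory capacity of the previous version and making it better at handling long documents.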

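xAI explains that Grok-1.5 is built on a custom distributed training framework based on JAX, Rust, and Kubernetes. This training stack lets the team prototype ideas with minimal effort and train new architectures at scale. Its custom training orchestrator automatically detects problematic nodes and ejects them from the training job. The team also optimized checkpointing, data loading, and training job restarts to minimize downtime when failures occur.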
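
As a rough illustration of the fault handling described above, the following Python sketch simulates an orchestrator that ejects a failed node and resumes from the last checkpoint. It is not xAI's code, which has not been published; `Node`, `Orchestrator`, `checkpoint_every`, and the simulated `failures` map are all hypothetical names chosen for this example, and a real system would rely on actual health probes and persisted checkpoints.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True


@dataclass
class Orchestrator:
    """Toy training orchestrator (hypothetical, for illustration only)."""
    nodes: list
    checkpoint_every: int = 2
    last_checkpoint: int = 0
    ejected: list = field(default_factory=list)

    def eject_unhealthy(self) -> bool:
        """Remove nodes marked unhealthy; return True if any were removed."""
        bad = [n for n in self.nodes if not n.healthy]
        self.nodes = [n for n in self.nodes if n.healthy]
        self.ejected.extend(n.name for n in bad)
        return bool(bad)

    def run(self, total_steps: int, failures: dict) -> int:
        """Run the simulated job.

        `failures` maps a step number to the name of the node that
        breaks while that step is being attempted (simulation only).
        """
        pending = dict(failures)
        step = self.last_checkpoint
        while step < total_steps:
            failing = pending.pop(step + 1, None)
            for node in self.nodes:
                if node.name == failing:
                    node.healthy = False
            if self.eject_unhealthy():
                if not self.nodes:
                    raise RuntimeError("no healthy nodes left")
                # Restart the job from the last saved checkpoint.
                step = self.last_checkpoint
                continue
            step += 1
            if step % self.checkpoint_every == 0:
                self.last_checkpoint = step
        return step


orch = Orchestrator([Node("a"), Node("b"), Node("c")])
final_step = orch.run(total_steps=6, failures={4: "b"})
```

In this run, node `b` fails while step 4 is being attempted, so the job rolls back to the checkpoint at step 2 and continues on the two remaining nodes. Checkpointing more often shortens the rollback at the cost of more checkpoint writes, which is presumably the trade-off that the checkpoint and restart optimizations aim to ease.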

Musk also boasted on X that Grok-2.0, currently in training, will surpass existing AI on every benchmark, though he did not disclose when Grok-2.0 will launch.
