VOICE ARCHIVE

Summer Yue

@summeryue0
5 posts
2024-05-30
🚀 Instruction Following - SEAL Leaderboards are out! IF winners:
- GPT-4o and GPT-4 Turbo
- Llama 3 70B Instruct
- Mistral Large
Gemini Pro 1.5 leaps into top 3 in preference rankings, and Claude rockets to #2 in factuality. See https://scale.com/...
2024-05-30
SiliconANGLE

AI training data provider Scale AI releases SEAL Leaderboards, which use private datasets to rank LLMs in domains like coding, instruction following, and math

🚀 Coding - The first expert-evaluated SEAL Leaderboards are out! The coding race is neck and neck, winners:
- GPT-4 Turbo and GPT-4o
- Gemini Pro 1.5
- Claude 3 Opus
See https://scale.com/... for a detailed analysis of each model!
2024-05-30

🚀 Math - We released GSM1k last month. Today, we augmented it with human ratings to account for chatty yet correct responses. Explore the GSM1k leaderboard as part of SEAL Leaderboards. We were glad to see LLMs have mostly nailed grade school math!
2024-05-30

🚀 Spanish - The first expert-evaluated SEAL Leaderboards are out! Spanish is our first multilingual leaderboard (https://scale.com/...), winners:
- GPT-4o
- Gemini 1.5 Pro (post-I/O)
- GPT-4 Turbo
We plan to roll out more languages. Which ones should we build next?
2024-05-30

🚀 Introducing the SEAL Leaderboards! We rank LLMs using private datasets that can't be gamed. Vetted experts handle the ratings, and we share our methods in detail openly! Check out our leaderboards at https://scale.com/...! Which evals should we build next?
2024-05-30