Zihan Wang is an AI researcher at Northwestern University, where he works on vision-language models, robotics, and reinforcement learning. Previously, he interned at DeepSeek, contributing to projects like DeepSeek-V2.
Zihan's homepage: https://zihanwang314.github.io/
(00:00) - Introduction
(01:13) - Zihan's Background, CS and AI Research in China
(11:09) - DeepSeek, Human Capital Flow from PRC to US
(16:07) - DeepSeek, Open Source and AI Research
(31:52) - Model Size and Performance Constraints
(33:01) - Data Bottleneck in Pre-trained Models
(34:12) - Transformer Architecture and Scaling Laws
(36:30) - Efficiency in Model Training
(47:44) - Chain of Experts Architecture
(01:01:06) - Future of AI and Robotics
Audio-only version and transcript:
https://www.manifold1.com/episodes/robots-small-models-and-rl-with-deepseek-alumnus-zihan-wang-86
Fantastic discussion, especially the SLM/COE part.
It’s always very interesting what clever people come up with when they are faced with constraints in technology, resources, etc.