junfanz1/README.md
Junfan Zhu 👋

LinkedIn · X · Email · GitHub · Instagram · Facebook · Douban · Zhihu · WeChat · Resume

🤗 Founder & Principal Curator of Saturday Robotics, Silicon Valley’s high-signal Robotics & World Models community, connecting frontier researchers, founders, and investors across embodied intelligence.

Physical AI researcher working on World-Action Models, sim-to-real transfer, and cross-embodiment policy learning.

Building evaluation-centric embodied AI systems spanning world models, agentic reasoning, and real-world deployment.

Master’s in CS from Georgia Tech and in Mathematics from UChicago, with part-time study at Stanford GSB. Previously a Machine Learning Quant Researcher in Chicago.

A long-term thinker, resilient collaborator, and builder of high-impact AI systems.

X: https://x.com/junfanzhu98

GitHub (1.6k⭐️): https://github.com/junfanz1/

📄 Publications

  • Agents Last Exam (ALE): Benchmarking Long-Horizon AI Agents [NeurIPS 2026] · Contributed to the large-scale AI evaluation infrastructure for a benchmark of 1K+ tasks led by 300+ domain experts.

  • 🚗 IEDD: An Interactive Enhanced Driving Dataset for Autonomous Driving [Scientific Data 2026]
    🌲 As autonomous driving evolves toward vision-language-action (VLA) models, sparse interactive scenarios and weak multimodal alignment remain critical bottlenecks. Existing datasets are heavily biased toward straight-line cruising and severely under-represent long-tail interactive events (cut-in, merging, pedestrian crossing, head-on avoidance). IEDD introduces a physics-aware, interaction-dense dataset (plus the IEDD-VQA multimodal extension) mined from 7.31M ego-centric scenes across Waymo, nuPlan, Lyft, INTERACTION, and SIND, with 91% multi-agent interactions, dual Intensity–Efficiency metrics, pixel-level BEV-video alignment, rule-based hallucination-free language, and hierarchical L1–L4 VLM benchmarking. 🌍 It lays a scalable, causality-grounded foundation for evolving general-purpose VLMs into capable autonomous driving experts. 🤗 HuggingFace, LinkedIn, X.

  • 📊 QuantEval: A Benchmark for Financial Quantitative Tasks in Large Language Models [ACL 2026]
    🧪 Evaluation and domain knowledge are the core bottlenecks of Quant + AI. Without strong, expert-level verifiers, models cannot be reliably assessed on multi-step strategy generation, risk control, or real-world trading effectiveness. QuantEval addresses this gap with a reproducible benchmark framework that moves beyond static question answering toward evaluation grounded in realistic trading details. It is an initial exploration of evaluating financial “World Models.” 🌍

🏆 Awards

Professional Services

  • Invited Reviewer, ACM Conf (AI Agentic Systems), 2026. Nominated by committee for research contributions.
  • Program-Committee-Equivalent Curator, Saturday Robotics Reading Club, a top Bay Area robotics ecosystem.

🚀 AI Engineering Portfolio

My portfolio covers MoE and attention mechanisms for scalable LLMs, reflective multi-agent orchestration, and full-stack GenAI applications.

My favorite project integrates Generative AI, Humanoid Robotics (RLHF), and the Low-Altitude Economy.

🛠️ Tech Stack

Python, PyTorch, NumPy, Pandas, Scikit-learn, LangChain, LangGraph, Pydantic, CUDA, R, MATLAB, Java, C++, JavaScript, Solidity, Django, Flask, Node.js, SQLite, PostgreSQL, MySQL, MongoDB, Redis, React, HTML5, CSS3, Docker, Kubernetes, AWS, Azure, Linux, Postman, Git, Vercel

🌏 Fun Facts

Pinned

  1. Awesome-AI-Review Public

    Awesome AI industry & research review

    608 stars · 112 forks

  2. GRPO Public

    Search-R1 fine-tunes LLMs to decide when to search and when to answer using reinforcement learning over multi-step trajectories. It employs Group Relative Policy Optimization (GRPO) for stable toke…

    Python · 3 stars

  3. MCP-MultiServer-Interoperable-Agent2Agent-LangGraph-AI-System Public

    This project demonstrates a decoupled real-time agent architecture that connects LangGraph agents to remote tools served by custom MCP (Model Context Protocol) servers. The architecture enables a…

    Python · 26 stars · 5 forks

  4. MoE-Mixture-of-Experts-in-PyTorch Public

    Implementations of a Mixture-of-Experts (MoE) architecture designed for research on large language models (LLMs) and scalable neural network designs. One implementation targets a **single-device/NP…

    Python · 71 stars · 7 forks

  5. LangGraph-Reflection-Researcher Public

    The LangGraph project implements a "Reflection Agent" designed to iteratively refine answers to user queries using a Large Language Model (LLM) and web search. It simulates a research process where…

    Jupyter Notebook · 6 stars

  6. MiniGPT-and-DeepSeek-MLA-Multi-Head-Latent-Attention Public

    An efficient and scalable attention module designed to reduce memory usage and improve inference speed in large language models. Designed and implemented the Multi-Head Latent Attention (MLA) modul…

    Python · 22 stars · 2 forks