
Senior LLM Training Framework Engineer

NVIDIA · Shanghai, China · Workday

Apply →
Score: 69 · S: 10 · R: 20 · D: 30 · C: 9
First seen: 2026-01-29 · Last seen: 2026-02-10

Your contacts at NVIDIA (5)

Demo Contact · Your Connection (×5)

Why You're a Fit

  • LLMs & Generative AI (PyTorch, CUDA, GPU inference) · Technical skills · "Prior experience with Generative AI techniques applied to LLM and Multi-Modal learning (Text, Image, and Video)"

  • Hands-on with LLMs, PyTorch, CUDA, GPU inference · Senior Director, Generative AI, Teradata (2023-2025) · "Address extensive AI training and inference obstacles, covering the entire model lifecycle including orchestration, data pre-processing, conducting mo..."

  • Cloud platforms (AWS, Azure, hybrid) · Technical skills · ", complete pipelines for AI training and inference on CSPs like AWS, Azure, GCP, or OCI)"

  • Led Generative AI Architecture · Senior Director, Generative AI, Teradata (2023-2025) · "Spearhead advancements in model architectures, distributed training strategies, and model parallel approaches"

  • StableDiffusion, Flux, image generation models · Senior Director, Generative AI, Teradata (2023-2025) · "...lt for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining and post-training..."

  • Saved $150M OPEX through automation initiatives · Track record · "Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations"

  • Led Generative AI Architecture at enterprise scale · Track record · "Spearhead advancements in model architectures, distributed training strategies, and model parallel approaches"

Job Description

NVIDIA is now looking for LLM Training Framework Engineers for the Megatron Core team. Megatron Core is an open-source, scalable, cloud-native framework built for researchers and developers working on Large Language Model (LLM) and Multimodal (MM) foundation model pretraining and post-training. Our GenAI frameworks provide end-to-end model training, including pretraining, alignment, customization, evaluation, deployment, and tooling to optimize performance and user experience. You will build on Megatron Core's capabilities by inventing advanced distributed training algorithms and model optimizations, and collaborate with partners to implement optimized solutions.
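
For orientation, here is a minimal sketch of how a training job typically brings up Megatron Core's parallel state before building a model. It assumes a multi-GPU launch with torchrun; the tensor-parallel size of 2, the helper name init_parallel, and the script name are illustrative choices rather than details from the posting, and the exact parallel_state arguments should be checked against the Megatron Core release in use.

    # Minimal sketch: initialize torch.distributed and Megatron Core parallel state.
    # Assumed launch: torchrun --nproc_per_node=2 init_parallel_demo.py  (illustrative)
    import os

    import torch
    import torch.distributed as dist
    from megatron.core import parallel_state

    def init_parallel(tp_size: int = 2, pp_size: int = 1) -> None:
        dist.init_process_group(backend="nccl")                # one process per GPU
        torch.cuda.set_device(int(os.environ.get("LOCAL_RANK", "0")))
        parallel_state.initialize_model_parallel(
            tensor_model_parallel_size=tp_size,
            pipeline_model_parallel_size=pp_size,
        )
        print(
            f"rank={dist.get_rank()} "
            f"tp_rank={parallel_state.get_tensor_model_parallel_rank()} "
            f"pp_rank={parallel_state.get_pipeline_model_parallel_rank()}"
        )

    if __name__ == "__main__":
        init_parallel()
        parallel_state.destroy_model_parallel()
        dist.destroy_process_group()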

What you’ll be doing:

  • Build and develop the open-source Megatron Core.

  • Address extensive AI training and inference obstacles, covering the entire model lifecycle including orchestration, data pre-processing, conducting model training and tuning, and deploying models.

  • Work at the intersection of AI applications, libraries, frameworks, and the entire software stack.

  • Spearhead advancements in model architectures, distributed training strategies, and model parallel approaches.

  • Accelerate foundation model training and optimization through mixed-precision recipes and advanced NVIDIA GPU architectures (a short mixed-precision sketch follows this list).

  • Tune and optimize the performance of deep learning frameworks and software components.

  • Research, prototype, and develop robust and scalable AI tools and pipelines.
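
As referenced in the mixed-precision bullet above, the sketch below shows the standard PyTorch automatic-mixed-precision training step: forward pass and loss in reduced precision with fp32 master weights and a gradient scaler. This is generic PyTorch rather than Megatron Core code, and the toy model, batch sizes, and learning rate are placeholders.

    # Minimal sketch: a mixed-precision training step with torch.autocast + GradScaler.
    # The model, batch, and hyperparameters are illustrative placeholders.
    import torch
    from torch import nn

    device = "cuda" if torch.cuda.is_available() else "cpu"
    amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))  # loss scaling for fp16

    def train_step(x: torch.Tensor, y: torch.Tensor) -> float:
        opt.zero_grad(set_to_none=True)
        # Run forward/loss in reduced precision while optimizer states stay in fp32.
        with torch.autocast(device_type=device, dtype=amp_dtype):
            loss = nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()   # scale loss so fp16 gradients do not underflow
        scaler.step(opt)                # unscales gradients; skips the step on inf/nan
        scaler.update()
        return loss.item()

    x = torch.randn(8, 1024, device=device)
    print(train_step(x, torch.randn(8, 1024, device=device)))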

What we need to see:

  • MS, PhD, or equivalent experience in Computer Science, AI, Applied Math, or related fields, and 5+ years of industry experience.

  • Experience with AI training frameworks (e.g., PyTorch, JAX), and/or inference and deployment environments (e.g., TensorRT-LLM, vLLM, SGLang).

  • Proficiency in distributed training.

  • Proficient in Python programming, software development, debugging, performance analysis, writing tests, and documentation.

  • CUDA or collective-communication programming skills are a big plus (a short all-reduce sketch follows this list).

  • Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.

  • Strong understanding of AI/Deep-Learning fundamentals and their practical applications.
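
The all-reduce sketch referenced above shows the collective-communication pattern behind data-parallel training: each rank averages its local gradients with every other rank. It uses plain torch.distributed with the gloo backend so it also runs on CPU-only machines (NCCL is the usual choice on GPUs); the tensor values and script name are illustrative.

    # Minimal sketch: gradient averaging via all-reduce with torch.distributed.
    # Assumed launch: torchrun --nproc_per_node=2 allreduce_demo.py  (illustrative)
    import torch
    import torch.distributed as dist

    def main() -> None:
        dist.init_process_group(backend="gloo")     # gloo: works without GPUs
        rank, world = dist.get_rank(), dist.get_world_size()

        # Stand-in for the local gradient shard computed on this rank.
        grad = torch.full((4,), float(rank))

        # Sum across ranks, then divide: the data-parallel gradient average.
        dist.all_reduce(grad, op=dist.ReduceOp.SUM)
        grad /= world
        print(f"rank {rank}: averaged grad = {grad.tolist()}")

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()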

Ways to stand out from the crowd:

  • Proficiency in large-scale AI training and knowledge of compute-system concepts such as latency and efficiency.

  • Expertise in distributed computing, model parallelism, and mixed precision training (a short column-parallel sketch follows this list).

  • Prior experience with Generative AI techniques applied to LLM and Multi-Modal learning (Text, Image, and Video).

  • Knowledge of GPU/CPU architecture and related numerical software.

  • Familiarity with cloud computing (e.g., complete pipelines for AI training and inference on CSPs like AWS, Azure, GCP, or OCI).
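
To make the model-parallelism bullet concrete, the sketch below walks through the arithmetic of a column-parallel linear layer: the weight matrix is split column-wise across ranks, each rank computes a slice of the output, and concatenating the slices reproduces the dense matmul. It simulates two ranks in a single process for clarity; the tensor sizes are arbitrary, and real implementations (Megatron Core included) place the shards on separate GPUs and gather the slices with collectives.

    # Minimal sketch: column-parallel linear layer, simulated on one process.
    # Each "rank" owns a column shard of W; concatenating shard outputs
    # reproduces the full x @ W. Sizes and the two simulated ranks are arbitrary.
    import torch

    torch.manual_seed(0)
    ranks = 2
    x = torch.randn(8, 1024)            # activations, replicated on every rank
    w = torch.randn(1024, 4096)         # full weight, kept only to check the result

    shards = torch.chunk(w, ranks, dim=1)       # column shards, one per simulated rank
    partial = [x @ shard for shard in shards]   # each rank's slice of the output
    y_parallel = torch.cat(partial, dim=1)      # stands in for an all-gather across ranks

    assert torch.allclose(y_parallel, x @ w, atol=1e-5)
    print("column-parallel output matches the dense matmul")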

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working with us. If you're creative and autonomous, we want to hear from you! NVIDIA encourages diversity and is proud to be an equal opportunity employer that values applicants of all backgrounds and characteristics.

Apply for this role →