Senior AI Infrastructure Engineer

Engineering

Boston / On-Site

Deploy and optimize high-density GPU clusters (GB300). Manage power, cooling, and InfiniBand fabrics. Obsess over reliability.

Company Description

CambridgeNexus is an AI-native compute infrastructure company specializing in GPU-powered data centers. Our high-density, low-latency infrastructure is engineered to support modern machine learning, large-scale model training, and inference.

Role Description

You will design, monitor, and maintain high-performance GPU data center infrastructure. You will also oversee troubleshooting of GPU systems, improve network efficiency, implement network security solutions, and ensure the reliability and scalability of deployed systems.

What You’ll Own

GPU cluster deployment (GB300, NVLink, InfiniBand).
Power & cooling optimization (150 kW+/rack).
Incident response & root-cause analysis.
Capacity planning and expansion.

Requirements

8+ years in data center / HPC / GPU infrastructure.
Hands-on with NVIDIA stack (CUDA, drivers, fabric).
Obsessed with reliability and performance.

SIGNAL: OUTLIER

We are constantly scanning for 10x engineers. If you don't fit a standard role description but can optimize GB300 clusters or architect low-latency fabrics, initiate contact immediately.