We are looking for a GPU Performance Engineer to build highly optimized CUDA kernels for low-latency inference. This role is focused on workloads where off-the-shelf runtimes and vendor libraries do not fully exploit the structure of the model, and where custom kernels, memory layouts, and execution strategies can deliver meaningful gains.

You will work closely with quantitative researchers and engineers to understand model structure, identify computational bottlenecks, and turn mathematical ideas into production-grade GPU implementations. You will use your understanding of GPU hardware to help shape models that are both mathematically effective and efficient to run. The problems span compact neural networks, tree-based models, and other structured inference workloads where latency, throughput, and efficiency all matter.

This role is a strong fit for someone who enjoys low-level optimization, performance analysis, and translating abstract models into hardware-efficient code.

What you'll do

- Design, implement, and optimize custom CUDA kernels for latency-critical inference workloads
- Develop fine-grained GPU implementations tailored to specific model structures
- Analyze quantitative research models and computational bottlenecks to identify opportunities for parallelization and hardware-efficient execution
- Collaborate directly with quantitative researchers to translate mathematical models into high-performance compute pipelines
- Optimize end-to-end inference performance through kernel tuning, memory-layout design, execution strategy, I/O optimization, and precision tradeoffs
- Profile and benchmark GPU performance
- Improve latency and throughput in production inference systems
- Contribute to GPU architecture decisions and performance best practices

Qualifications

- Strong proficiency in writing and optimizing CUDA kernels
- Solid programming experience in C/C++ (preferred)
- Deep understanding of GPU architecture, including memory hierarchy, SIMT execution, occupancy, and latency/throughput tradeoffs
- Ability to reason about numerical stability, precision, performance tradeoffs, and how model design choices affect hardware efficiency
- Strong problem-solving skills and comfort working with low-level systems

Preferred qualifications

- PhD in mathematics, physics, computer science, engineering, or a related quantitative field
- Strong background in linear algebra, probability, numerical methods, or scientific computing
- Experience working with quantitative research teams or financial models
- Demonstrated ability to improve real-world inference performance beyond baseline framework or library implementations
- Familiarity with PTX-level behavior, tensor core utilization, or architecture-specific tuning
- Exposure to ONNX Runtime, TensorRT, Triton, TVM, or similar systems
- Exposure to neural networks, tree-based models (e.g., LightGBM), and state space models (e.g., Mamba architectures), and experience with kernel fusion, custom operators, model compilation, or graph-level optimization

The annual base pay range for this role is $200,000 - $300,000 + discretionary bonus + benefits. Susquehanna considers factors such as scope and responsibilities of the position, work experience, education/training, key skills, as well as market and organizational considerations when extending an offer.

About Susquehanna

Susquehanna is a global quantitative trading firm powered by scientific rigor, curiosity, and innovation. Our culture is intellectually driven and highly collaborative, bringing together researchers, engineers, and traders to design and deploy impactful strategies in our systematic trading environment. To meet the unique challenges of global markets, Susquehanna applies machine learning and advanced quantitative research to vast datasets in order to uncover actionable insights and build effective strategies. By uniting deep market expertise with cutting-edge technology, we excel in solving complex problems and pushing boundaries together.

If you're a recruiting agency and want to partner with us, please reach out. Any resume or referral submitted in the absence of a signed agreement will not be eligible for an agency fee.
Job Title
GPU Performance Engineer | Experienced Hire