Quantitative Development - DevOps Engineer
Job Function Summary
The Central Liquidity Strategies (CLS) business manages a number of portfolios and products designed to optimize the firm’s trading and execution approach by providing internal liquidity solutions for portfolio managers on both a risk and agency basis.
We are seeking a highly driven, results-oriented, and opinionated DevOps leader with experience in managing research infrastructure, deploying critical applications, and working with large amounts of data to create battle-tested infrastructure and improve the research development experience.
Principal Responsibilities
- Leadership: the candidate will design and implement infrastructure, and advise on and enforce best practices to maximize research and development velocity
- Research: the candidate is expected to keep up with the state-of-the-art tools used in the field and continuously evaluate which tools and practices are best for our use cases
- Machine Learning Operations (MLOps) + Development Experience (DevEx):
  - Create dependable and reproducible polyglot (Python, native extensions, CUDA) environments for rapidly iterating research projects that can be easily deployed to production
  - Enforce best practices and packaging standards for large research codebases
  - Use cloud infrastructure to help scale research jobs
- Infrastructure automation:
  - Develop CI/CD pipelines for research processes and live trading apps
  - Develop robust monitoring solutions for infrastructure and deployed applications
  - Automate recurring jobs with tools like Airflow/Prefect
- Performance engineering:
  - Be familiar with best practices for profiling and performance monitoring to assist with performance investigations
  - Develop solutions that empower researchers and developers to understand the performance of their code
Qualifications/Skills Required
- Experience: 7+ years of experience with research-focused DevOps (HPC, ML research, quant research) and experience with high-availability production deployments
- Strong communication skills and ability to work with many stakeholders in a team environment
- Leadership skills: ability to work with constraints, make decisions under time pressure, and own your work
- Development skills: experience writing clean, robust, and testable code for automating processes pertaining to infrastructure management and deployment
- Systems knowledge:
  - Familiarity with Linux internals
  - Understanding of package management and how software is deployed on systems
- Python:
  - Strong understanding of Python internals
  - Familiarity with the latest standards and tools in the packaging ecosystem (such as uv) and build tools like hatchling
- Familiarity with tools like Nix, Conda, Pixi, Kubernetes, and containers