C the Signs

DevOps

Senior MLOps Engineer

United States (posted today)

We’re hiring a Senior MLOps Engineer with deep machine learning engineering experience to build and operate the production platform powering ML/LLM-driven healthcare workflows. You’ll design reliable, secure, and compliant systems for model development, evaluation, deployment, monitoring, and continuous improvement—working closely with ML, data, security, and product teams.

Location: United States

Responsibilities

Design and operate ML platforms that support end-to-end workflows: data ingestion, feature engineering, training, evaluation, deployment, and monitoring.
Build and maintain CI/CD for ML (testing, packaging, versioning, reproducibility, automated rollbacks, approvals).
Implement MLOps best practices: model registry, experiment tracking, lineage, governance, and reproducible training environments.
Develop scalable training infrastructure (distributed training, GPU scheduling, cost controls, auto-scaling).
Create and maintain feature pipelines / feature stores, ensuring consistency between training and inference (training-serving skew prevention).
Establish model monitoring and observability: performance, drift, bias/fairness signals (where relevant), latency, throughput, and data quality.
Build and own end-to-end LLM delivery pipelines: prompt versioning, retrieval, orchestration, evaluation, deployment, monitoring, and iterative improvement.
Create robust LLM evaluation harnesses (offline + online): golden datasets, automated regression testing, human-in-the-loop review workflows, and risk scoring.
Build LLM cost controls: token and cost budgeting, caching strategies, autoscaling, and performance tuning.

Requirements

6+ years in software/platform engineering, including 4+ years operating ML systems in production (or equivalent depth).
Strong experience in ML engineering: training pipelines, evaluation, deployment patterns, monitoring, and iteration loops.
Strong engineering skills in Python, plus production-grade experience building APIs/services.
Demonstrated hands-on experience running LLM systems in production.
Strong experience with GCP services and cloud-native patterns.
Experience with Vertex AI (pipelines, endpoints, feature store, model registry, evaluation) and/or managed vector search on GCP.
Experience with containerization and orchestration (Docker, Kubernetes/GKE and/or Cloud Run).

Benefits

Competitive salary and benefits package.
Flexible working arrangements (remote or hybrid options available).
The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.
