Wikimedia Foundation | Engineering

Senior Site Reliability Engineer, Infrastructure Foundations

Remote (US) | $113k–$175k | Posted today

The Wikimedia Foundation is seeking a Senior Site Reliability Engineer to support and develop the platform for Wikipedia, ensuring its reliability and scalability. The role involves working with a globally distributed, open-source, and remote-first team to maintain Wikimedia’s top-10 website.

Location: Remote (US)

Salary: $113k–$175k

Responsibilities

  • Performing day-to-day operational/DevOps tasks on Wikimedia’s public-facing infrastructure (deployment, maintenance, configuration, troubleshooting)
  • Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
  • Leading continuous improvement by automating the installation, configuration, and maintenance of services on our platform
  • Working closely with product teams to help them bring scalable functionality to our users, assisting in the architectural design of new services and making them operate at scale
  • Participating in a 24/7 on-call rotation shared across the broader SRE team. This includes taking part in incident response, diagnosis and follow-up on system outages or alerts across Wikimedia’s production infrastructure.
  • Collaborating with a global, cross-functional team in an asynchronous communication environment
  • Mentoring peers in your areas of technical and operational strength
  • Traveling 1-2 times a year for in-person events and team meetings

Requirements

  • 6+ years of experience in an SRE/Operations/DevOps role as part of a team
  • Experience with shell and any scripting languages used in an SRE context (Python, Go, Bash, Ruby; we primarily use Python) and configuration management tools (Puppet, Ansible; we use Puppet)
  • Experience designing and managing infrastructure security for large fleets of diverse services
  • Experience with technical response during security incidents
  • Experience with package management on Linux systems (we use Debian)
  • Strong Linux system-level troubleshooting skills
  • History of automating tasks and processes, identifying process gaps, and finding automation opportunities
  • Strong English language skills (verbal and written) and ability to work both independently and as an effective part of a globally distributed team working across multiple time zones
  • Experience leading and participating in incident response and post-incident review rituals, with the goal of conducting root cause analysis and implementing preventive measures

Source

himalayas
