Site Reliability Engineer
Boston, MA 02108
12-month contract-to-hire
Our Opportunity:
Site Reliability Engineers are a cross between system and software engineers who are responsible for all operational aspects of our client's ecommerce platform. The team is responsible for designing, building, monitoring, and maintaining the infrastructure of our internet-facing and internal services. We're looking for engineers who want to be a part of developing infrastructure software, maintaining it, and scaling the client's technology stack.
Ideal candidates will possess the ability to discuss complex technical concepts with a diverse audience across all areas of the organization. They will remain calm under pressure and always strive to add structure to high-pressure, fast-paced tasks or projects.
What you'll do:
- Focus on service stability and reliability by working with application owners to set SLOs, error budgets, and backup and disaster recovery (DR) strategies
- Define application monitoring and alerting strategy
- Perform capacity planning and production readiness assessment
- Embed with product teams during the design and requirements phase of new product development through to initial production launch
- Identify requirements for other operational teams (release engineering, automation, etc.) during application development phase
- Be a technology and DevOps evangelist for the rest of the company
- Participate in on-call rotation for level 3 support escalations
What you'll need:
- At least 5 years of experience working in an SRE role or similar.
- Hands-on experience with orchestration and system configuration tools such as Ansible, Puppet, Chef, Terraform, etc.
- Expertise in building and maintaining highly available applications, including redundancy, failover, scalability, monitoring, and performance.
- Strong experience with virtualization, monitoring and automation.
- Software development experience (both scripting and programming languages).
- Experience working with the open source community (troubleshooting, patch submission, etc.).
- At least 5 years of demonstrated Linux system administration experience.
- Experience with CI tools such as Bamboo, Jenkins, or Hudson.
- Ability to organize, troubleshoot and continuously learn.
- Previous experience working within compliance controls such as SOX, PCI, etc.
- This position requires travel.