Principal Site Reliability Engineer - SRE (Remote/Flexible)

Insulet Corporation Acton, MA 01720

Posted 2 months ago

Insulet started in 2000 with an idea and a mission to enable our customers to enjoy simplicity, freedom and healthier lives through the use of our Omnipod product platform. In the last two decades we have improved the lives of hundreds of thousands of patients by using innovative technology that is wearable, waterproof, and lifestyle accommodating.

We are looking for highly motivated, performance-driven individuals to join our expanding team. We succeed by hiring amazing people, guided by shared values, who exceed customer expectations. Our continued success depends on it!

Job Title: Principal Site Reliability Engineer - SRE

Department: CTO

Manager/Supervisor: Sr. Director, Site Reliability Engineering

Position overview:

As a Principal Engineer in Site Reliability Engineering (SRE) at Insulet, you will play a critical role in architecting, implementing, and maintaining highly available and scalable infrastructure and systems. You will lead a team of site reliability engineers, driving best practices, developing a culture of automation, and ensuring the reliability of our services. This role requires a hands-on approach to solving complex technical challenges while providing technical leadership to the team.

JOB / DUTIES / RESPONSIBILITIES

  • Provide technical guidance and mentorship to the SRE team.

  • Drive the implementation of best practices in reliability, scalability, and performance.

  • Lead by example, demonstrating excellence in technical skills and problem-solving.

  • Collaborate with cross-functional teams to design scalable, resilient, and efficient systems.

  • Architect and implement infrastructure solutions that meet the requirements of high availability and performance.

  • Drive the adoption of modern technologies and tools to improve system reliability and efficiency.

  • Develop and maintain automation tools for provisioning, deployment, and monitoring.

  • Automate routine tasks to improve operational efficiency and reduce manual intervention.

  • Design and implement monitoring solutions to proactively identify issues and prevent service disruptions.

  • Lead incident response efforts, conduct post-mortem analyses, and implement measures to prevent recurrence.

  • Develop and automate runbooks and playbooks to streamline incident resolution processes.

  • Conduct capacity planning exercises to ensure systems can handle current and future loads.

  • Identify performance bottlenecks and improve system performance through targeted tuning.

  • Collaborate with development teams to design and implement scalable architectures.

  • Document system architectures, configurations, and procedures.

  • Promote knowledge sharing within the team through technical presentations, workshops, and documentation.

Skills/Experience

  • Bachelor's degree in Computer Science, Engineering, or a related field.

  • 15 years of experience in the field, including 5+ years in Site Reliability Engineering, DevOps, or a similar role.

  • Proven experience architecting and managing highly available, scalable, and fault-tolerant systems.

  • Proficiency in scripting and programming languages such as Python, Go, or similar.

  • Strong understanding of cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration technologies (e.g., Kubernetes).

  • In-depth knowledge of AWS services including VPC, Lambda, IAM, ELB, EC2, ECS, CloudWatch, API Gateway, S3, SQS, SNS, WAF, X-Ray, and Route 53.

  • Experience with infrastructure as code tools such as Terraform, Ansible, or similar.

  • Excellent troubleshooting and problem-solving skills.

  • Strong communication and leadership skills, with the ability to collaborate effectively with cross-functional teams.

  • Experience leading and mentoring engineering teams is highly desirable.

NOTE: This position is eligible for 100% remote working arrangements (may work from home/virtually 100%; may also work hybrid on-site/virtual as desired). #LI-Remote #LI-AS1

Additional Information:

The US base salary range for this full-time position is $168,900.00 - $253,800.00. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position in the primary work location in the US. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your Talent Acquisition Specialist can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

At Insulet Corporation all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

(Know Your Rights)

