Sr. Site Reliability Engineer
Weston, MA 02493
Senior Site Reliability Engineer
Our client is looking for a Senior Site Reliability Engineer to join their team in Weston, MA! The Site Reliability Engineer (SRE) will work with other members of the SRE team, supporting software engineers in building highly reliable, high-performing infrastructure. Typical projects will include developing automated solutions for operational concerns such as capacity planning, performance, and site reliability. This position will also function as a hands-on technical lead, mentoring other team members and evangelizing SRE practices with other groups.
Responsibilities
- Hands-on design, analysis, and troubleshooting of highly distributed, large-scale production systems
- Ownership of reliability, uptime, capacity, and performance, including the analysis of each
- Ensuring the repeatability, traceability, and transparency of our infrastructure automation
- Identifying highest-impact opportunities to optimize existing systems
- System design consulting for teams seeking to leverage or improve their production infrastructure
- Anticipating, building, and planning capacity for upcoming product/feature launches
- Practicing sustainable incident response and blameless postmortems
Qualifications
- 8+ years of experience
- Bachelor's degree in Computer Science or a related technical field involving systems engineering (e.g., Physics or Mathematics), or equivalent practical experience.
- Experience in one or more of the following: C, C++, Java, Python, Go, Perl, Ruby, shell scripting, YAML, JSON.
- Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) or networking (e.g., TCP/IP, routing, network topologies and hardware, SDN). Experience with AWS services (e.g., CloudFormation, CloudWatch, EKS, Landing Zone, Administration), GCP services (e.g., Dataflow, Pub/Sub, BigQuery, Bigtable).
- Expertise in designing, analyzing and troubleshooting large-scale distributed systems.
- Experience with Terraform
- Ability to debug and optimize code and to automate routine tasks.
- Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive
- Strong problem solving, root cause analysis and systems engineering skills
- Good presentation and communication skills