As the world's leading provider of cloud-based software and technology solutions delivered by managed service providers (MSPs), Datto believes there is no limit to what small and medium businesses can achieve with the right technology. Datto offers Unified Continuity, Networking, and Business Management solutions and has created a one-of-a-kind ecosystem of MSP partners. These partners provide Datto solutions to over one million businesses across the globe. Since its founding in 2007, Datto has won awards each year for its rapid growth, product excellence, and superior technical support, and for fostering an outstanding workplace. With headquarters in Norwalk, Connecticut, Datto has global offices in the United Kingdom, Netherlands, Denmark, Germany, Canada, Australia, China, and Singapore. Learn more at datto.com.
We're looking for a motivated, self-starting Sr. Site Reliability Engineer to help pioneer this role at Datto. The Sr. Site Reliability Engineer joins our Core Products Team, which maintains and develops new features for all of Datto's backup appliances (~75K devices and growing quickly). The backup device is a physical or virtual appliance that takes block-level backups of Windows, Mac, and Linux machines, turns them into raw disk images, and stores them on a local ZFS-based disk array. In the case of a disaster, our customers restore these backups/disk images instantly as KVM-based virtual machines, iSCSI targets, Samba shares, and many other formats. We also offer a virtual VMware/Hyper-V-based appliance and integrate with those hypervisors. We write code in modern Symfony-based PHP (with some Python and C++ sprinkled in), and we rely heavily on our Ubuntu-based Linux stack. We do amazing and exciting things every day, such as detecting when a VM has booted successfully, injecting drivers into the Windows registry before boot, and generating vmdk files on the fly. On top of that, we work with many low-level technologies, such as hypervisors and the ZFS filesystem. This is not your average PHP webdev gig! You will report to the Sr. Director of Software Engineering.
Does This Describe You:
You're a technical expert!
A Look Inside the Job:
Collaborate with Product and Software Development teams to determine the Core products' reliability strategy, including Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
Guide product reliability improvement through monitoring, alerting, and application of software development best practices
Collect SLI metrics and establish monitoring based on SLO thresholds and other product requirements
Establish and configure transaction volume, traffic, performance, and error rate monitoring including alert thresholds, capacity planning, and performance impact analysis
Participate in SRE software engineering, writing code to continually reduce human intervention in operational tasks and to automate processes
Troubleshoot complex issues quickly and effectively
Develop a balanced on-call program with appropriate staffing
Communicate with Users, Support, and Development teams in the event of an incident
Diagnose and develop root cause solutions for failures and performance issues in our production environment
Bachelor's degree in Computer Science or equivalent experience
Strong root cause analysis and troubleshooting competency
Experience working with automation and data-driven analysis
Experience with OOP languages such as Java, PHP, C#, or C++
Solid understanding of Object-Oriented Programming fundamentals
By submitting an application, you acknowledge we will process your data in order to consider you for the position you apply for and for other open positions within our company for which you may be suited. We collect and store your data in accordance with our Recruiting Privacy Practices.
Datto is an equal opportunity employer.