Site Reliability Engineering (SRE) Manager

Censys Ann Arbor, MI 48103

Posted 7 days ago

Censys knows the internet and cloud better than anyone else. Attack Surface Management provides customers with an attacker-centric view of all externally facing internet and cloud assets to extend visibility and to prioritize and remediate the most critical risk exposures that will actually lead to a breach. Our daily IPv4 scans and the world's largest SSL/TLS Certificate database give customers the most accurate and continuously updated attack surfaces. Enterprise security teams leverage Censys to keep pace with the speed of the business and gain an advantage over rapidly evolving cyber-attack threats.

Role Summary:

Censys is looking for an SRE Manager to lead our Infrastructure & Ops team. Our engineers need a manager who will push them to self-organize and to grow technically and professionally, and who will be there to help them remove roadblocks. Managers at Censys understand that only by distributing responsibility and working to make ourselves obsolete do we actually empower our teams and gain enough time to identify new opportunities to improve the organization.

Our Infrastructure & Ops team is responsible for managing basic developer experience, strategizing with development teams on projects, operating and maintaining our cloud and applications environment, and operating our on-premise co-located data centers for our scanning and data teams.

The SRE Manager participates in the daily execution of the team, from daily standup to planning, refining, and prioritizing engineering requirements and deliverables. They seek to empower success by ruthlessly prioritizing and by engaging as a stakeholder with the other engineering teams and business partners to achieve the goals of the greater Censys product roadmap. They will build trust with stakeholders and partners through diplomacy, discussion, and follow-through. This is a broad cross-organization role with high visibility, collaborating with multiple teams. They are expected to invest in and build good relations with key partners. Their collaboration with internal customers, product engineering, and development groups is critical to success.

Qualifications:

  • 5+ years of experience supporting large-scale, distributed systems, combining hardware and cloud experience.

  • 3+ years of experience building and leading engineering teams; ideally SRE, Infrastructure, or Production engineering teams.

  • Superb interpersonal skills, capable of working with multi-functional technical and business teams and varying levels of management, influencing decision-making.

  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil by code and process improvements.

  • Experience supporting Software Engineering teams through a Platform methodology, embracing a DevSecOps culture to allow development teams to own their services end-to-end from development to production.

  • Emphasis on SRE as an engineering subject area, preferably with proficiency in software engineering and development practices.

  • Experience running or facilitating a production outage escalation and incident response program, and coordinating vulnerability and patch management with IT and Corporate Security teams.

  • Optional experience operating hybrid on-premise and cloud environments, with experience managing data center operations and bare metal environments, including high availability data center design, hardware capacity planning, L2-L3 network design and security, ISP peerings, and disaster recovery.

  • Optional experience with Kubernetes-based application environments, including maintaining and scaling clusters that contain several hundreds of nodes, and facilitating complex distributed systems and software through mechanisms such as auto-scaling.

  • Optional experience managing a GCP-based cloud architecture, including efforts around reducing unnecessary cloud spend on under-utilized or poorly optimized infrastructure.

Our target salary range for this role is between $160,000 USD and $225,000 USD, plus bonus eligibility and equity.

Our roots are in Ann Arbor, Michigan with location hubs in Seattle, the Bay Area, Washington D.C., and Dublin, Ireland. Our innovation is fueled by the team's global perspectives and diverse backgrounds.

Don't meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they feel they meet every qualification. At Censys we are dedicated to building a diverse, inclusive, and authentic workplace, so if you're excited about this role but your past experience doesn't align perfectly with every listed requirement in the job description, we encourage you to apply anyway. You may be exactly who we need to fill this role or others!

We value diversity and are committed to creating an inclusive environment for all employees. Censys is an equal opportunity employer.

