Technical Program Manager, Site Reliability Engineering

Google Inc. Kirkland, WA 98034

Posted 2 days ago

Google's projects, like our users, span the globe and require managers to keep the big picture in focus while being able to dive into the unique engineering challenges we face daily. As a Technical Program Manager at Google, you lead complex, multi-disciplinary engineering projects using your engineering expertise. You plan requirements with internal customers and usher projects through the entire project lifecycle. This includes managing project schedules, identifying risks and clearly communicating them to project stakeholders. You're equally at home explaining your team's analyses and recommendations to executives as you are discussing the technical trade-offs in product development with engineers.

Using your technical and leadership expertise, you run well-defined, engineering-focused projects under supervision.

The Technical Program Manager (TPM) role within Site Reliability Engineering (SRE) is at the heart of fulfilling SRE's mission: making things faster, more reliable, and preparing for the continued growth of Google's infrastructure. As a TPM, you ensure that systems and services are carefully planned and deployed, taking into account multiple variables such as price, availability and scheduling, while always keeping the bigger picture in mind. You are comfortable driving massive projects which span many teams, have a strong interest in doing the right thing for our users and always think critically and strategically about Google as a business. You are equally at home explaining your analyses and project recommendations to wider audiences as you would be discussing the technical merits of next generation architectures with Google's engineers, or building tools to automate and scale their impact.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We're always on call to keep our networks up and running, ensuring our users have the best and fastest experience possible.

Minimum qualifications:

  • Bachelor's degree in Computer Science, a related field or equivalent practical experience.

  • Experience in Program Management.

  • 5 years of experience in Unix/Linux systems programming with C, C++, Java, Python, Shell and/or Perl.

  • Experience working with code and storage and operating systems.

Preferred qualifications:

  • Experience with the design and architecture of software to improve availability, scalability, latency and efficiency.

  • Experience analyzing global scale distributed systems and critical production service environments.

  • Ability to take initiative, adapt quickly to changing priorities and work with a high sense of urgency with high attention to detail. Ability to interact with technical and non-technical teams.

  • Excellent interpersonal, presentation and communication skills. Effective problem-solving skills.

Responsibilities:

  • Coordinate with stakeholders to manage, track and control project challenges and ensure timely delivery of products.

  • Identify and/or analyze challenges relating to mission critical services and manage the building of automation tools/processes to prevent recurrence.

  • Engage in service capacity planning and forecasting, software performance analysis and system tuning.

  • Exercise technical judgment to keep goals for programs, projects and products attainable within a given timeline.

  • Influence and manage the creation of new designs, architectures, standards and methods for large-scale distributed systems.

Similar Jobs

Engineering Manager Site Reliability Engineering

Google Inc.

Posted 2 days ago

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to users' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation.

On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.

SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

To learn more, check out our books on Site Reliability Engineering (https://landing.google.com/sre/book.html), watch a recorded Hangout on Air to meet some of our SREs, or read a career profile about why a Software Engineer chose to join SRE.

As an Engineering Manager, you'll lead a team and be responsible for products globally, providing technical leadership to key projects and empowering and developing teams to do the same.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We're always on call to keep our networks up and running, ensuring our users have the best and fastest experience possible.

Minimum qualifications:

  • Experience in software development in one or more of the following: C, C++, Java, Go and/or Perl, Python, Ruby.

  • Experience managing an engineering team on projects with technical deep-dives into code, networking, operating systems and/or storage.

Preferred qualifications:

  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience.

  • Proficiency working with algorithms, data structures and production troubleshooting.

  • Expertise in problem solving and analyzing global scale distributed systems.

Responsibilities:

  • Lead a team of Software/Systems Engineers on projects for users and be directly responsible for uptime.

  • Own end-to-end availability and performance of key services and build automation to prevent problem recurrence. Automate response to all non-exceptional service conditions.

  • Lead by example, mentor the team and establish credibility through quality technical execution.

  • Manage on-call rotations across continents, using a follow-the-sun model.

  • Design, write and deliver software to improve the availability, scalability, latency and efficiency of Google's services.

Google Inc. Kirkland, WA
