Sr Site Reliability Engineer

Alaska Airlines Seattle, WA 98113

Posted 3 days ago


We're creating an airline people love. It begins with each Alaska Airlines employee, bringing unique strengths and energy to our work in the air and on the ground. Every day, we go beyond what's expected and reach for the remarkable, together.


Role Summary
The e-commerce department of Alaska Airlines is seeking an experienced Senior Site Reliability Engineer (Sr. SRE) to be responsible for the reliability, resiliency, and performance of the technology systems supporting our multibillion-dollar, multi-channel e-commerce business. This role is part of a functional team that owns Tier 2 and Tier 3 support for all e-commerce systems, including customer mobile apps, loyalty systems, and our back-end tier of large-scale, distributed, and highly available services. The position is highly technical and balances engineering operations with software development to enable rapid product development.

The ideal candidate will have hands-on coding and scripting experience in infrastructure automation and in instrumenting health monitors. They can build creative engineering solutions to operational problems and understand the big picture of how systems relate to each other. They will eliminate manual work through automation and partner with our Development teams to ensure that services are designed and delivered to be mission critical, with a focus on security, resiliency, scale, and performance. They are familiar with DevOps culture and work to spread it within their own team and to others. They understand agile development values and practices, including small, iterative, frequent, and continuous delivery of value.

Scope & Complexity
Under limited supervision, this individual contributor supports and develops tools and processes that ensure the resiliency and performance of the systems and enhance the proactive monitoring, automation, and overall system health of Alaska Airlines' e-commerce business.

Key Duties

  • Guides and trains agile engineering teams to optimize service quality and ensure adoption of reliability best practices.

  • Introduces and evaluates cutting-edge software tools that push our core tech stack forward and improve the reliability and stability of our site.

  • Leads the team in collecting metrics, crunching data, building dashboards, and improving service monitoring to detect problems before customers are impacted.

  • Drives a continuous improvement mindset with the team, embracing a DevOps culture by automating everything possible and constantly finding ways to make our systems more reliable.

  • Understands, experiments with, and adopts emerging industry practices in the systems operations space.

  • Practices, coaches, and evangelizes reliability best practices.

  • Works with product teams to establish SLAs around performance that can then be integrated into our monitoring/alerting solutions.

  • Automates existing manual processes and provides more self-service functionality to the Tier 2 team.

  • Develops engineering solutions to repetitive failures and other problems that adversely affect production systems.

  • Practices agile principles to organize and deliver work.

  • Brings modern delivery practices to legacy systems.

  • Enables software development teams to continuously push their code to production.

  • Helps build container-based software delivery to production.

Job-Specific Experience, Education & Skills

  • A minimum of 5 years of hands-on software development experience.

  • A minimum of 3 years of Reliability Engineering experience.

  • Experience with Git.

  • Proficiency in infrastructure scripting and configuration automation tools (Chef/Bamboo/Jenkins).

  • Experience with Microsoft Azure/AWS.

  • Expertise in monitoring tools (AppDynamics/App Insights/Sumo Logic/etc.).

  • Expertise in incident and problem management including timely problem identification, successful resolution, and root-cause analysis.

  • Strong verbal and written communication skills to communicate technology concepts and practices.

  • Experience working in a high-scale, high-traffic, 24/7 environment.

  • High school diploma or equivalent is required.

  • Minimum age of 18.

  • Must be authorized to work in the U.S.


  • A Bachelor of Arts or a Bachelor of Science degree, with a focus in computer science or similar technical field, is strongly preferred.

  • Experience with test-driven development (TDD), unit testing, pair/mob programming and other Extreme Programming (XP) techniques.

  • Expertise with modern design principles, such as the development and utilization of cloud APIs, single-page web apps, hybrid mobile development, and SOLID principles.

  • Experience in Agile/Lean development methodologies.

  • Experience in the airline and/or hospitality industry.

Job-Specific Leadership Expectations
Embody our values to own safety, do the right thing, be kind-hearted, deliver performance, and be remarkable.




Submit your application by 11/29/2018 11:59pm (Pacific Time). We'll be happy to see it.


Horizon Air and Alaska Airlines are equal opportunity employers. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, gender expression, national origin, age, protected veteran or disabled status, or genetic information.

Horizon Air and Alaska Airlines will consider for employment qualified applicants with arrest and conviction records in accordance with applicable Federal, State, and local laws.

Horizon Air and Alaska Airlines participate in E-Verify, a service of the Department of Homeland Security (DHS) and Social Security Administration (SSA), where required.

  • Job ID: 32498

  • Location: Seattle, WA

  • Full/Part Time: Full-Time

  • Regular/Temporary: Regular
