Senior Reliability Engineer - Remote Opportunity

Kohl's Corp. Menomonee Falls, WI 53051

Posted 3 weeks ago

A Full Stack Reliability Engineer at Kohl's is an engineer with deep knowledge of systems, software engineering, and the associated automation, tooling, and processes. They possess a breadth and depth of knowledge that allows them to iteratively improve the operability, observability, reliability, scalability, and performance of systems in order to reduce operational overhead, reduce risk, and simplify the ecosystem. They drive operational excellence across Kohl's by enabling Balanced Product Teams and other Partner Teams to up-level the health of their services in production, improve reliability, and self-serve and run their own services through strong partnerships and continuous collaboration.

JOB RESPONSIBILITIES

  • Follows the software lifecycle, driving reliability, observability, and efficiency across product teams within their domain

  • Identifies repeated toil and finds opportunities for automation and risk reduction

  • Serves on an on-call rotation to respond to production incidents, and conducts blameless retros and root-cause analyses (RCAs) to drive a culture of continuous improvement

  • Proactively identifies failures before they become outages, using chaos engineering techniques that exercise edge cases, failure modes, and disaster recovery (DR)

  • Advises on capacity planning and provides continuous assessments of system behavior and consumption, working toward optimization and cost savings

  • Works with product managers to identify and prioritize tech debt for reliability best practices (e.g. SLIs/SLOs/Error Budgets)

  • Mentors and assists engineers on the team

QUALIFICATIONS

REQUIRED

  • Strong programming skills in one or more languages: Java, Python, Go or Node.js
  • In-depth knowledge of application design patterns, event-driven architecture, database schemas, and testing strategies

  • Bachelor's Degree or equivalent in MIS, Computer Science or related field

  • 4+ years of experience in software development

  • Experience with large scale application troubleshooting and performance tuning

  • Experience working with one of the major cloud platforms (GCP, AWS, or Azure)

  • Experience with one or more observability platforms: Prometheus, InfluxDB, Grafana, ELK or APM

PREFERRED

  • In-depth knowledge and experience with continuous integration, continuous deployment, and test driven development

  • Experience with at least one PaaS or container platform: OpenShift, Cloud Foundry, Kubernetes or equivalent
  • Experience with one or more configuration management systems such as Chef, Ansible, or Puppet

  • In-depth understanding of systems architecture, UNIX internals, networking topologies, multi-cluster applications, multi-tenant platforms, and systems/network security

