Reliability Engineer

Sr Media Buyer Delray Beach, FL 33444

Posted 4 months ago

The Site Reliability Engineer is responsible for providing technical expertise, education, and tooling to ensure the highest level of reliability and availability for critical applications. This role combines software and systems engineering to deliver high-quality solutions to operations problems. You take an organic approach to our systems and can re-engineer processes. You are able to stay focused when production services are down and to work as a team when needed to bring them back up.



Responsibilities:

  • Drives automation and efficiencies to increase quality, availability, and security.
  • Partners to ensure efficient improvement in the quality, availability, and security of technical platforms. Works individually and with teams to drive reliability goals and objectives across platforms.
  • Develops solutions to increase service stability through automation and process re-engineering.
  • Analyzes service performance to improve quality and customer experience.
  • Builds and supports tools and systems that software engineers use to deploy their software into production.
  • Helps development teams operationalize their efforts to enable self-ownership of production services.
  • Evaluates and implements automation and tooling solutions so that processes are consistent and repetitive tasks are performed with greater accuracy and fewer defects.
  • Builds, implements, and advises on recovery tooling that adheres to enterprise standards and/or frameworks.
  • Introduces new and impactful technologies to the production support toolchain that minimize friction for production releases and support and speed up diagnosis of and recovery from production incidents.
  • Provides regular, timely reports, including a weekly activity report.
  • Performs other activities assigned by management.

Qualifications:

  • Bachelor’s Degree or equivalent experience.
  • 5+ years of experience in system administration or SRE.
  • Experience with AWS at a production level.
  • Strong programming and scripting background.
  • Ability to work well individually and as part of a team in a fast-paced environment, with quick learning and excellent problem-solving skills.
  • Solid experience installing and administering a mainstream Linux operating system (Amazon Linux or another Red Hat derivative preferred).
  • Experience with open-source Puppet/Ansible or another configuration management suite.
  • Ability to anticipate, identify, and resolve customer communication problems.
  • Excellent oral and written communication skills and the ability to communicate effectively with people of diverse backgrounds and at all levels of the organization.
  • Excellent attention to detail and the ability to work independently with minimal supervision.
  • IoT industry experience is desired, but not required.
  • Ability to be on-call.
