Site Reliability Engineer

Oath Lockport, IL 60441

Posted 3 months ago

It takes powerful technology to connect our brands and partners with an audience of 1 billion. Nearly half of Verizon Media employees are building the code and platforms that help us achieve that. Whether you're looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process 4 trillion data points a day, what you do here will have a huge impact on our business and the world. Want in?

As Verizon's media unit, our brands like Yahoo, TechCrunch and HuffPost help people stay informed and entertained, communicate and transact, while creating new ways for advertisers and partners to connect. With technologies like XR, AI, machine learning, and 5G, we're transforming media for tomorrow, too. We're creators and coders, dreamers and doers, creating what's next in content, advertising and technology.

The Service Reliability Engineering (SRE) team at Verizon Media is the power behind engineering goodness for our production systems. By designing, writing and implementing software to drive velocity, operability, reliability and performance, the SRE team ensures continuous quality on production systems as we embed deeply in the many layers and stages of software development.

We are all about: 1) Enabling a culture of ownership and excellence, 2) Engineering processes that are automated and agile, 3) Developing tools that are self-serve and (re)usable.

Our mission is to: 1) Deliver products to market quickly, 2) Prevent defects from reaching customers, 3) Repair production issues quickly. If you believe in the above, come join us.

At Verizon Media we want engineers who are self-starters and problem solvers, able to work with both new and legacy code. Using innovative ideas to solve complex issues, while integrating easily with the running ecosystem, is a key trait that will fit very well in the organization.

Responsibilities

  • Apply broad knowledge of the property to remediate complex system issues, reducing escalations to the development team.

  • Develop creative and practical solutions to non-routine problems in your own area of expertise at SRE through analysis of various property and application dependencies.

  • Auto-remediation: build tools, write Kubernetes plugins, and find new open-source tools.

  • Provide guidance and technical advice to the Ops team, and get involved as required to resolve medium- and higher-severity incidents.

  • Train, guide, and delegate work to the Operations team from a leadership position, breaking down information in a systematic and communicable manner.

  • Develop tools to automate the deployment, administration, and monitoring of a large-scale Linux environment.

  • Work with development teams to harden, enhance, document, and generally improve the operability of our systems.

  • Provide a rapid response to escalations, reducing both response time and mean time to resolution (MTTR).

  • Aggressively troubleshoot and multitask across incidents of varying difficulty and priority, ensuring that higher-priority items are addressed first.

Minimum Qualifications

  • BS in Computer Science or 5 years in a related technical field.

  • Excellent hands-on experience with Linux, Unix, or similar variants, covering both administration and internals.

  • Strong troubleshooting and problem-solving skills, including application and network-level troubleshooting ability.

  • Experience working with large-scale production deployments of thousands of servers.

  • Hands-on experience working with config management tools like Ansible, Chef or Packer.

  • Experience driving incident resolution and communicating technical aspects (during and after an incident) to both technical and non-technical audiences.

  • Ability to interpret system condition from system stats/profiles (e.g., CPU, memory, swap, disk capacity).

  • Experience programming in at least one of the following languages: C, C++, Java, Python, or Go.

Preferred Qualifications

  • Experience with build & deploy technologies such as version control, Maven, Docker, Chef.

  • Experience in administering, debugging and tuning web apps, containers and related tech.

  • Deep understanding of UNIX/Linux system internals and tools for troubleshooting application stack dumps and networking.

  • Experience in supporting cloud-based virtualization and platforms; experience with big data technologies (Apache Storm, Hadoop, HBase) is a strong plus.

Verizon Media is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category. Verizon Media is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form (https://www.verizonmedia.com/careers/contact-us.html) or call 408-336-1409. Requests and calls received for non-disability related issues, such as following up on an application, will not receive a response.

Currently work for Verizon Media? Please apply on our internal career site.

