Site Reliability Engineer

Solekai Systems Corp, Cleveland, OH 44114

Posted 2 months ago

Are you ready to step up to the New and take your technology expertise to the next level?

Join Accenture and help transform leading organizations and communities around the world. The sheer scale of our capabilities and client engagements and the way we collaborate, operate and deliver value provides an unparalleled opportunity to grow and advance. Choose Accenture and make delivering innovative work part of your extraordinary career.

People in our Client Delivery & Operations career track drive delivery and capability excellence through the design, development and/or delivery of a solution, service, capability or offering. They grow into delivery-focused roles, and can progress within their current role, laterally or upward.

As part of our practice, you will lead technology innovation for our clients through robust delivery of world-class solutions. You will build better software better! There will never be a typical day, and that's why people love it here. The opportunities to make a difference within exciting client initiatives are unlimited in the ever-changing technology landscape. You will be part of a growing network of highly collaborative technology experts taking on today's biggest, most complex business challenges. We will nurture your talent in an inclusive culture that values diversity. Come grow your career in technology at Accenture!

The Performance Engineering practice within Accenture Technology is focused on optimizing the performance and scalability of enterprise applications through the combination of:

  • Testing: Analyzing, planning and executing production-like simulations across mobile and web solutions to identify & remediate performance problems, prevent production outages, and guarantee predictable performance.

  • Diagnostics & Monitoring: Instrumenting the complete application architecture to capture real user and system performance data, provide insight into the root cause of application bottlenecks, and enable real-time visibility to reduce risk exposure.

  • Performance Analytics: Measuring the relationship between end-to-end performance, user behavior, and business goals to maximize the digital business, improve business KPIs, and increase client retention.

  • Business Optimization: Empowering digital businesses with contextual intelligence to visualize, quantify and maximize the business value of performance to improve the quality & performance of the business, increase customer satisfaction, and protect brand reputation.

As a Site Reliability Engineer, some of your key responsibilities may include:

  • Maintain responsibility for the design, deployment, and maintenance of production-scale systems.

  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.

  • Use automation to streamline the provisioning, management, and monitoring of applications and services using multiple scripting languages, Java, and infrastructure-as-code.

  • Facilitate blameless Incident Retrospectives to understand root causes, communicate learnings, determine remediation and make us better and closer as a team.

  • Coordinate with development and platform teams to design and implement zero-downtime deployment approaches, real-time logging, alerting, and monitoring solutions, and code instrumentation.

  • Introduce chaos engineering concepts that promote experimentation in production to identify systemic weaknesses while increasing service resiliency.

  • Coordinate with the solution architect to design a highly available solution that meets availability and reliability objectives and to reduce manual activities using automation, when feasible.

  • Identify, evaluate, and recommend monitoring tools and diagnostic techniques relevant to the application architecture. Assess gaps in existing monitoring tool capabilities and recommend tools to augment or replace them.

  • Instrument applications to enable performance diagnostics and monitoring.

  • Collaborate with developers to promote reliability engineering during all phases of the SDLC so that performance issues are detected and corrected earlier in the lifecycle.

  • Monitor application performance during performance tests or production usage with APM and other monitoring tools to isolate the fault domain, dive deep into application code, and identify the root cause of performance issues.

  • Interact with client and/or Accenture development, operations, and infrastructure resources to recommend solutions that remediate performance issues.

  • Participate in re-architecture, redesign, and refactoring decisions to satisfy performance requirements.

  • Develop dashboards and reports to provide ongoing visibility into the performance of client applications.

  • Contribute learnings and experiences to the Accenture Performance Engineering community.

  • Travel 80-100% of the time, typically Monday through Thursday.

Similar Jobs

Senior Site Reliability Operations Specialist (ECommerce)

Sherwin-Williams

Posted 1 week ago

Founded in 1866, The Sherwin-Williams Company is a global leader in the manufacture, development, distribution, and sale of paints, coatings and related products to professional, industrial, commercial, and retail customers. The company manufactures products under well-known brands such as Sherwin-Williams®, Valspar®, HGTV HOME® by Sherwin-Williams, Dutch Boy®, Krylon®, Minwax®, Thompson's® Water Seal®, Cabot® and many more. Sherwin-Williams® branded products are sold exclusively through a chain of more than 4,100 company-operated stores and facilities, while the company's other brands are sold through leading mass merchandisers, home centers, independent paint dealers, hardware stores, automotive retailers, and industrial distributors. The company supplies a broad range of highly-engineered industrial and OEM coatings for wood and general industrial, coil, packaging, protective and marine, and transportation applications worldwide. Our 60,000 employees are diverse, innovative and passionate. With a variety of rewarding and challenging opportunities, Sherwin-Williams is a great place to find a career that takes you places.

The E-Commerce Senior Site Reliability Operations Specialist position focuses on detection, remediation and prevention of incidents, ensuring maximum availability and reliability for our users and customers. The position requires strong business process and technology knowledge coupled with an operational excellence mindset that supports a best-in-class customer experience.

The role is responsible for working with the business and technical teams to identify applications, features and integrations that should be monitored, creating monitoring dashboards, and generating reports to increase visibility for KPIs. The specialist will develop and enforce service level agreements (SLAs) with key stakeholders; define and enforce critical incident response processes, including handling communication to business and technical stakeholders; define problem management processes to prevent future incidents by prioritizing and completing root cause analysis (RCA); ensure non-critical production issues are routed and triaged to the appropriate teams; and train on new features released to customers to understand site functionality. This is an individual contributor position.

Essential Functions

Incident Management

  • Perform initial incident management triage and ticket assignment to the appropriate team. Define and enforce critical incident response processes across the IT E-Business COE.

  • Define and enforce service level agreements between the provider and the customer that define incident priorities, escalation paths, and response/resolution time frames.

  • Front line communication of high and critical incidents to key stakeholders.

  • Categorization of incident types for better data gathering and problem management.

  • Ensure incident closure and documentation.

  • Interact with customer-facing teams to address questions and problems.

Problem Management

  • Define problem management processes to prevent future incidents by prioritizing and completing root cause analysis (RCA).

  • Work with business and IT staff to understand the impact and priority of the problem.

  • Oversee plan development and execution for problem resolution.

  • Ensure progress on problems being addressed.

  • Proactively work with engineers to identify and remediate single points of failure.

Monitoring and Reporting

  • Identify applications, features, functions and integrations that should be monitored.

  • Partner with technical teams to ensure identified items are monitored.

  • Creation and oversight of monitoring dashboard.

  • KPI and incident report generation for increased visibility.

  • Collaborate to ensure alerting thresholds are in place and relevant.

Incidental Functions

  • Work in a hybrid waterfall / agile development environment.

  • Conduct research into new technologies, including tools, components, and frameworks.

  • Perform task management and reporting as necessary.

  • Provide tier 2, on-call support for critical deployment problems and issues.

  • Assist with other projects as may be required to contribute to efficiency and effectiveness of the work.

  • Work outside the standard 7.5-hour office workday as required.

  • Coordinate and drive disaster recovery activities as needed.

  • Up to 10% travel is required.

Position Requirements

Formal Education & Certification

  • Bachelor's degree or foreign equivalent in a related field or equivalent experience.

Knowledge & Experience

  • 5 years IT experience.

  • 5 years IT operational support experience.

  • 5 years experience in customer service related work.

  • 2 years hands-on experience working with incident management.

  • A proven track record working with incident management tools and concepts.

  • Experience using agile project management tools (such as Rally or JIRA).

  • Working knowledge of Microsoft Office Suite.

  • Experience in developing operational metrics and data.

Preferred Qualifications and Skills

  • Experience with Agile and Waterfall development and release practices.

  • Experience with application monitoring software.

  • Experience with IT KPI reporting.

  • Experience influencing and negotiating in a professional environment.

  • Ability to chair, facilitate and lead meetings.

Personal Attributes

  • Strong written and oral communications skills.

  • Proven ability and initiative to learn and research new concepts, ideas, and technologies quickly.

  • Strong systems/process orientation with demonstrated analytical thinking, organization skills and problem solving skills.

  • Ability to work in a team-oriented, collaborative environment.

  • Ability to quickly pick up new tools and technologies.

  • Willingness and ability to train and teach others.

  • Ability to facilitate meetings and follow up with resulting action items.

  • Ability to prioritize and execute tasks in a high-pressure environment.

  • Strong presentation and interpersonal skills.

  • Ability to work effectively in a multi-cultural environment, and to lead and influence cross-organizationally with and without direct authority.

  • Ability to effectively move forward on tasks even with ambiguous or changing requirements.

Must be legally authorized to work in country of employment without sponsorship for employment visa status now or in the future.

Equal Opportunity Employer. All qualified candidates will receive consideration for employment and will not be discriminated against based on race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, age, pregnancy, genetic information, creed, citizenship status, marital status, or any other consideration prohibited by law or contract. VEVRAA Federal Contractor requesting priority referral of protected veterans.

Sherwin-Williams, Cleveland, OH
