Site Reliability Engineer

Solekai Systems Corp, Sacramento, CA 94204

Posted 2 months ago

Are you ready to step up to the New and take your technology expertise to the next level?

Join Accenture and help transform leading organizations and communities around the world. The sheer scale of our capabilities and client engagements and the way we collaborate, operate and deliver value provides an unparalleled opportunity to grow and advance. Choose Accenture and make delivering innovative work part of your extraordinary career.

People in our Client Delivery & Operations career track drive delivery and capability excellence through the design, development and/or delivery of a solution, service, capability or offering. They grow into delivery-focused roles, and can progress within their current role, laterally or upward.

As part of our practice, you will lead technology innovation for our clients through robust delivery of world-class solutions. You will build better software better! There will never be a typical day, and that's why people love it here. The opportunities to make a difference within exciting client initiatives are unlimited in the ever-changing technology landscape. You will be part of a growing network of highly collaborative technology experts taking on today's biggest, most complex business challenges. We will nurture your talent in an inclusive culture that values diversity. Come grow your career in technology at Accenture!

The Performance Engineering practice within Accenture Technology is focused on optimizing the performance and scalability of enterprise applications through the combination of:

  • Testing: Analyzing, planning and executing production-like simulations across mobile and web solutions to identify & remediate performance problems, prevent production outages, and guarantee predictable performance.

  • Diagnostics & Monitoring: Instrumenting the complete application architecture to capture real-user and system performance data, providing insight into the root cause of application bottlenecks and real-time visibility that reduces risk exposure.

  • Performance Analytics: Measuring the relationship between end-to-end performance, user behavior, and business goals to maximize the digital business, improve business KPIs, and increase client retention.

  • Business Optimization: Empowering digital businesses with contextual intelligence to visualize, quantify and maximize the business value of performance to improve the quality & performance of the business, increase customer satisfaction, and protect brand reputation.

As a Site Reliability Engineer, some of your key responsibilities may include:

  • Maintain responsibility for the design, deployment, and maintenance of production-scale systems.

  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.

  • Automate the provisioning, management, and monitoring of applications and services using multiple scripting languages, Java, and infrastructure-as-code.

  • Facilitate blameless Incident Retrospectives to understand root causes, communicate learnings, determine remediation and make us better and closer as a team.

  • Coordinate with development and platform teams to design and implement zero-downtime deployment approaches, real-time logging, alerting, and monitoring solutions, and code instrumentation.

  • Introduce chaos engineering concepts that promote experimentation in production to identify systemic weaknesses while increasing service resiliency.

  • Coordinate with the solution architect to design a highly available solution that meets availability and reliability objectives and to reduce manual activities using automation, when feasible.

  • Identify, evaluate, and recommend monitoring tools and diagnostic techniques relevant to the application architecture. Assess gaps in as-is monitoring tool capabilities and recommend tools to augment or replace them.

  • Instrument applications to enable performance diagnostics and monitoring.

  • Collaborate with developers to promote reliability engineering during all phases of the SDLC so that performance issues are detected and corrected earlier in the lifecycle.

  • Monitor application performance during performance tests or production usage with APM and other monitoring tools to isolate the fault domain, dive deep into application code, and identify the root cause of performance issues.

  • Work with client and/or Accenture development, operations, and infrastructure resources to recommend solutions that remediate performance issues.

  • Participate in re-architecture, redesign, and refactoring decisions to satisfy performance requirements.

  • Develop dashboards and reports to provide ongoing visibility into the performance of client applications.

  • Contribute learnings and experiences to the Accenture Performance Engineering community.

  • Travel 80-100% of the time, typically Monday through Thursday.
