Site Reliability Engineer

Solekai Systems Corp, San Jose, CA 95111

Posted 2 months ago

Are you ready to step up to the New and take your technology expertise to the next level?

Join Accenture and help transform leading organizations and communities around the world. The sheer scale of our capabilities and client engagements and the way we collaborate, operate and deliver value provides an unparalleled opportunity to grow and advance. Choose Accenture and make delivering innovative work part of your extraordinary career.

People in our Client Delivery & Operations career track drive delivery and capability excellence through the design, development and/or delivery of a solution, service, capability or offering. They grow into delivery-focused roles, and can progress within their current role, laterally or upward.

As part of our practice, you will lead technology innovation for our clients through robust delivery of world-class solutions. You will build better software better! There will never be a typical day, and that's why people love it here. The opportunities to make a difference within exciting client initiatives are unlimited in the ever-changing technology landscape. You will be part of a growing network of highly collaborative technology experts taking on today's biggest, most complex business challenges. We will nurture your talent in an inclusive culture that values diversity. Come grow your career in technology at Accenture!

The Performance Engineering practice within Accenture Technology is focused on optimizing the performance and scalability of enterprise applications through the combination of:

  • Testing: Analyzing, planning and executing production-like simulations across mobile and web solutions to identify & remediate performance problems, prevent production outages, and guarantee predictable performance.

  • Diagnostics & Monitoring: Instrumenting the complete application architecture to capture real-user and system performance data, giving insight into the root cause of all application bottlenecks and enabling real-time visibility that reduces risk exposure.

  • Performance Analytics: Measuring the relationship between end-to-end performance, user behavior, and business goals to maximize the digital business, improve business KPIs, and increase client retention.

  • Business Optimization: Empowering digital businesses with contextual intelligence to visualize, quantify and maximize the business value of performance to improve the quality & performance of the business, increase customer satisfaction, and protect brand reputation.

As a Site Reliability Engineer, some of your key responsibilities may include:

  • Maintain responsibility for the design, deployment, and maintenance of production-scale systems.

  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.

  • Use automation to streamline the provisioning, management, and monitoring of applications and services using multiple scripting languages, Java, and infrastructure-as-code.

  • Facilitate blameless Incident Retrospectives to understand root causes, communicate learnings, determine remediation and make us better and closer as a team.

  • Coordinate with development and platform teams to design and implement zero-downtime deployment approaches, real-time logging, alerting, and monitoring solutions, and code instrumentation.

  • Introduce chaos engineering concepts that promote experimentation in production to identify systemic weaknesses while increasing service resiliency.

  • Coordinate with the solution architect to design a highly available solution that meets availability and reliability objectives and to reduce manual activities using automation, when feasible.

  • Identify, evaluate, and recommend monitoring tools and diagnostic techniques relevant to the application architecture. Assess gaps in as-is monitoring tool capabilities and recommend tools to augment or replace them.

  • Instrument applications to enable performance diagnostics and monitoring.

  • Collaborate with developers to promote the concept of reliability engineering during all phases of the SDLC to detect and correct performance issues earlier in the lifecycle.

  • Monitor application performance during performance tests or production usage through the use of APM and other monitoring tools to isolate the fault domain, dive deep into application code, and identify the root cause of performance issues.

  • Interact with client and/or Accenture development, operations, and infrastructure resources to recommend solutions to remediate performance issues.

  • Participate in re-architecture, redesign, and refactoring decisions to satisfy performance requirements.

  • Develop dashboards and reports to provide ongoing visibility into the performance of client applications.

  • Contribute learnings and experiences to the Accenture Performance Engineering community.

  • Travel 80-100% of the time, typically Monday through Thursday.


Similar Jobs

Site Reliability Engineer

Splunk

Posted 3 days ago

The Cloud organization at Splunk focuses on building and maintaining robust and resilient platform solutions for SaaS hosting of Splunk's enterprise software. Our main technologies are Cloud Infrastructure based, focusing on puppet and terraform. The SRE Operations team is globally distributed with teams based in San Francisco and Plano in the USA, Sydney in Australia, and London in the UK.

The SRE Ops team works closely with our Support and Software engineering teams so you'll have plenty of chances to interact with and learn from other teams across the business as well as your direct colleagues on the other SRE teams.

WHAT WE'RE LOOKING FOR

Splunk's Cloud group is looking for a Site Reliability Engineer to help maintain, contribute to and improve the next generation of our large scale Cloud offering. You will be working with large scale cloud providers and supporting the infrastructure that powers Splunk's cloud offering.

YOU SHOULD APPLY IF:

  • You have operational experience at scale. You have had hands on roles that deal with operating systems (particularly Linux) and networking. You might also have worked with Cloud technologies. Your previous job titles might be something close to systems admin, network engineer or devops engineer.

  • You're passionate about your work. Our customers are passionate about Splunk and we want the same from our engineers. You should enjoy actively being responsible for your work and be excited about your projects.

  • You love large complex systems. Experience in working on distributed systems or a passion for finding edge cases that appear at scale. You are interested in how to bring something from a small one off task to how to implement it across several thousand machines at once.

  • You have some development skills. We have code in several languages, ranging from Python and Shell to Go and C++. We don't expect you to be a software engineer but you should be familiar with basic programming and understand concepts like input sanitisation and unit testing.

  • "How can I automate this process?" is a question you constantly ask yourself.

  • Data drives your decisions. Data excites you and you make decisions based on numbers rather than assumptions. If an issue arises, you strive to be alerted before our customers notice.

  • You care about monitoring. Shipping code often and getting useful feedback excites you and you're not worried about changing direction when a solution isn't working as expected.

WHAT WE PROVIDE

  • Opportunities to develop and grow as an engineer. We are always expanding into new areas, working with open-source projects and contributing back, and exploring new technologies.

  • A team of incredibly capable and dedicated peers, all the way from engineering to product management and customer support.

  • Breadth and depth. You are interested to work in an area that dynamically scales to meet the need of Splunk's cloud offering. You want to go deep into optimizing how we automate every manual process and tedious task we encounter.

  • Growth and mentorship. We believe in growing engineers through ownership and leadership opportunities. We also believe that mentors help both sides of the equation.

  • A stable, collaborative, and supportive work environment. Honesty and collaboration are values we see as a core part of our team identity. We understand the value in open communication: working together to get things done, and to adapt to the changing needs of the team and individuals. This is reflected in both our internal communications and also in how we interact with our customers.

  • Balance. We don't expect people to work 12 hour days. We want you to be successful outside of work too. Want to work from home sometimes? No problem. We trust our colleagues to be responsible with their time and commitment, and believe that balance helps cultivate a positive environment.

  • Fun. Work doesn't have to always be about working. We organize frequent team lunches and other optional team building events to help you get to know other members of the team in a more relaxed environment.

We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying.

Thank you for your interest in Splunk!

Splunk, San Jose, CA

Site Reliability Engineer

Solekai Systems Corp