Site Reliability Engineer

JPMorgan Chase & Co. Palo Alto, CA 94306

Posted 2 weeks ago

As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you'll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you'll be focused on running better production applications and systems.


  • Develop, test, and debug automated tasks (Apps, Systems, Infrastructure)

  • Troubleshoot priority incidents and facilitate blameless post-mortems

  • Work with development teams throughout the software life cycle to ensure sustainable software releases

  • Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions

  • Build and drive adoption for greater self-healing and resiliency patterns

  • Lead and participate in performance tests; identify bottlenecks, opportunities for optimization, and capacity demands

  • Participate in 24x7 support coverage as needed

Required Skills:

  • Bachelor's degree in Computer Science, Information Technology, or equivalent technical field

  • 6 or more years of relevant engineering experience

  • In-depth OS experience (RHEL, Ubuntu, Windows Server) with strong debugging, troubleshooting, and problem-solving skills

  • Site reliability engineering experience using at least one of the following languages: Python or Java

  • Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring and operations, such as Datadog, Prometheus, Splunk, Elasticsearch or Grafana

  • Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform and Jenkins

Additional Preferred Skills:

  • AWS/Kubernetes certification is highly desirable

  • 2 or more years of Enterprise Cloud infrastructure experience (AWS, Azure or GCP) in a mission critical environment

  • Experience in Go, PowerShell or shell scripting

  • Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, SOAP, JSON and REST

  • Good understanding of networking protocols and cybersecurity best practices in cloud environments

Similar Jobs

Site Reliability Engineer


Posted 3 weeks ago

At TripActions, we are constantly striving to make the most reliable and scalable systems possible to ensure that our platform is available to our travelers when they need it most. With our exponential growth, we have many exciting challenges ahead. We are expanding our Site Reliability Engineering team to tackle these obstacles and provide world-class availability to our travelers.

We are looking for a passionate Site Reliability Engineer to design and develop the tooling, automation and infrastructure services that power the TripActions Liquid app used by thousands of travelers on a daily basis. You will work most closely with the Liquid engineering team, and will have many friendly peers and cross-functional partners in the Amsterdam and Palo Alto offices to see and work with regularly.

The impact you'll make:

  • Building a fast-moving, high-growth service. TripActions Liquid is revolutionizing expenses, and the product is evolving quickly. You are comfortable in a startup environment, enjoy seeing the product take shape, and have strong ownership of the success of your services.

  • Designing, implementing and operating cloud infrastructure. You're a fit for us if you think in terms of infrastructure as code, deployment pipelines, and building the guardrails that make going fast also mean going safely.

  • Identifying reliability anti-patterns and solving them systemically. You dive deep into the data to evaluate the health of your systems, and you use it to improve visibility and reliability across the fleet of services.

  • Finding and automating the toil out of our processes. You'd prefer to automate it entirely, or build a tool to empower your users, rather than be the gatekeeper to the tool.

What We're Looking For:

  • 5+ years of experience as an SRE (or Infrastructure Software Engineer, or DevOps Engineer)

  • Building and operating distributed systems in AWS, using CI/CD to ship code to production with tools such as Maven and Jenkins

  • Experience with microservice architecture and related reliability patterns such as throttling, queueing, and retries

  • Writing Infrastructure as Code in Terraform or CloudFormation

  • Automating away manual tasks using Python, Bash and Ruby

  • Building, using, and automating monitoring systems such as SignalFx, Kibana, Grafana

  • Strong sense of ownership demonstrated through shipping production-quality code and infrastructure equipped with testing, monitoring and documentation

  • Passion for solving problems and learning new tools and technologies

  • Excellent communication skills working with stakeholders and domain experts across the company to design solutions to user problems

  • Ability to thrive in a fast-paced environment

Additional Awesomeness:

  • Experience with Java-based applications and services, including JVM profiling and performance tuning

  • Experience building CI/CD pipelines from scratch and scaling them up

  • Database experience with RDS (MySQL), Couchbase and/or Elasticsearch

TripActions, Palo Alto, CA

Site Reliability Engineer

JPMorgan Chase & Co.