JPMorgan Chase & Co. Plano, TX 75023
Posted 1 week ago
JobID: 210509812
Category: Software Engineering
JobSchedule: Full time
Posted Date: 2024-04-22T22:19:26+00:00
JobShift: Day
Base Pay/Salary: Jersey City, NJ $171,000.00-$260,000.00
Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.
As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Corporate Sector, Infrastructure Platforms, Runtime Compute Team, you are a force multiplier at both the line-of-business and firm-wide level. Inspire your peers and the wider product line to deliver durable and resilient products and services to our customers, define firm-wide strategies for reliability, and guide and entrust our teams to lead and execute those strategies.
Job responsibilities
Provide technical SRE leadership for multiple SRE teams, engineers, and managers throughout Runtime Compute who look to you for advice on the technical issues facing them.
You are a key influencer in the Runtime Compute strategic resiliency, observability, and toil reduction planning.
You drive continual improvement in resilience, quality of experience, security, monitoring, instrumentation, and automation.
You have successfully implemented SRE best practices in high-performance, stable, mission-critical applications with demonstrable positive outcomes.
Technologists in Runtime Compute look to you for advice on technical and business issues facing them.
You work with your fellow stakeholders to define common non-functional requirements (NFRs) and availability targets for your product line, Runtime Compute, and ensure that SRE is practiced consistently across applications, products, and product lines.
You act in a blameless, data-driven manner, show high empathy and emotional intelligence, and navigate difficult situations with composure and tact.
Direct the SRE teams in the product line throughout the lifecycle to help develop software for reliability and scale, ensuring consistency across the product line and minimizing refactoring or changes.
Direct the SRE teams in the product line to develop and measure SLOs and SLIs for provisioning/deprovisioning, deployments, uptime, and other measures critical to products. Work with business partners to educate them on the product line's SLOs and SLIs.
Identify gaps between applicable requirements and current procedures/controls; drive resolution of mitigating controls. Develop and implement solutions that strengthen business operating models, enhance the client experience, and improve efficiency and controls.
Work with business partners to design and implement enhancements to existing processes and/or business applications, introduce new processes and/or toolsets, and engage in process re-engineering.
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 5+ years of applied experience with industry-standard runtime solutions, e.g., Kubernetes and Cloud Foundry.
Expertise in designing, coding, testing, and delivering software in at least one technology stack.
Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm. Software development experience in at least one general-purpose programming language: Python, Java, C, C++, Go, or shell scripting.
Working knowledge of infrastructure components (e.g., load balancers, cloud platforms and products, container systems, and runtime compute).
Excellent debugging and troubleshooting skills.
Strong organizational, prioritization, and interpersonal skills; detail-oriented.
Be a team player and a leader who shows commitment and dedication, and can maintain a positive attitude and a high level of performance on high-profile/time-sensitive initiatives.
Preferred qualifications, capabilities, and skills
Experience hiring, developing, and recognizing talent
Ability to work in a fast-paced environment, be flexible, meet tight deadlines, organize and prioritize work
Hands-on experience with cloud-based observability technologies and tools, especially in deployment, monitoring, and operations, such as Datadog, Prometheus, Splunk, Elasticsearch, Grafana, and AppDynamics
Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Terraform, and Jenkins
Deep knowledge of Internet protocols and web services technologies such as HTTP, DNS, TCP/UDP, SOAP, JSON and REST
Good understanding of networking protocols and cybersecurity best practices in cloud environments. Public cloud certification is preferred.
AI/ML knowledge is preferred to evaluate and choose models that help with SRE goals including automated root cause analysis, anomaly detection, and real-time insights and analytics into various products.