We're looking for a Site Reliability Engineering Director to work within the Global Mobile Engineering organization and lead an Engineering team responsible for mobile app performance, availability and reliability.
You'll work with several technology partners and product managers to identify areas of opportunity within the availability platform and build a vision for the next-generation platform, technology, and ongoing innovation. In addition, you will engage in hands-on design and ensure that strategy, architecture, tools, and methods are aligned with software engineers and architects. You will be responsible for pushing the boundaries of monitoring, tooling, and issue resolution to maximize the performance and availability of our mobile applications.
You should be familiar with modern software development methodologies and be able to dive deep and iterate rapidly on ideas despite ambiguity. Make no mistake: this is an opportunity to work in one of the best Technology units, which helps lead risk for American Express, and to influence how millions of people interact with their cards, their merchants and their money.
BS or MS degree in computer science, computer engineering, or another technical discipline, or 3-6 years of equivalent work experience
Aptitude for learning and applying programming concepts
Detailed understanding of application flows and the proactive monitoring needs of production systems
In-depth knowledge of ITIL concepts such as Incident, Change, and Problem Management, and of support procedures
Ability to effectively communicate with internal and external business partners and technology teams
Very strong technical troubleshooting and analytical skills, with the ability to resolve infrastructure (cloud) and application issues in production environments
Ability to direct application monitoring and work towards implementing automated monitoring scripts
Expertise with Splunk: writing queries, building dashboards, and configuring alerts and reports
Strong knowledge of and experience with Linux systems engineering and scripting languages (Python, Perl, and Shell), using solid coding practices (code reuse, functions, comments)
Strong development/support experience with Java, Kotlin, or Swift
Experience in the development and maintenance of iOS and Android apps
Experience integrating and using mobile APM tools such as Fabric, Sentry, Mixpanel, and AppDynamics to analyze mobile app crashes preferred
Deployment and troubleshooting experience with JBoss and Node.js
Self-motivated with a strong sense of urgency and dedication to deadlines
Experience in the reliability space and with reliability tools
Experience building dashboards and tools
Experience with Red Hat OpenShift, Kubernetes and Docker
Experience working with Jenkins or other open-source CI/CD tools, with network load balancers such as F5 BIG-IP, and with the design and development of iRules
Experience with modern databases (Redis, Couchbase, etc.)
Eligibility to work for American Express in the U.S. is required, as the company will not pursue visa sponsorship for these positions.