Who We Are
Today's business environment is more than challenging: it is a period of disruption driven by the pandemic, global business change and internal process complexity. To focus on simplicity and the best customer experience, we need great talent and the right skillsets. This is now a mantra for our Cisco leadership team and for us.
The Digital Enterprise Solutions team is changing the way we run Cisco's operations by maximizing the power of technology, the best of business processes and superior data insights. Together, we will Reimagine the Cisco experience, show the world how to Reinvent applications, and leverage the future of the Internet to Showcase the power of Cisco: our people, products, processes, systems, and data.
As part of the Digital Enterprise Solutions team, we are responsible for infrastructure and application monitoring across Cisco, leveraging Agile and DevSecOps frameworks. Our monitoring solutions provide detailed visibility into the performance, availability, and user experience of business-critical applications and the supporting infrastructure and networks. The primary goals of these solutions are to help IT and DevOps teams deliver consistent availability and performance, and to rapidly detect, diagnose and resolve issues before clients are impacted.
Please join us and let's make this journey together!
What You'll Do
Site Reliability Engineers are responsible for, and take ownership of, reliability, scalability, automation, and other issues related to the uptime and availability of our monitoring solutions. You will need to have strong skills in the following areas:
Design, write and build use cases/features to improve the reliability, availability and scalability of our Monitoring Solutions.
Augment existing instrumentation to build a cohesive picture of the characteristics of our systems with special attention to points of failure.
Design and develop improvements, focused on resilience, to our production systems to achieve and surpass SLOs
Help improve our operational practices to minimize service disruptions
Work with our Service Assurance team to modernize and improve our monitoring and alerting architecture.
Design repeatable & scalable solutions that detect failures or issues before our clients do.
Conduct product proofs of concept, establish success criteria and provide recommendations
Work with engineers to identify root cause and fix issues
Influence, design and create new architectures, standards and methods for large-scale enterprise systems.
Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
Who You'll Work With
We are a diverse DevOps team supporting a mix of Cisco on Cisco solutions such as AppDynamics & ThousandEyes with strategic 3rd Party solutions to deliver a best-in-class Service Assurance Architecture across a mix of Hybrid Cloud deployment models in Cisco IT. This is a team of highly motivated individuals leveraging SAFe Agile. We thrive in fast-paced environments and are passionate about Infrastructure & Application Monitoring in Hybrid Cloud environments. Giving back to our communities and contributing to the innovative culture here at Cisco is highly encouraged. We have a history of building innovative solutions at scale or being that bridge to what's possible. We are looking for a passionate SRE who is ready to embark on a new transformational journey with us.
Who You Are
You are a success-driven Site Reliability Engineer with proven technical and leadership skills and a passion for designing and implementing innovative enterprise use cases to enhance and optimize our existing monitoring solutions. You will participate in the architecture and design of the monitoring solutions, aligning with leadership, product managers and product owners along with implementation teams to support the transformation of the Service Assurance architecture across Cisco.
This is an opportunity for you to work with the best minds and monitoring solutions in Cisco IT, in a dynamic field of Infrastructure and Application Performance Monitoring in a Hybrid Cloud environment.
Experience with tool suites like Elastic, Grafana & Splunk
Experience with Infrastructure or Application Performance Monitoring Solutions, and testing experience in a diverse and complex infrastructure
ThousandEyes, AppD or similar experience a plus
Experience with building and maintaining Red Hat or CentOS Linux
Experience with configuration automation using Ansible
Experience with public cloud like AWS, GCP, or Azure
Experience with on-premises cloud technologies using VMware or OpenStack
Experience with container technologies like OpenShift, Kubernetes, and Docker
Software development lifecycle including design, development, testing, packaging, deployment, upgrade and support.
Experience with software development tools like Git, Gerrit, Spinnaker, and Jenkins
Python, Shell, Go, or similar programming experience.
QA and testing experience of your code and the entire platform.
Understanding of security including OS hardening, firewalls, iptables, and working with Infosec
Understanding of network basics like routers and switches
Leadership in building and maintaining SRE technologies
Agile software development practices
Working with geographically distributed teams
Understanding of IT lifecycle processes, including architecture, design, implementation, and operations
Open source development experience
Self-motivated, able and willing to help where help is needed
Able to build relationships, be culturally sensitive, have goal alignment, have learning agility
Typically requires BS in Engineering or Computer Science and 8+ yrs of relevant experience.
#WeAreCisco, where each person is unique, but we bring our talents to work as a team and make a difference powering an inclusive future for all.
We embrace digital, and help our customers implement change in their digital businesses. Some may think we're "old" (36 years strong) and only about hardware, but we're also a software company. And a security company. We even invented an intuitive network that adapts, predicts, learns and protects. No other company can do what we do - you can't put us in a box!
But "Digital Transformation" is an empty buzz phrase without a culture that allows for innovation, creativity, and yes, even failure (if you learn from it.)
Day to day, we focus on the give and take. We give our best, give our egos a break, and give of ourselves (because giving back is built into our DNA.) We take accountability, bold steps, and take difference to heart. Because without diversity of thought and a dedication to equality for all, there is no moving forward.
So, you have colorful hair? Don't care. Tattoos? Show off your ink. Like polka dots? That's cool. Pop culture geek? Many of us are. Passion for technology and world changing? Be you, with us!
Cisco Systems, Inc.