Site Reliability Engineer - Cloud Infrastructure (Remote OK)

Splunk, San Jose, CA 95111

Posted 2 months ago

Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are excited about our product and seek to deliver the best experience for our customers.

At Splunk, we're committed to our work, our customers, having fun and, most importantly, each other's success. Learn more about Splunk careers and how you can become a part of our journey!

What we're looking for

Are you interested in being part of a small team tasked with building the next-generation, industry-leading platform for machine data? Splunk's Cloud group is looking for a Site Reliability Engineer to help lead, design and build the future of our large-scale Cloud offering. You will work on the core compute platforms and infrastructure that support Splunk's cloud offering.

We are an engineering- and product-focused company. Our engineers take a leading role in designing, architecting, building and testing our product.

We have a substantial AWS presence of large-scale containerized systems. This is an incredible opportunity to utilize your existing cloud experience and help drive the growth of the Splunk Cloud.

What you provide

  • Cloud and container experience. Experience building, scaling, monitoring and troubleshooting services on different cloud providers is a must. You will use AWS, Vault, Terraform, Kubernetes and Docker.

  • Distributed programming. Experience working on distributed systems such as databases and distributed file systems, along with familiarity with distributed concurrency control, consistency models and the CAP theorem, is an added plus.

  • Desire to learn and adapt. Our agile team has a lot of projects going on at once, and you'll have the opportunity to learn to navigate the code and features. You'll constantly be learning new areas and new technologies.

  • Passion. Our customers are passionate about Splunk, and we want the same from our engineers. We want you to actively own your work and be excited about your projects.

  • Drive for automation. You constantly consider, "How can I automate this manual process?"

  • Technical excellence. You know continuous delivery, testing, security practices, performance, and disaster recovery.

  • Operational excellence. Data excites you and you make decisions based on numbers rather than assumptions. If an issue arises, you strive to be alerted before our customers notice.

  • Data structures and algorithms. A solid grasp of data structures, algorithms, and RESTful APIs.

  • Ability to work with multiple programming languages. We have code in several languages, ranging from Go to Python. In this position you'll be mostly using Go.

What we provide

  • Opportunities to develop and grow as an engineer. We are always expanding into new areas, working with open-source projects and contributing back, and exploring new technologies.

  • A team of incredibly capable and dedicated peers, all the way from engineering to product management and customer support.

  • Breadth and depth. Are you interested in working on distributed systems that dynamically scale to meet the needs of Splunk's cloud offering? We have that. Do you want to go deep into optimizing how we automate every manual process and tedious task we encounter? We have that too, and more.

  • Growth and mentorship. We believe in growing engineers through ownership and leadership opportunities. We also believe that mentors help on both sides of the equation.

  • A stable, collaborative, and supportive work environment. We work in an open environment, work together to get things done and adapt to the changing needs of the team. We keep it real by being open and honest. We are a collaborative team that understands the value of open communication.

  • Balance. We don't expect people to work 12-hour days. We want you to be successful outside of work too. We trust our colleagues to be responsible with their time and commitment and believe that balance helps cultivate a positive environment.

  • Fun. We have frequent group outings and team-building events when in person, and we do our best to keep up the fun with remote events while we are out of the office. We want every employee to give it their all, be respectful, feel like a part of the family and have a smile on their face while doing it.

We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying.

For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.


