Site Reliability Engineering Infrastructure Architect

Sage Intacct, San Jose, CA 95111

Posted 2 months ago

Are you interested in working at a fast-growing Silicon Valley company voted Best Place to Work ten years in a row and recognized as one of the Best Workplaces for Diversity by FORTUNE?

Every business on the face of the Earth must, in some way, do bookkeeping, accounting, and financial planning to operate. At the outset, these functions may seem like mundane facts of life in running a business; however, the skill with which a company does them can have a profound impact on its business.

Within the Medium Segment Native Cloud Solutions at Sage, our team helps keep our public and private cloud-based infrastructure and SaaS application highly available and scalable.

The team you will join has broad expertise in Systems Engineering, Cloud Infrastructure Management (public and private), networking, and monitoring. You will help to develop, extend and maintain our mission-critical infrastructure while ensuring reliability and performance.

This role works in partnership with cross-functional teams maintaining our existing and forthcoming technology stack.


  • Be hands-on, writing Infrastructure as Code (IaC) as a senior engineer and architect would

  • Accelerate transformational change to our infrastructure as we transition to a native-cloud platform.

  • Participate in and contribute to all architecture-related work

  • Take full solution ownership of Sage Intacct Infrastructure Design

  • Work with diverse global teams in multiple time zones

  • Coach the Operations team on architecture decisions and on developing implementation and operational documentation

  • Research and implement new technological subsystems to modernize our infrastructure, and work with various groups to maintain high uptime and meet deliverables for our business partners.

  • Implement automation and industry best practices to run our large-scale, rapidly growing infrastructure with minimum human intervention.

  • Address production issues, learn to mitigate them quickly and find ways to prevent them

  • Implement monitoring, observability, and alerting tools such as dashboards and logging systems to understand the health and availability of our infrastructure and applications.

  • Configure and maintain software components (e.g., operating systems, web servers, and application environments) using Python and Bash in a highly customized environment.


  • 12+ years of professional experience working with highly available SaaS environments in a medium to large enterprise, including a minimum of 4 years of hands-on experience managing hybrid infrastructure in a senior role.

  • Bachelor's degree in a work-related field/discipline from an accredited college or university or equivalent work experience.

  • Strong understanding of application architecture, databases, networking, and security

  • Proven knowledge of resilient design patterns: redundancy, autoscaling, health checks, failover strategies, avoidance of cascading failures, operational isolation, etc.

  • Experience with continuous integration/deployment tools and best practices in DevOps

  • Fluency in Linux administration on either Red Hat or Debian distributions.

  • Experience in Infrastructure as Code (IaC) development using Terraform and Ansible is a must

  • Experience designing and deploying cloud-native enterprise applications in public or private cloud platforms (e.g., AWS, Azure, GCP, OpenShift / K8s + Containerization) is a solid plus.

  • The ability to see the "big picture" and deal with ambiguity

  • Excellent interpersonal and communication skills with the ability to work in a dynamic, high-growth environment.

