Sr Reliability Engineer

Forescout San Jose, CA 95111

Posted 7 months ago

What We Are Doing:

We are providing solutions for one of the largest needs in the security space. Forescout Technologies is the leader in device visibility and control and we have pioneered an agentless approach to network security to address the explosive growth of the Internet of Things (IoT), cloud computing and operational technologies (OT). We offer a highly scalable, heterogeneous platform that provides Global 2000 enterprises and government agencies with agentless visibility and control of today's vast array of physical and virtual devices the instant they connect to the network.

What You Will Do:

The Sr. Site Reliability Engineer will be part of Forescout's new, innovative team building out our new IaaS cloud platform. Primarily focused on tools and automation, our SRE team will work closely with engineering to design and develop reliable software solutions to achieve our mission of building a scalable, secure public cloud environment for our customers. We are looking for someone who has the passion to lead, architect, design, document, and implement comprehensive platform solutions using security best practices. If you are passionate about automation, this is the perfect role for you!


  • Build and run production environments in AWS using Infrastructure as code methods.

  • Manage and operate infrastructure as code for CI/CD, release management, etc.

  • Develop tools to automate processes and tasks

  • Develop and maintain log aggregation

  • Develop and maintain infrastructure metrics collection

  • Ensure system compliance with various corporate requirements (for example, security).

  • Ensure uptime of the platform for end-users as well as development teams.

  • Collaborate with coworkers from design, engineering, product, project, and QA teams

  • Own automation of the infrastructure platform deployment code, and drive the development of new features and technologies

  • Provide architectural and deployment recommendations based on existing systems deployment and utilization data to improve the security and performance

  • Proactively monitor systems performance & status to help maintain the service infrastructure and prevent outages, as well as respond to failure and alert event notifications 24x7

  • Generate and maintain documentation / SOP for systems architecture and maintenance as well as for controls focused on security compliance.

  • Identify, triage, resolve and escalate issues in a timely manner

  • Perform system and network administration tasks; create, assign, or escalate incident tickets to the proper entities

  • Orchestrate environment development using Kubernetes, Terraform and/or CloudFormation

  • Build and deploy CI pipeline automation to a combination of Kubernetes/Docker and VM-based systems in both VMware and AWS.

  • Work closely with System Architects, Engineers, Product Managers, and System Administrators to meet their environment setup and service automation needs.

  • Develop a process to make DevOps part of the engineering development, service deployment, and operations lifecycle.

  • System monitoring and analytics; monitor and support the systems and networks using proprietary and third-party tools

What you bring to Forescout:

  • Minimum 7 years of programming experience in an SRE, DevOps, or similar role in an enterprise, hosting complex (multi-service) systems in AWS (Linux) in a highly available and scalable way (Required)

  • Master's or Bachelor's degree in Computer Science

  • Working experience with AWS: VPC, EC2, RDS, Lambda, CloudFormation or Terraform (Required)

  • CI/CD experience (automating build, packaging, test pipelines, and deployments with Jenkins and Git) (Required)

  • Solid understanding of end-to-end technology stacks, including but not limited to OS, network, application, relational and non-relational databases, interacting with APIs, and security (network and application)

  • Design thinking, architecture, and solution design; provide expert technical leadership to customers and partners regarding all aspects of the IaaS and PaaS platform

  • System build and configuration automation using one of Chef, Ansible, or Puppet.

  • Production system integration, log collection and analysis, build, and performance monitoring/tuning.

  • Should know how to deploy these services in AWS in an automated way, manage and support these systems as an SRE, and read the code and fix issues

  • Ability to work independently as well as with a team, including a strong ability to communicate and collaborate with others on multi-disciplinary projects

  • Self-disciplined with strong attention to detail

  • Excellent written and oral communication, including the ability to work collaboratively and to communicate technical concepts clearly and accurately

  • Energetic, self-starting, and passionate about automation

  • Knowledge of and experience with running a customer-facing, cloud-based production software system

  • Proficiency in high-level languages such as Python (preferred), Ruby, or Java, and experience working on software projects in a collaborative environment such as Bitbucket or Git.

  • Prior experience with AWS, Google, or Azure cloud infrastructure automation and DevOps workflows

  • Demonstrates problem-solving skills through engineering solutions and open source tools

What Forescout Offers You:

Strong product, good leadership, great culture, good people, diversity, great benefits, great compensation. If you have a good work ethic, are visible, and lean in, you will be recognized. We are in growth mode and there are tons of opportunities. A positive attitude and flexibility in the face of change go a long way here at Forescout!

  • Competitive compensation and benefits

  • A collaborative and innovative environment: make an impact on worldwide security while working on the hottest technology.

  • We work hard and we PLAY hard!


Similar Jobs

Reliability Electrical Engineer / Principal Reliability Electrical Engineer

Northrop Grumman

Posted 1 week ago

At Northrop Grumman we develop cutting-edge technology that preserves freedom and advances human discovery. Our pioneering and inventive spirit has enabled us to be at the forefront of many technological advancements in our nation's history - from the first flight across the Atlantic Ocean, to stealth bombers, to landing on the moon. We continue to innovate with developments from launching the first commercial flight to space, to discovering the early beginnings of the universe. Our employees are not only part of history, they're making history.

The Engineering & Sciences (E&S) organization pushes the boundaries of innovation, redefines engineering capabilities, and drives advances in various sciences. Our team is chartered with providing the skills and innovative technologies to develop, design, produce and sustain optimized product lines across the sector while providing a decisive advantage to the warfighter. Come be a part of our mission!

Northrop Grumman Mission Systems (NGMS) is looking for you to join our team as a Reliability Electrical Engineer or a Principal Reliability Electrical Engineer based out of San Jose, CA.

What you'll get to do:

This position will perform reliability analysis of analog, RF, power, and mixed-signal CCAs and modules. The right candidate will have the skills to apply advanced technical principles, theories, and reliability concepts to complex problems. The work contributes to the development and sustainment of new Next Generation Systems. This position works under guidance and direction toward predetermined goals and objectives. Assignments can be self-initiated, and the individual will determine and pursue courses of action necessary to obtain desired results, seeking guidance from more senior staff members. The work is checked through consultation and agreement with others rather than by formal review of a supervisor.

Roles & Responsibilities:

  • Develops, coordinates and conducts technical reliability studies and evaluations of engineering design concepts and design of experiments (DOE) constructs.

  • Develops reliability predictions for next generation CCAs and systems using MIL-HDBK-217 and Telcordia SR-332

  • As necessary, proposes changes in design or formulation to improve system and/or process reliability.

  • Performs reliability growth planning, tracking and projecting, and designs test methods for achieving required levels of product reliability and maintainability.

  • Compiles and analyzes performance reports and process control statistics; investigates and analyzes relevant variables potentially affecting product and processes.

  • Guides FRB activities and associated RCCA analyses toward closure.

  • Ensures that corrective measures meet acceptable reliability standards. Analyzes preliminary plans and develops reliability engineering programs to achieve company, customer and governmental agency reliability objectives.

  • May develop mathematical models to identify units, batches or processes posing excessive failure risks.

  • Prepares materials for and supports the various programs and technical meetings.

  • Supports and maintains the integrity of the NGMS Logistics team while supporting the customer. Works collaboratively on multidisciplinary engineering team(s).

This requisition may be filled at a higher grade based on qualifications listed below. This job requisition may be filled as a level 2 or 3 based on the qualifications below:

Basic Qualifications for a Reliability Engineer (Level 2):

  • Bachelor's Degree in a technical discipline and 2 years of experience in the field of electrical engineering, or Master's Degree in a technical discipline and 0 years of experience in the technical field of electrical engineering

  • 1 year of experience performing reliability predictions and modeling, FMECA, and reliability growth analyses for the design/manufacturing of electrical circuits and systems.

  • Must be a US Citizen and have the capability to receive and hold a Top Secret/SCI or Special Access clearance

Basic Qualifications for a Principal Reliability Engineer (Level 3):

  • Bachelor's Degree in a technical discipline and 5 years of experience in the field of electrical engineering, or Master's Degree in a technical discipline and 3 years of experience in the field of electrical engineering

  • 1 year of experience performing reliability predictions and modeling, FMECA, and reliability growth analyses for the design/manufacturing of electrical circuits and systems.

  • Must be a US Citizen and have the capability to receive and hold a Top Secret/SCI or Special Access clearance

Preferred Qualifications:

  • Bachelor's in Electrical Engineering or STEM degree preferred (Science, Technology, Engineering, Math)

  • Experience with Relex/WQS, Reliasoft, or equivalent for reliability predictions and modeling

  • Understanding of MIL-STD-1388, Logistics Support Analysis, and the intersection of reliability and logistics

  • Understanding of component engineering

  • Working user knowledge of PTC Windchill or Siemens TeamCenter PLM, or equivalent

  • Good oral and excellent written communications skills

  • High degree of PC skills and software such as Word, Excel, Visio, Project, and other MS Office products

  • Experience with design, development, and testing of high voltage power circuits (IEC61010 or UL1262 or equivalent)

  • Currently hold Top Secret/SCI or Special Access clearance with current poly

What We Can Offer You:

Northrop Grumman provides a comprehensive benefits package and a work environment that encourages your growth and supports the mutual success of our people and our company. Northrop Grumman benefits give you the flexibility and control to choose the benefits that make the most sense for you and your family. Your benefits will include the following:

  • Health Plan

  • Savings Plan

  • Paid Time Off

  • Education Assistance

  • Training and Development

  • Flexible Work Arrangements

Additional Northrop Grumman Information:

Northrop Grumman has approximately 85,000 employees in all 50 states and in more than 25 countries. We strive to attract and retain the best employees by providing an inclusive work environment wherein employees are receptive to diverse ideas, perspectives and talents to help solve our toughest customer challenges: to develop and maintain some of the most technically sophisticated products, programs and services in the world.

Our Values. The women and men of Northrop Grumman Corporation are guided by Our Values. They describe our company as we want it to be. We want our decisions and actions to demonstrate these Values. We believe that putting Our Values into practice creates long-term benefits for shareholders, customers, employees, suppliers, and the communities we serve.

Our Responsibility. At Northrop Grumman, we are committed to maintaining the highest of ethical standards, embracing diversity and inclusion, protecting the environment, and striving to be an ideal corporate citizen in the community and in the world.

Northrop Grumman, San Jose, CA
