Site Reliability Engineer IV

Verisign San Francisco, CA 94118

Posted 2 weeks ago

For more than 22 years, Verisign has maintained 100 percent operational accuracy and stability for .com and .net - managing and protecting the DNS infrastructure. Our global servers process millions of DNS queries every second. Because these services underpin the online connectivity of the Internet, our products must be absolutely correct and remain available - even during the most extreme Internet spike load events.

You can help us continue to provide these essential Internet services. We are seeking a level IV engineer for our Ops Reliability Engineering (ORE) team. The Engineer IV helps roll out platform and tooling changes to the teams that use them, mentors junior and mid-level engineers, and serves as a resource for senior engineers who want to validate approach and design decisions.

Level IV Engineers work with and lead peers to create new systems and procedures that promote automation with reliable, reproducible results and that can support multiple groups within the company. The Engineer IV is a subject matter expert, a role that requires interaction with development, operations, and other groups both inside and outside the company.

General responsibilities for all levels:

  • Obtain, create and maintain tools that will be used by this and other teams within the company

  • Maintain and update existing tools that are currently being used

  • Coordinate with other technical staff to implement systems and software

Our Benefits and Culture:

  • Work on the high-scale systems that provide key Internet services

  • Strong Verisign technical community with constant opportunities to learn via tech talks and other internal events

  • Training and education opportunities such as Verisign University and tuition reimbursement for work-related degrees

  • Clearly defined corporate mission and values

  • Vibrant internal organizations such as the Young Professionals and Women in Technology groups; with an employee base of ~900, numerous opportunities to network, learn from others, lead others, and grow your career

  • Benefits for all stages of life: employee stock purchase plan, parental leave and support programs, adoption assistance, tuition reimbursement, vacation, comprehensive insurance programs, 401K company match, nursing room, back-up child and adult/elder care support, pet insurance, wellness, commuter and more

  • Fully equipped onsite gym offering free classes; onsite tennis, volleyball, and basketball courts; table tennis

  • Casual dress policy

  • Volunteering and matching opportunities to give back to our community via the Verisign Cares program (8 hours per quarter; up to $1,500 annually)

  • Heavily subsidized onsite multi-station cafeteria, Starbucks, smoothie and juice bar

Education and experience:

  • Four-year Bachelor's degree in Computer Science, Computer Engineering or related discipline, or equivalent work experience
  • 8 years of experience in system administration or related work

Technical experience implementing one or more of the following:

  • An IaC tool (Terraform, CloudFormation, etc.)

  • Containerizing applications in a production environment

  • Configuring applications in a CI/CD pipeline

Also important:

  • Experience administering systems in a computing environment with 500 nodes that were managed with automation tools such as Puppet, Ansible, Chef, Salt or equivalent

  • Strong experience with Linux, experience with FreeBSD or another similar Unix-based operating system

  • Advanced knowledge of scripting languages (shell, Python, etc.) along with version control systems (Git, SVN or similar)

  • Comfortable and able to work directly with developers and other Engineering teams

Desired experience:

  • Experience with teams using Kanban and Scrum a plus

  • Experience with OpenStack in a production environment

  • Experience with container deployment orchestration tools like Kubernetes, Helm, Istio, etc.
