Data Center Test Development Architect

Nvidia Santa Clara, CA 95051

Posted 1 week ago

We are seeking a highly skilled and hard-working Senior Test Development Architect to join our multifaceted Enterprise Software QA team. This role offers an outstanding opportunity to leave your mark on the design, construction, optimization, and testing of large-scale infrastructure for various foundational NVIDIA unified cloud services and data center offerings. If you are a dedicated engineer with a deep understanding of cloud infrastructure and distributed systems, and you thrive in an exciting, innovative environment, this could be the ideal role for you.

What you'll be doing:

  • Engage with product engineering teams to gain a comprehensive understanding of their infrastructure use cases.

  • Provide mentorship to SWQA teams on effectively testing at scale.

  • Develop end-to-end test plans that exercise all layers of SW stacks for NVIDIA cloud-based infrastructure.

  • Lead NVIDIA Data Center bring-up activities from an SWQA perspective.

  • Develop sophisticated tooling to automate the build and deployment of microservices and infrastructure components, improving efficiency and productivity.

  • Reduce manual labor and increase operational efficiency through automation.

  • Monitor the infrastructure and alert on significant events, ensuring the highest level of system performance and reliability.

  • Work closely with partners to understand their infrastructure needs and to ensure our testing encompasses their use cases.

What we need to see:

  • A Master's or Ph.D. in Computer Science or a related field, or equivalent experience.

  • 4+ years of hands-on experience in cluster management and related tools, including Docker Containers, Slurm, Kubernetes, and Ansible.

  • 8+ years of strong experience with cloud infrastructure platforms like AWS, Azure, or Google Cloud.

  • Hands-on experience with server platform, network, storage, and cluster configuration and debugging.

  • Experience with platform telemetry, datacenter node lifecycle management/support including CPU/GPU workloads.

  • Proficiency in scripting languages such as Python.

  • Expertise in administering, operating, and configuring Kubernetes and Envoy.

  • Proven experience with Continuous Integration/Continuous Delivery (CI/CD) tools such as GitLab and Jenkins, and with the GitOps model.

  • Proficiency in various monitoring tools: Prometheus, Grafana, CloudWatch, and Thanos.

  • Strong background in cloud security, Kubernetes security, and application security. Proficiency in debugging issues involving networks, DNS, HTTP, Linux, and containers.

  • Strong analytical and problem-solving skills, along with an ability to articulate what you know to others.

Ways to Stand Out from the Crowd:

  • A true innovator who isn't afraid to challenge the status quo and bring fresh ideas to the table. You're always looking for ways to improve existing systems and processes.

  • Passion and curiosity about the latest technologies and trends in cloud infrastructure and distributed systems. You're not just familiar with the tools; you understand the underlying principles and can apply this knowledge to make strategic decisions.

  • Committed to personal and professional growth. You're always seeking opportunities to learn new skills and deepen your expertise.

By joining our team, you will be part of a forward-thinking company that values innovation and creativity. We offer a competitive salary and benefits package, a flexible work environment, and the opportunity to work with some of the industry's leading experts. If you're ready to take your career to the next level, we'd love to hear from you.

The base salary range is 200,000 USD - 310,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

