GPU Computing Systems Specialist

Nvidia Durham, NC 27701

Posted 2 months ago

We are now seeking a GPU Computing Systems Specialist.

Would you be thrilled to work with the most cutting-edge hardware and applications for deep learning in the world? Do you have the skills to run a diverse computing cluster filled with the latest NVIDIA GPUs? NVIDIA's Deep Learning Architecture and Libraries group is looking for a world-class systems specialist to run and grow our internal development cluster, the core infrastructure that software developers and GPU architects rely on for every stage of our product development. Our mission, which spans both hardware and software, is to consistently deliver the world's fastest deep learning technology stack for applications ranging from autonomous vehicles to training enormous models on supercomputers.

Your work will enable engineers to work efficiently with a wide variety of systems as they vigilantly seek out opportunities for performance optimization and continuously deliver high-quality software. As a member of our team, you will need to be versatile enough to wear many hats: systems specialist, system administrator, and software engineer. Your work will enable the groundbreaking experimentation that allows us to design the world's most powerful systems for the most demanding computing applications. You will have a meaningful impact at a fast-moving company that is spearheading the next wave in computing technology. Join our technically diverse team of GPU architects, software engineers, and infrastructure experts to unlock unprecedented deep learning performance in every domain!

What you'll be doing:

  • Administer a diverse GPU computing cluster containing production and pre-production GPUs

  • Use modern cluster management tools to configure and monitor the nodes and network

  • Develop scripts, tools, and distributed systems to automate cluster management tasks and simplify usage

  • Assist users with experiment and application setup using a variety of development, performance analysis, and hardware configuration tools

  • Work closely with multiple teams to identify new infrastructure and software requirements

  • Influence methodologies for cluster usage and testing of tools and workflows

What we need to see:

  • BA, BS, or MS in relevant field (e.g. CS, EE, CE)

  • At least 2 years of experience deploying and administering Linux clusters, and at least 5 years of relevant industry experience

  • Deep understanding of operating systems, containers, computer networks, and high-performance applications

  • Experience with modern DevOps tools (Docker, GitLab, SaltStack, or similar)

  • Background with HPC job schedulers (SLURM or similar)

Ways to stand out from the crowd:

  • Familiarity with GPU computing, HPC, and parallel programming (CUDA, MPI, OpenMP)

  • Experience working with deep learning frameworks like Caffe, TensorFlow, and Torch

  • Strong programming skills in Python (or similar) and C++ (or similar)

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. Does the idea of contributing to and pushing the boundaries of state-of-the-art AI and compute systems excite you? If so, we want to hear from you!

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
