Regular Full-time, Pay Grade 44
The Joint Institute for Computational Sciences (JICS), a joint institute between the University of Tennessee (UT) and Oak Ridge National Laboratory, is seeking qualified applicants for a Research Associate III position. The successful candidate will be a member of the High-Performance Computing Operations Group of the National Institute for Computational Sciences (NICS) and will report directly to the Group Leader of HPC Operations.
The successful candidate will play a key role in evaluating, deploying, maintaining, and leading system administration tasks/projects at JICS.
Responsibilities and duties include, but are not limited to:
Playing a key role in improving the security, performance, and reliability of the computing infrastructure of the JICS HPC Facility
Installing, configuring, and maintaining computer system software
Diagnosing system operational problems quickly and effectively
Coordinating with vendors to resolve hardware and software problems
Porting and writing system management tools
Ensuring that appropriate security measures are built into the JICS infrastructure in accordance with the JICS security plan, that these measures meet the performance needs of the center, and that they are maintained as the center grows
Coordinating with the JICS CISO to harden and monitor systems against vulnerabilities
Automating system monitoring with appropriate tools and integrating those tools with the center's configuration management tool as necessary
Documenting system administration procedures for routine and complex tasks
Qualifications Required:
Master's degree in Computer Science or a closely related field, or equivalent experience and training
Strong understanding of system administration issues, technologies, and best practices
At least five years of in-depth administration experience with Linux- or Unix-based systems
Strong working knowledge of DNS, LDAP, NFS, DHCP, RT, Apache, Postfix, Kickstart, and TFTPboot
Five years of experience in advanced Linux system administration
Experience with high-performance computers, parallel file systems, and associated high-performance networks
Strong working knowledge of configuration management software such as Ansible
Experience with NFS root cluster management technologies and paradigms
Strong knowledge of virtualization technologies such as KVM or VMware
Experience writing significant scripts with Perl, Python, Bash, sh, or similar scripting languages
Other requirements include:
Strong interpersonal skills and ability to work as a team player
Strong verbal and written communication skills
Proactive and solution-oriented problem solver
University of Tennessee