Cybercoders Fremont, CA 94537
We're seeking a Systems Engineer to help design and build systems ranging from stand-alone workstations and servers to multi-rack CPU, GPU, and storage clusters. The position is complex: it requires pre-sales knowledge of system hardware and, ideally, hands-on server and cluster build experience, which supports the pre-sales work on more complex sales cycles and requests.
If you are a Computer/Server Systems Engineer with experience, please read on!
What You Will Be Doing
Design of systems and clusters (including computational and storage clusters) from the component level, including network fabric, switching, power infrastructure, cabling, etc.
Assembly of systems (from workstations to blade systems), including clusters when necessary
Inspect and perform quality control on outgoing finished goods
Help provide front-line technical support for hardware via phone & email
Configuration of system OS software, including Linux and Windows clustering on assembled cluster systems
Maintenance of internal equipment (internal clusters, workstation OS updates, domain controller, etc.)
Help provide front-line technical support for software via phone & email
What You Need for this Position
More than 5 years of experience and knowledge of:
10+ years of Windows desktop and server support experience at local and global sites.
Solid working knowledge of and troubleshooting skills with x86 servers, workstations, and printers.
Comfortable with remote management tools in RHEL/CentOS, SUSE and Windows environments.
Familiar with virtualization and cloud technologies including VMware, KVM, Hyper-V, and OpenStack.
Experience designing, deploying, and tuning multiple HPC clusters to meet production schedules in a timely fashion.
Demonstrated ability to self-learn and adopt new technologies quickly (e.g., becoming familiar with Docker in 3 days).
10+ years of Windows Server (Windows 2000 to 2012 R2) implementation and support experience
Windows domain server setup and migration from 2000/2003 to 2008 and later to 2012 R2
Exchange Server 2003 setup and migration to Exchange 2010 and 2013 HA
Setup and management of Windows features such as Group Policy, DNS, file permissions, print services, FTP, Hyper-V, PowerShell, WSUS, and SCCM
Level 3 desktop support experience with Windows XP, Windows 7 and 8, and Windows 10
10 years of server hardware troubleshooting and performance tuning experience
Comprehensive understanding of current and previous x86 server hardware, from Xeon 5400/5500/5600 to Sandy Bridge/Ivy Bridge/Haswell/Skylake, across all E7/E5/E3 architectures
Excellent knowledge of cutting-edge hardware platforms from Supermicro, Intel PCSD, Tyan, Quanta, Foxconn, ASUS, Gigabyte, Wiwynn, AIC, Gooxi, etc.
Familiar with Open Compute Project hardware, such as servers, storage, racks, and data center power and cooling
Hands-on experience with server validation and benchmarking tools such as memtest, iozone, and Passmark
7 years of RHEL 4.0-7.0, CentOS 4.0-7.0, and SUSE 11/12 Linux implementation and support experience
Linux administration experience, familiar with services including SSH, NFS, Samba, Squid, KVM, DHCP, PXE, and Cobbler
Self-taught; obtained the SUSE Certified Linux Administrator certification within 2 months and learned Puppet in two weeks
Shell programming experience with Bash; recently self-learning Python
5 years of enterprise-level virtualization experience
Design and implementation of servers with VMware ESX, Microsoft Hyper-V, Linux KVM/Docker, and OpenStack
Real-world design and deployment experience on 30+ HPC cluster projects, some as large as 7 racks and hundreds of nodes
Designed unique CPU/GPU HPC clusters based on customers' application needs. Customer base includes Jefferson Lab, Argonne Lab, VMware, WesternGeco, Stanford, Caltech, SLAC, CSUN, Children's Hospital, Hitachi, Huawei, Lincoln Financial, Microsoft, Synapse Technology, Whamcloud, LandOcean, China University of Petroleum, Tongji University, SenseTime, etc.
Led a team through cluster design, installation, validation, and deployment on tight schedules while coordinating with multiple teams to achieve final goals; also initiated documentation, including user manuals and quick start guides, and provided user training
Familiar with HPC cluster software such as Rocks, Bright Computing, Platform HPC, SGE, Torque, LSF, MPIs, MKL/Goto, Intel/GNU compilers, CUDA libraries, Caffe, Intel/Cloudera Hadoop MapReduce, and Spark
Hands-on experience with HPC performance troubleshooting and validation tools such as IMB, ICR, Linpack, Stream, HPCC, and NAS Parallel Benchmarks
Comprehensive understanding of HPC cluster network technologies including InfiniBand and 10/40G Ethernet
Familiar with OFED driver installation, firmware updates, and IB-related commands such as ibstat, ibswitches, ibhosts, and ibcheckerrors across QDR, FDR, and EDR technologies
6 years of deployment experience with HPC accelerators, including NVIDIA Tesla GPUs and Intel Xeon Phi cards
Installation and setup experience with all generations of Tesla and Phi; recently working with Titan X using NVIDIA DIGITS and Caffe for deep learning, and GRID 2.0 for VDI GPU solutions
Self-learned CUDA programming in 2010 and passed the NVIDIA CUDA programmer exam within 2 months
Familiar with NVIDIA and Intel Xeon Phi cluster-level monitoring, troubleshooting, and tuning toolsets
5 years of hands-on experience with storage
Including both white-box and branded NAS/iSCSI/SAN and LSI SAS switches, as well as software such as Windows file server, NFS, Open-E, FreeNAS, Hadoop, Nexenta, Gluster, and Lustre. Recently interested in Ceph.
So, if you are a Computer/Server Systems Engineer with experience, please apply today!
Applicants must be authorized to work in the U.S.