Principal Member Of The Technical Staff

Oracle Redwood City, CA 94063

Posted 3 weeks ago

At Oracle Cloud Infrastructure (OCI), we build the future of the cloud for enterprises as a complementary team of fellow creators and inventors. We act with the speed and demeanor of a start-up, with the scale and customer focus of the leading enterprise software company in the world.

Oracle Generative AI Service is an exciting team in Oracle Cloud Infrastructure. We are delivering innovative services at the intersection of artificial intelligence and cloud infrastructure. On the Generative AI Service team, you will build and operate massive-scale cloud services that apply state-of-the-art machine learning technologies. We are committed to providing the best in cloud products to meet the needs of our customers, who are tackling some of the world's most challenging problems.

You will be part of a team of experienced, hands-on engineers with the expertise and passion to solve difficult problems in distributed, highly available services and virtualized infrastructure. At every level, our engineers have a significant technical and business impact by designing and building innovative new systems to power our customers' business-critical applications.

As a Principal Member of the Technical Staff on the Generative AI Service team, you will lead the effort to build distributed, scalable, high-performance AI model training and serving systems in partnership with our applied scientists and software engineers. You will investigate model structure to optimize model performance and scalability. You will build state-of-the-art systems with innovative technologies in this fast-evolving area.

What we offer:

  • Being part of one of the most transformational and mission-driven organizations in Oracle, collaborating with versatile peers with a diverse set of backgrounds worldwide.

  • Direct access to senior leadership and the opportunity to make a significant impact across organizations.

  • Opportunity to build brand-new technologies in large language models (LLMs) and generative AI at scale to solve real business problems.

  • Close partnership with applied scientists and software engineers to deploy solutions into production in various business-critical scenarios.

About You:

  • You are an experienced machine learning engineer with a proven track record of delivering large-scale, high-performance model serving/training systems in production.

  • You are obsessed with customers and exceeding their expectations.

  • You have excellent communication skills and you can clearly explain complex technical concepts.

  • You are an engineer who understands the importance of high standards, is never satisfied with mediocrity, and constantly aims for excellence.

  • You are passionate about technology and self-motivated to stay up to date with the latest developments in machine learning and related technologies.

Minimum Qualifications

  • BS in Computer Science, or equivalent experience.

  • 5 years of demonstrated ability shipping scalable, cloud-native distributed systems.

  • Ability to work in a collaborative, multi-functional team environment.

  • Proficient in Python and shell scripting tools.

  • Experience with container orchestration technologies like Kubernetes.

  • Experience with production operations and standard processes for putting quality code into production and solving issues when they arise.

  • Able to communicate technical ideas effectively, both verbally and in writing (technical proposals, design specs, architecture diagrams, and presentations).

Preferred Qualifications

  • MS in Computer Science.

  • Production experience with cloud computing.

  • Experience with large language model (LLM) serving technologies like DeepSpeed, FasterTransformer, etc.

  • Experience with popular model training and serving frameworks like KServe, Kubeflow, Triton, etc.

  • Experience with LLM fine-tuning, especially the latest parameter-efficient fine-tuning technologies and multi-task serving technologies.

  • Experience with deep learning frameworks (such as PyTorch, JAX, or TensorFlow) and deep learning architectures (especially Transformers).

  • Experience in diagnosing and resolving issues in AI model training and serving.

Career Level - IC4

