Anthropic NY | Seattle, WA 98932
Posted 1 week ago
About the role:
Our Inference team builds the service that generates outputs from our models in production. This service is the key driver of our efficiency, latency, and reliability. As an engineer on this team, you'll work on improving those metrics by solving complex distributed-systems problems across all layers of our stack.
You may be a good fit if you:
Have significant software engineering experience
Are results-oriented, with a bias towards flexibility and impact
Pick up slack, even if it goes outside your job description
Enjoy pair programming (we love to pair!)
Want to learn more about machine learning research
Care about the societal impacts of your work
Strong candidates may also have experience with:
High-performance, large-scale distributed systems
Kubernetes
Python
Machine learning
Representative projects:
Improving how inference requests are routed to model servers to maximize compute efficiency
Building a performance model to predict the impact of future architecture and hardware improvements
Implementing inference for a new model architecture down to the JAX / PyTorch / kernel layers
Analyzing observability data to tune performance based on production workloads
Implementing inference on a new hardware platform
Building instrumentation to detect and eliminate Python GIL contention
Optimizing the efficiency of our accelerator kernels
Ensuring smooth and regular deployment of inference services
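To give a flavor of the GIL-contention project above: under CPython's Global Interpreter Lock, CPU-bound work does not speed up across threads, and a simple timing comparison makes the contention visible. This is a toy illustration only, not Anthropic's actual instrumentation; the function names and workload sizes are invented for the sketch.

```python
import threading
import time

def busy(n):
    # CPU-bound loop; holds the GIL for the duration of each bytecode burst
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 2_000_000

# Serial baseline: two runs back to back on one thread
t0 = time.perf_counter()
busy(N)
busy(N)
serial = time.perf_counter() - t0

# Threaded: the same two runs on parallel threads. With the GIL,
# they serialize anyway, so wall time stays close to the serial baseline.
t0 = time.perf_counter()
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print(f"serial: {serial:.3f}s, threaded: {threaded:.3f}s")
```

Real contention detectors go further (e.g. sampling thread wake-up latency in production), but the core signal is the same: threaded throughput failing to scale on CPU-bound paths.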
Deadline to apply: None. Applications will be reviewed on a rolling basis.