Principal DevOps/Infrastructure Engineer
Life360 is a platform for today's busy families, bringing them closer together by helping them better sync, communicate with and protect the people they care about most.
Our mobile app provides millions of families in over 140 countries with services such as private location sharing, location history, drive details, crash detection, roadside assistance and help alerts through our free and paid membership subscriptions.
Founded in 2008, Life360 is based in San Francisco with offices in San Diego, Las Vegas and Ft. Lauderdale.
Life360 has raised more than $100M from investors such as Bessemer Venture Partners, DCM, Fontinalis Partners, BMW iVentures, Allstate, Bullpen Capital, Founders Fund (FF Angel), Launch Capital, Kapor Capital, and 500 Startups.
For more information, visit us at life360.com
About the Job
The Principal DevOps/Infrastructure Engineer is responsible for building and supporting the development, deployment, delivery, operations, and monitoring capabilities used to deliver cloud-based solutions to our customers. This senior team member will drive the technical implementation of a DevOps framework across multiple product and service lines to achieve a CI/CD/CT/CS paradigm. The Principal Engineer will have experience working on highly complex projects and will drive the implementation of a Cloud DevOps solutions framework to support Cloud Platform SLAs. As a member of the DevOps/Infrastructure team, the Principal DevOps Engineer will provide architectural oversight, guidance, and direction to support and drive enhancements to our cloud platform infrastructure. This position is responsible for delivering and maintaining cloud initiatives, from inception to production, for all client and other applications, and for solving complex technical issues critical to our business, such as distributed systems development and cloud-native transformation.
At a pivotal shift in the role of infrastructure engineering at Life360, you will be responsible both for the company's production systems and for building or bringing in the tools that will enable product engineering teams to own their own production services.
Existing production responsibilities include bringing up new services, taking down old services, capacity planning for growth, and an on-call rotation of 1 week on, 3 weeks off.
Supporting dozens of services on hundreds of EC2 instances across multiple stacks. Technologies include EC2, RDS, MySQL, PHP, Java, Akka, Go, Python, Cassandra, HAProxy, Consul, Chef, Terraform.
Build or bring in OSS/3rd party tools to enable engineering teams to provision, deploy, monitor, support and remediate their services in development and production.
Responsible for architecting, building out, and migrating to our next-generation immutable Kubernetes infrastructure.
One last thing: do all this while handling 5+ billion requests per day like it's a breeze. Traffic doesn't make you nervous.
BS or MS in Computer Science, Engineering, Physics, Math, or related fields or equivalent work experience
15+ years of experience supporting product development in a global setting
10+ years of experience as an established technical leader of DevOps or Build & Release teams, with a proven track record of successfully delivering services
10+ years of experience in Linux/Unix environments
7+ years leading and mentoring technical resources
AWS certification required
Ability to grok complex multi-tier production systems at scale. Hell, you even love and thrive in this level of complexity.
You write software - Go, Python, PHP, Java, etc. You know what unit tests are and you like them. You know what encapsulation is and you strive for it.
Strong Unix command-line fluency. You consider at least one of the following to be a good friend of yours: awk, grep, strace, netstat, /proc
Experience with multiple major config-management tools, and an opinion about their strengths and weaknesses. (We use Chef. Prior experience with related tools is cool, too.)
Experience with multiple monitoring and metrics collection platforms. (We use Prometheus and Grafana. Prior experience with Graphite, StatsD, CollectD, Sensu, Nagios, or Icinga is great.)
Experience with AWS services including EC2, S3, ELB, and RDS, and how they work together (GCP or Azure equivalents are just fine)
Ability to jump into unfamiliar systems as root and diagnose host behavior without doing harm.
Ability to follow well-known patterns with consistent naming schemes that anyone can follow.
Communications-first mindset; everyone on #channel knows what you're doing and why while you're doing it, and knows when you're done (especially during a production incident).
Deep experience supporting at least one of the following subject matter areas: databases (MySQL, Cassandra), caching (Redis, Memcache), HTTP services (Tomcat, nginx, Apache), compute services (Java, Go, Python, Erlang, Akka, PHP), queueing/streaming (NSQ, RabbitMQ, Kinesis, Kafka), service mesh/discovery (Consul), build systems (Bamboo, CircleCI, Travis CI, Jenkins)
Have caused, and fixed, at least one major production incident, and have the scars to tell about it.
Some things we'd love to see as well:
Experience with actor systems
Experience with wide-column databases such as Cassandra
Knowledge of JVM tuning
Continuous Integration experience with tools such as Atlassian Bamboo, Travis CI, CircleCI, TeamCity, Jenkins
Experience in, or migrating to, environments where engineering owns production
Experience building or maintaining multi-datacenter cloud applications
Fridays are Work From Home days at Life360
Competitive pay and benefits
Free snacks, drinks (three ways to brew your favorite cup of coffee), and food in the office
Catered lunches throughout the week
Health, dental and vision insurance plans
$200/month Quality of Life perk
A great office with plenty of light in the heart of the SOMA district in beautiful San Francisco
Whatever makes you stronger makes us stronger. We buy you the things you need to improve yourself and get your job done.