As a distributed systems software engineer, you’ll be working on our in-house resource orchestration system. This system coordinates state and access to hundreds (soon thousands) of GPU compute nodes in multi-tenant clusters spanning multiple data centers. Some responsibilities of the role include:
We’re the San Francisco Compute Company. We’re building the first real-time compute trading platform. We think that over the next decade, thousands of startups and labs are going to be training and serving large models. They need compute to do this, and we’re building a platform on which that compute can be traded. If we’re successful, it will be possible to scale to tens of thousands of accelerators for hours at a time without having to build your own infrastructure. This will greatly increase the number of organizations that can afford to train large models, which will make the most important technology of our lifetime accessible to more people.
Compensation
US: $170k - $300k + equity