The San Francisco Compute Company

By Alex Gajewski and Evan Conrad

July 22, 2023

We’re getting a bunch of startups together that need compute for training large models
Rather than each of K startups individually buying a cluster of N GPUs, together we buy one cluster of N*K GPUs
Then we set up a job scheduler to allocate compute fairly across all the startups (proportional to how much of the cluster they own)
- This means that rather than needing to keep 128 A100s full constantly for a month, you can burst up to 512 A100s for a week (the same number of GPU-hours) and get your model trained sooner
- Also if there’s ever idle compute, the scheduler will just give it to you, so you may end up with more than your share of compute if you get lucky
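To make the scheduling idea above concrete, here is a minimal sketch of one way to split a shared cluster: weighted max-min fairness, where each startup is offered GPUs in proportion to what it owns and anything it doesn't ask for is re-offered to the others. This is an illustration under assumptions, not the scheduler we will actually run: the function name `allocate`, the startup names, and the numbers are all hypothetical, and a real scheduler would also round to whole nodes and handle queued and running jobs.

```python
def allocate(total_gpus, shares, demands):
    """Weighted max-min fair split of a cluster (a sketch, not a real scheduler).

    shares  -- startup name -> GPUs owned (assumed positive)
    demands -- startup name -> GPUs requested right now
    """
    alloc = {name: 0.0 for name in shares}
    # Only startups that currently want GPUs compete for them.
    active = {name for name in shares if demands.get(name, 0) > 0}
    remaining = float(total_gpus)

    while active and remaining > 1e-9:
        weight = sum(shares[name] for name in active)
        # Offer each active startup a slice of what is left, proportional to ownership.
        offers = {name: remaining * shares[name] / weight for name in active}
        # Startups whose unmet demand fits inside their offer are fully satisfied.
        satisfied = {
            name for name in active
            if demands[name] - alloc[name] <= offers[name]
        }

        if not satisfied:
            # Everyone wants more than their slice: hand out the slices and stop.
            for name in active:
                alloc[name] += offers[name]
            break

        for name in satisfied:
            remaining -= demands[name] - alloc[name]
            alloc[name] = float(demands[name])
        # The unused part of their slices is re-offered in the next round.
        active -= satisfied

    return alloc


# Hypothetical example: a 512-GPU cluster owned 128 / 128 / 256.
owned = {"a": 128, "b": 128, "c": 256}
wanted = {"a": 512, "b": 0, "c": 128}
print(allocate(512, owned, wanted))
```

In this made-up example, startup "a" owns 128 GPUs but ends up with 384, because "b" is idle and "c" only needs half of its share. If all three asked for the whole cluster at once, each would get exactly what it owns.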
Big labs like OpenAI and DeepMind have big clusters that support this kind of bursty allocation for their researchers, but startups so far have had to get very small clusters on very long-term contracts, wait through months of lead time, and try to keep them busy all the time
We ought to be able to get something like $1.75/hr per H100, but with bursty allocation and short-term contracts
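As a rough back-of-the-envelope check on the numbers above, the sketch below compares a steady allocation with a burst and prices both at the target rate. It assumes a "month" is four weeks and that the $1.75/hr target holds; the helper names `gpu_hours` and `cost_usd` are made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24


def gpu_hours(gpus, weeks):
    """Total GPU-hours used by `gpus` GPUs running for `weeks` weeks."""
    return gpus * weeks * HOURS_PER_WEEK


def cost_usd(gpus, weeks, price_per_gpu_hour=1.75):
    """Cost at a flat hourly price (the $1.75 default is the target, not a quote)."""
    return gpu_hours(gpus, weeks) * price_per_gpu_hour


steady = gpu_hours(128, 4)  # 128 GPUs kept busy for four weeks
burst = gpu_hours(512, 1)   # 512 GPUs for one week

print(steady, burst)        # both are 86016 GPU-hours
print(cost_usd(128, 4))     # 150528.0 dollars at the target price
```

The point is that the burst costs the same as the steady run, since you pay for GPU-hours either way; you just get the result three weeks earlier.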
If you’re interested in having your startup join, fill out this form

As with a hacker house, if you ever want to leave the cluster (e.g. to build your own), just give us a month or two of notice so we can find another startup to fill your place
We can add new startups to the group in batches, and add new H100s to the cluster every couple of months
- Same deal if you’re already in the group and you want to scale up to more compute
We may want to overprovision a little so that, e.g., if one of our friends wants a couple of nodes to run a small experiment on, we can just offer them at a good price
- If we overprovision by 10%, this would raise the hourly H100 price by about 10%, assuming the spare capacity brings in no revenue of its own
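The overprovisioning arithmetic above can be sketched as follows. The model is an assumption on our part: the cost of the extra GPUs is spread evenly across members, and whatever fraction of the spare GPU-hours gets sold (at the same hourly price) offsets it. The name `member_price` and the `spare_sold` parameter are hypothetical.

```python
def member_price(base_price, overprovision, spare_sold=0.0):
    """Hourly price per GPU once the cost of spare capacity is spread over members.

    base_price    -- price with no spare capacity, e.g. 1.75
    overprovision -- extra capacity as a fraction, e.g. 0.10 for 10%
    spare_sold    -- fraction of spare GPU-hours sold at the same price (assumed)
    """
    total_cost = base_price * (1 + overprovision)   # per member GPU-hour
    paying_hours = 1 + overprovision * spare_sold   # per member GPU-hour
    return total_cost / paying_hours


print(round(member_price(1.75, 0.10), 3))        # spare sits idle: 1.925
print(round(member_price(1.75, 0.10, 1.0), 3))   # spare fully sold: 1.75
```

So the 10% increase is the worst case, where the spare nodes never bring in any money; anything friends pay for them pulls the price back toward $1.75.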

We have a good lead on 512 H100s that would come online in 4-6 weeks
- If we have more than that much demand, we can probably find more H100s that would be delivered in about 8 weeks
We can probably get a good deal from a bank to spread out the cost of buying the cluster, so we can do something like $1.75/hr for H100s, but on a short-term contract, and with bursty job allocation
We can set up a separate entity for this, so if we make any big financial mistakes, that entity dies but your startup is fine