This is the story of how I bought enterprise-grade AI hardware designed for liquid-cooled server racks that was converted to air cooling, and then back again, survived multiple near-disasters (including GPUs reporting temperatures of 16 million degrees), and ended up with a desktop that can run 235B parameter models at home. It’s a tale of questionable decisions, creative problem-solving, and what happens when you try to turn datacenter equipment into a daily driver.
amirhirsch 5 hours ago [-]
# Tell the driver to completely ignore the NVLINK and it should allow the GPUs to initialise independently over PCIe !!!! This took a week of work to find, thanks Reddit!
I needed this info, thanks for putting it up. Can this really be an issue for every data center?
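For anyone hitting the same wall, here is a hedged sketch of how that kind of driver override is usually applied on Linux: a modprobe options file. The parameter name below is an assumption on my part, not taken from the article; verify it against your installed driver (`modinfo nvidia | grep -i nvlink`) before relying on it.

```shell
# /etc/modprobe.d/nvidia-disable-nvlink.conf
# ASSUMED parameter name -- check `modinfo nvidia` for the exact spelling
# shipped with your driver version.
options nvidia NVreg_NvLinkDisable=1
```

After adding the file, rebuild the initramfs and reboot so the module loads with the new option (e.g. `sudo update-initramfs -u` on Debian/Ubuntu).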
ipsum2 5 hours ago [-]
I saw the same post on Reddit and was so tempted to purchase it, but I live in the US. Cool to see it wasn't a scam!
pointbob 1 hours ago [-]
Loved it. You are MacGyver. You should post more stuff on Twitter. Thanks for the story.
dauertewigkeit 4 hours ago [-]
It's a very interesting read, but a lot is not clear.
How does the seller get these desktops directly from NVIDIA?
And if the seller's business is custom made desktop boxes, why didn't he just fit the two H100s into a better desktop box?
Ntrails 4 hours ago [-]
> why didn't he just fit the two H100s into a better desktop box?
I expect because they were no longer in the sort of condition to sell as new machines? They were clearly well used and selling "as seen" is the lowest reputational risk associated with offload
ProAm 1 hours ago [-]
Which is how you learn to become an expert. I love it
systemtest 4 hours ago [-]
Love how a €7.5k 20 kilogram server is placed on a €5 particleboard table. I have owned several LACKs but would never put anything valuable on it. IKEA rates them at 25 kilogram maximum load.
Ao7bei3s 3 hours ago [-]
LACK tables specifically are well proven to be quite sturdy actually. They happen to be just the right width for servers / network devices, and so people have used them for that purpose for ages. Search for "LACK rack", or see e.g. https://wiki.eth0.nl/index.php/LackRack. 20kg is nothing; I've personally put >100kg on top.
rtkwe 2 hours ago [-]
They're a bit less usable that way now. The legs are basically completely hollow these days, so you can't actually bear much weight on the screws. The only option is stacking the items so the weight is borne by whatever surface is below the "rack", at which point you could just as easily call stacking the equipment an air rack (or an iLackaRack maybe /s).
ivanjermakov 3 hours ago [-]
Whole 25% safety margin!
rtkwe 2 hours ago [-]
Well, to be fair, their quoted rating has its own built-in margin. So you're already stacking safety margins.
While this is undoubtedly still an excellent deal, the comparison to the new price of an H100 is a bit misleading, since today you can buy a new, legit RTX 6000 Pro for about $7-8k and get similar performance, for at least the first two of the models tested. As a bonus, those can fit in a regular workstation or server, and you can buy multiple. This thing is not worth $80k, in the same way that any old enterprise equipment is not worth nearly as much as its price when new.
skizm 5 hours ago [-]
Serious question: does this thing actually make games run really great? Or are they so optimized for AI/ML workloads that they either don’t work or run normal video games poorly?
Also:
> I arrived at a farmhouse in a small forest…
Were you not worried you were going to get murdered?
zamadatix 2 hours ago [-]
I think the point of negative returns for gaming is going above the RTX PRO 6000 Blackwell + AMD 9800X3D CPU + latency optimized RAM + any decent NVMe drive. Seems to net ~1.1x more performance than a normal 5090 in the same setup (and both can be overclocked about equally). Aside from what the GPU is optimized for, the CPU in these servers being ARM based ends up adding more overhead for games (and breaks DRM) which still assume x86 on Windows/Linux.
jaggirs 5 hours ago [-]
I believe these GPUs don't have direct HDMI/DisplayPort outputs, so at the very least it's tricky to even run a game on them. I guess you need to run the game in a VM or so?
the8472 3 hours ago [-]
Copying between GPUs is a thing, that's how integrated/discrete GPU switching works. So if the drivers provide full vulkan support then rendering on the nvidia and copying to another GPU with outputs could work.
And it's an ARM CPU, so to run most games you need emulation (Wine+FEX), but Valve has been polishing that for their steamframe... so maybe?
People have gotten games to run on a DGX Spark, which is somewhat similar (GB10 instead of GH200)
Havoc 4 hours ago [-]
>Serious question: does this thing actually make games run really great?
LTT tried it in one of their videos...forgot which card but one of the serious nvidia AI cards.
...it runs like shit for gaming workloads. It does the job but comfortably beaten by a mid tier consumer card for 1/10th the price
Their AI-focused datacenter cards are definitely not the same thing with a different badge glued on.
mrandish 4 hours ago [-]
> does this thing actually make games run really great
It's an interesting question, and since OP indicates he previously had a 4090, he's qualified to reply and hopefully will. However, I suspect the GH200 won't turn out to run games much faster than a 5090 because A) Games aren't designed to exploit the increased capabilities of this hardware, and B) The GH200 drivers wouldn't be tuned for game performance. One of the biggest differences of datacenter AI GPUs is the sheer memory size, and there's little reason for a game to assume there's more than 16GB of video memory available.
More broadly, this is a question that, for the past couple decades, I'd have been very interested in. For a lot of years, looking at today's most esoteric, expensive state-of-the-art was the best way to predict what tomorrow's consumer desktop might be capable of. However, these days I'm surprised to find myself no longer fascinated by this. Having been riveted by the constant march of real-time computer graphics from the 90s to 2020 (including attending many Siggraph conferences in the 90s and 00s), I think we're now nearing the end of truly significant progress in consumer gaming graphics.
I do realize that's a controversial statement, and sure, there will always be a way to throw more polys, bigger textures and heavier algorithms at any game, but... each increasing increment just doesn't matter as much as it once did. For typical desktop and couch consumer gaming, the upgrade from 20fps to 60fps was a lot more meaningful to most people than 120fps to 360fps. With synthetic frame and pixel generation, increasing resolution beyond native 4K matters less. (Note: head-mounted AR/VR might be one of the few places 'moar pixels' really matters in the future.) Sure, it can look a bit sharper, a bit more varied, and the shadows can have more perfect ray-traced fall-off, but at this point piling on even more of those technically impressive feats of CGI doesn't make the game more fun to play, whether on a 75" TV at 8 feet or a 34-inch monitor at two feet. As an old-school computer graphics guy, it's incredible to see real-time path tracing adding subtle colors to shadows from light reflections bouncing off colored walls. It's living in the sci-fi future we dreamed of at Siggraph '92. But as a gamer looking for some fun tonight, honestly... the improved visuals don't contribute much to the overall gameplay between a 3070, 4070 and 5070.
Scene_Cast2 4 hours ago [-]
I'd guess that the datacenter "GPUs" lack all the fixed-function graphics hardware (texture samplers, etc) that's still there in modern consumer GPUs.
jsheard 1 hours ago [-]
They do still have texture units since sampling 2D and 3D grids is a useful primitive for all sorts of compute, but some other stuff is stripped back. They don't have raytracing or video encoding units for example.
Beijinger 1 hours ago [-]
I would appreciate it if someone could name some shops where you can buy used enterprise grade equipment.
Most of them are in California? Anything in NY/NJ?
bombcar 11 minutes ago [-]
Look on eBay, find sellers with multiple listings, track them down.
There should be some all over the country.
volf_ 6 hours ago [-]
That's awesome.
These are the best kinds of posts
BizarroLand 6 hours ago [-]
Yep. Just enough to inspire jealousy while also saying it's possible
m4r1k 4 hours ago [-]
Wow! As others have said, deal of the century!! As a side note, a few years back, I used to scrape eBay for Intel QS Xeon and quite a few times managed to snag incredible deals, but this is beyond anything anyone has ever achieved!
Frannky 4 hours ago [-]
Wow! Kudos for thinking it was possible and making it happen. I was wondering how long it would be before big local models were possible under 10k—pretty impressive. Qwen3-235B can do mundane chat, coding, and agentic tasks pretty well.
jauntywundrkind 3 hours ago [-]
I feel like it's going to be a long long time before we get a repeat of something like this. And David did such an incredible job on this. Custom designed frame, designed his own water-block! Wildly great effort here.
Nothing makes you feel more "I've been there" than typing inscrutable arcana to get a GPU working for ML work...
tigranbs 5 hours ago [-]
Ah, that's the best way to spend ~10K
jauntywundrkind 3 hours ago [-]
What an incredible barn-find type story. Incredible. And you are among very few buyers who could have so lovingly done such an incredible job debugging driver & motherboard issues. Please add a kitsch Serial Experiments Lain themed computing shrine around this incredible work, and all's done.
> 4x Arctic Liquid Freezer III 420 (B-Ware) - €180
Quite aside, but man: I fricking love Arctic. Seeing their fans in the new Corsi-Rosenthal boxes has been awesome. Such good value. I've been using a Liquid Freezer II after nearly buying my last air-cooled heatsink & seeing the LF II on sale for <$75. Buy.
Please give us some power consumption figures! I'm so curious how it scales up and down. Do different models take similar or different power? Asking a lot, but it'd be so neat to see a somewhat high res view (>1 sample/s) of power consumption (watts) on these things, such a unique opportunity.
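A quick way to get that high-res view yourself is to poll `nvidia-smi`'s power query at a sub-second interval. A minimal sketch (assuming the installed driver exposes the standard `power.draw` field; the 2 Hz interval is just an illustrative choice):

```python
# Sample GPU power draw at ~2 Hz via nvidia-smi and print a running log.
import subprocess
import time

def parse_power(line: str) -> float:
    """Parse one nvidia-smi CSV value like '312.45' (watts, no units)."""
    return float(line.strip())

def sample_once() -> list[float]:
    """Return current power draw in watts, one entry per GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_power(l) for l in out.splitlines() if l.strip()]

if __name__ == "__main__":
    while True:
        watts = sample_once()
        print(time.strftime("%H:%M:%S"), [f"{w:.1f} W" for w in watts])
        time.sleep(0.5)  # ~2 samples/s
```

Pipe the output to a file while running different models and you get exactly the per-model power trace asked for above.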
Tenemo 10 minutes ago [-]
Huge fan of those AIOs as well! I have LFIII 420mm in my PC and I've successfully built a 10x10cm cloud chamber with another one which is really pushing it as far as it can go.
MLgulabio 5 hours ago [-]
Argh, I was so hoping that this is a 'thing' and I could just do it too.
Let's continue to hope.
20after4 5 hours ago [-]
Deal of the century.
arein3 6 hours ago [-]
It's practically free
ionwake 4 hours ago [-]
inspiring! is there an ip i can connect to test the inference speed?
pointbob 1 hours ago [-]
Can you bitcoin mine?
Philpax 5 hours ago [-]
You lucky dog. Have fun!
ChrisArchitect 6 hours ago [-]
Maybe the title could be I bought an Nvidia server.....
to avoid confusion that it's something to do with Grace Hopper the person, and her servers ...or mainframes?
dnhkng 6 hours ago [-]
Makes sense. I'm so used to the naming I forgot it's not common knowledge. I hope the new title is clearer.
walrus01 6 hours ago [-]
Grace Hopper is the Nvidia product code name for the chip, much like how Intel cpus were named after rivers, etc
We'll see how it goes, but what _is_ happening is RAM replacement. Nvidia 5090s with 96GB are somewhat a thing now. $4K. YMMV, caveat emptor. https://www.alibaba.com/product-detail/Newest-RTX-5090-96gb-...
How long would it take to recoup the cost if you made the model available for others to run inference at the same price as the big players?
Assumptions:
Batch 4x and get 400 tokens per second and push his power consumption to 900W instead of the underutilized 300W.
Electricity around €0.2/kWhr.
Tokens valued at €1/1M out.
Assume ~70% utilization.
Result:
You get ~1M tokens per hour, which is a net profit of ~€0.8/hr, for a payoff time of a bit over a year given the €9K investment.
Honestly, though, there is a lot of handwaving here. The most significant unknown is getting high utilization with aggressive batching and 24/7 load.
Also the demand for privacy can make the utility of the tokens much higher than typical API prices for open source models.
In a sort of orthogonal way renting 2 H100s costs around $6 per hour which makes the payback time a bit over a couple months.
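The back-of-envelope math above can be sketched as a small function (same assumptions as listed in the comment; every input is illustrative, not a measurement):

```python
# Back-of-envelope payback estimate for self-hosted inference.
def payback_days(invest_eur: float, tok_per_s: float, util: float,
                 price_eur_per_mtok: float, watts: float,
                 elec_eur_per_kwh: float) -> float:
    tokens_per_hr = tok_per_s * 3600 * util           # effective throughput
    revenue_hr = tokens_per_hr / 1e6 * price_eur_per_mtok
    power_cost_hr = watts / 1000 * elec_eur_per_kwh   # electricity only
    net_hr = revenue_hr - power_cost_hr
    return invest_eur / net_hr / 24                   # days to break even

# 9K invest, 400 tok/s, 70% util, EUR 1/Mtok, 900 W, EUR 0.20/kWh
days = payback_days(9000, 400, 0.70, 1.0, 900, 0.20)
print(f"{days:.0f} days (~{days / 365:.1f} years)")  # -> roughly 453 days
```

This ignores cooling, hardware depreciation, and the opportunity cost of the money, so treat the "a bit over a year" figure as a lower bound.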
GLM 4.5 Air, to be precise. It's a smaller 106B model, not the full 355B one.
Worth mentioning when discussing token throughput.
> # Data Center/HGX-Series/HGX H100/Linux aarch64/12.8 seem to work! wget https://us.download.nvidia.com/tesla/570.195.03/NVIDIA-Linux...
> ...
https://www.google.com/search?client=firefox-b-m&q=grace%20h...