-
@ Daniel Wigton
2025-05-24 07:45:08
Oh good grief. 😂 Then yeah, it is all explained by your RAM speed; you aren't going to get faster on that machine. You can run a Mixture of Experts model, like Llama 4, to get the number of active parameters down, but it is still going to be slow, and quality will be worse than a dense model like Llama 3.3. Fast memory is everything for AI workloads.
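To see why memory bandwidth dominates and why MoE helps, here is a rough back-of-envelope sketch: each generated token requires streaming every active weight through memory once, so tokens/sec is roughly bandwidth divided by the active-weight footprint. All the numbers below (dual-channel DDR4 bandwidth, parameter counts, 4-bit quantization) are illustrative assumptions, not measurements of any specific machine.

```python
def tokens_per_sec(bandwidth_gb_s: float, active_params_b: float,
                   bytes_per_param: float) -> float:
    """Rough upper bound: one full pass over active weights per token."""
    active_bytes_gb = active_params_b * bytes_per_param  # weight footprint in GB
    return bandwidth_gb_s / active_bytes_gb

# Assumed dual-channel DDR4-3200: ~51.2 GB/s theoretical bandwidth.
BW = 51.2

# Dense 70B-class model at ~4-bit quantization (~0.5 bytes/param):
dense = tokens_per_sec(BW, active_params_b=70, bytes_per_param=0.5)

# MoE with ~17B active parameters (illustrative, Llama-4-Scout-like):
moe = tokens_per_sec(BW, active_params_b=17, bytes_per_param=0.5)

print(f"dense ~{dense:.1f} tok/s, MoE ~{moe:.1f} tok/s")
```

The MoE comes out several times faster at the same bandwidth, but both are bandwidth-bound; a GPU with hundreds of GB/s of memory bandwidth shifts the same formula by an order of magnitude, which is the commenter's point.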