There is a lower bar (and it keeps dropping over time), but in my experience the config you are describing is still below it.
Qwen/Gemma in the 27B/35B range @ fp8 are better than gemini-2.5 but worse than gemini-3.1. You can run DS4-flash @ fp8 on two DGX Sparks, and things keep getting better. DiffusionGemma came out recently with 4x token generation speed.
tl;dr - the models you appear to be trying are too small or too heavily quantized
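
For a rough sense of what "27B/35B @ fp8" means in memory terms, here is a back-of-the-envelope sketch. It counts weights only and assumes 2 bytes per parameter for fp16, 1 for fp8, and 0.5 for 4-bit. The `weight_gb` helper and the precision labels are made up for illustration, and real usage will be higher once KV cache, activations, and runtime overhead are added.

```python
# Weight-only estimate. Ignores KV cache, activations, and runtime overhead,
# so treat the numbers as a floor, not a measurement.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}  # assumed storage cost per weight


def weight_gb(params_billions: float, precision: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for a dense model."""
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for size in (27, 35):
        for precision in ("fp16", "fp8", "int4"):
            print(f"{size}B @ {precision}: ~{weight_gb(size, precision):.1f} GB of weights")
```

Going from fp8 to 4-bit halves the footprint, which is why heavily quantized builds are tempting on small hardware, but that is the trade-off the tl;dr is warning about.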