Thursday, January 30, 2025

Re: NEW emulators/llama.cpp b4589

On 2025/01/30 10:03, Chris Cappuccio wrote:
> Stuart Henderson [stu@spacehopper.org] wrote:
> >
> > I'd be happy with misc. If we end up with dozens of related ports then
> > maybe a new category makes sense but misc seems to fit and is not over-full.
>
> Ok, here's a new spin for misc/llama.cpp with your patch applied.
>
> Using this model on an AMD EPYC 7313, I am getting 10 tokens/sec:
>
> llama-cli --model DeepSeek-R1-Distill-Qwen-7B-Q8_0.gguf -c 131072 --threads 16 --temp 0.6
>
> With enough RAM you could run the actual DeepSeek R1. The distilled Qwen 7B is less useful.

ok sthen
