2024 llama.cpp 7840u

I briefly had a MacBook M3 Max with 64GB. It was pretty good at running local LLMs, but I couldn't stand the ergonomics or the inability to run Linux, so I returned it.


I picked up a Thinkpad P16s with an AMD 7840u to give Linux hardware a chance to catch up with Apple silicon. It's an amazing computer for the price, and can run LLMs. Here's how I set up llama.cpp to use ROCm.


Install ROCm, then set an environment variable for the 780M: <code>export HSA_OVERRIDE_GFX_VERSION=11.0.0</code>
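
For reference, the build and a first test run looked roughly like the following. Treat it as a sketch: the hipBLAS flag and binary name vary between llama.cpp versions, and the model path and <code>-ngl</code> value here are placeholders.

 # Build llama.cpp with ROCm/hipBLAS support (flag name depends on the llama.cpp version)
 git clone https://github.com/ggerganov/llama.cpp
 cd llama.cpp
 make LLAMA_HIPBLAS=1
 # Make the 780M (gfx1103) report itself as gfx1100 so the ROCm kernels load
 export HSA_OVERRIDE_GFX_VERSION=11.0.0
 # Offload layers to the iGPU; model path and layer count are placeholders
 ./main -m models/your-model.gguf -ngl 33 -p "Hello"

If the override is picked up, llama.cpp should list a ROCm device in its startup output instead of falling back to the CPU.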
  llama_print_timings:      total time =  102397.06 ms /  455 tokens
  Log end
It's definitely not going to win any speed prizes even though it's a smaller model, but it could be OK for non-time-sensitive work, or where using a tiny, faster model is useful.
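For a rough sense of scale, the timing line above works out to about 4.4 tokens per second end to end (455 tokens in roughly 102 seconds), prompt processing included.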


{{Blikied|April 13, 2024}}
