I briefly had a MacBook M3 Max with 64GB. It was pretty good at running local LLMs, but I couldn't stand the ergonomics or not being able to run Linux, so I returned it.
I picked up a ThinkPad P16s with an AMD 7840U to give Linux hardware a chance to catch up with Apple silicon. It's an amazing computer for the price, and it can run LLMs. Here's how I set up llama.cpp to use ROCm.
Install ROCm, then set an environment variable for the 780M: <code>export HSA_OVERRIDE_GFX_VERSION=11.0.0</code>
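From there it's a matter of building llama.cpp with the ROCm backend and running a model with its layers offloaded to the iGPU. Roughly the steps, as a sketch rather than the exact commands from this page (the model file below is a placeholder; <code>LLAMA_HIPBLAS</code> and the <code>main</code> binary are how the llama.cpp tree did this around this time):

<pre>
# Build llama.cpp with the ROCm/hipBLAS backend (assumes ROCm is already installed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_HIPBLAS=1

# The 780M reports as gfx1103; the override makes ROCm treat it as gfx1100
export HSA_OVERRIDE_GFX_VERSION=11.0.0

# -ngl 99 offloads all layers to the GPU; point -m at whatever GGUF model you have
./main -m models/your-model.Q4_K_M.gguf -ngl 99 -p "Hello"
</pre>

The <code>-ngl</code> flag is the important part: it defaults to 0, so without it everything still runs on the CPU.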
llama_print_timings: total time = 102397.06 ms / 455 tokens
Log end
It's definitely not going to win any speed prizes even though it's a smaller model (the run above works out to roughly 4.4 tokens per second overall), but it could be OK for non-time-sensitive results, or where using a tiny, faster model is useful.
{{Blikied|April 13 12, 2024}}