2024 llama.cpp 7840U
I briefly had a MacBook Pro with an M3 Max and 64GB. It was pretty good at running local LLMs, but I couldn't stand the ergonomics or the inability to run Linux, so I returned it.
I picked up a ThinkPad P16s with an AMD Ryzen 7 7840U to give Linux hardware a chance to catch up with Apple silicon. It's an amazing computer for the price, and it can run LLMs. Here's how I set up llama.cpp to use ROCm.
Install ROCm, then set an environment variable so the runtime treats the 780M (natively gfx1103, which ROCm doesn't officially support) as the supported gfx1100:

export HSA_OVERRIDE_GFX_VERSION=11.0.0
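rocminfo (shipped with ROCm) is a quick sanity check that the runtime sees the iGPU; with the override set, the agent should be reported as gfx1100:

rocminfo | grep -i gfx

Put the export in ~/.bashrc or similar if you want it to survive new shells.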
Clone llama.cpp:
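git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

Then compile it: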
make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gfx1100
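If you'd rather build with CMake, the README of this era spells the same options as -D flags and has you compile with ROCm's bundled clang; roughly the equivalent (the binary then lands in build/bin/):

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -S . -B build -DLLAMA_HIPBLAS=ON -DLLAMA_HIP_UMA=ON -DAMDGPU_TARGETS=gfx1100
cmake --build build -j16

LLAMA_HIP_UMA is the flag that matters on an APU: it allocates the model in unified memory, so the iGPU can use ordinary system RAM rather than only the small VRAM carve-out the BIOS reserves.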
Run it like this:
./main -m /home/vid/jan/models/mistral-ins-7b-q4/mistral-7b-instruct-v0.2.Q4_K_M.gguf -p "example code for a lit Web Component that reverses a string" -e -ngl 16 -n -1
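-ngl controls how many layers get offloaded to the iGPU. To see what offloading actually buys you, llama-bench is built alongside main and prints tokens per second; I believe it accepts a comma-separated list of values to sweep:

./llama-bench -m /home/vid/jan/models/mistral-ins-7b-q4/mistral-7b-instruct-v0.2.Q4_K_M.gguf -ngl 0,16,33

With UMA the weights sit in system RAM either way, so it's worth measuring rather than assuming more offload is always faster.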
"April 13 12, 2024" contains more than three components required for a date interpretation. Blikied on April 13 12, 2024