| {{#setmainimage:https://wiki.zooid.org/images/c/c5/Ipectinstat1-sq-trans2.png|thumb|right}}
|
| I briefly had a MacBook M3 Max with 64GB. It was pretty good at running local LLMs, but I couldn't stand the ergonomics or not being able to run Linux, so I returned it.
|
| I picked up a ThinkPad P16s with an AMD 7840U to give Linux hardware a chance to catch up with Apple silicon. It's an amazing computer for the price, and it can run LLMs. Here's how I set up llama.cpp to use ROCm.
|
| Install ROCm, then set an environment variable for the Radeon 780M so that ROCm treats it as a gfx1100 device: <code>export HSA_OVERRIDE_GFX_VERSION=11.0.0</code>
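The override value can be read as a <code>major.minor.stepping</code> triple naming the GPU target that ROCm will report. As a rough sketch only: the helper name <code>gfx_name</code> is made up for illustration, and the naming rule (decimal major, then hex minor and hex stepping) is an assumption based on target names such as gfx1030 and gfx90a.

```shell
# Hypothetical helper: turn an HSA_OVERRIDE_GFX_VERSION value into a gfx target name.
# Assumption: the name is "gfx" + decimal major + hex minor + hex stepping.
gfx_name() {
  # Split "major.minor.stepping" on the dots.
  IFS=. read -r major minor step <<EOF
$1
EOF
  printf 'gfx%d%x%x\n' "$major" "$minor" "$step"
}

gfx_name 11.0.0   # the value exported above
gfx_name 10.3.0
```

With <code>11.0.0</code> this prints <code>gfx1100</code>, and with <code>10.3.0</code> it prints <code>gfx1030</code>.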
|
| Clone llama.cpp and compile it:
|
| <code>make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gfx1030</code>
| |
|
| |
| Run it like this:
|
| <code>./main -m /home/vid/jan/models/mistral-ins-7b-q4/mistral-7b-instruct-v0.2.Q4_K_M.gguf -p "example code for a lit Web Component that reverses a string" -n 50 -e -ngl 33 -n -1</code>
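The <code>-ngl 33</code> flag asks llama.cpp to put 33 layers on the GPU, and the startup log reports how many were actually offloaded. Here is a minimal sketch for checking that from a saved log. The function name <code>offload_ok</code> is a placeholder, and it assumes the log contains a line of the form <code>offloaded N/M layers to GPU</code>, as in the output further down.

```shell
# Succeeds only if the log on stdin has an "offloaded N/M layers to GPU" line with N == M.
offload_ok() {
  awk '/offloaded [0-9]+\/[0-9]+ layers to GPU/ {
         for (i = 1; i < NF; i++)
           if ($i == "offloaded") { split($(i + 1), n, "/"); found = 1; ok = (n[1] == n[2]) }
       }
       END { exit !(found && ok) }'
}

# Example with a line copied from the run in this post:
echo 'llm_load_tensors: offloaded 33/33 layers to GPU' | offload_ok && echo "all layers on GPU"
```

Against a real run, saving the program's stderr to a file and feeding that file to <code>offload_ok</code> should work, assuming the load messages are written to stderr.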
| |
| | |
| ggml_cuda_init: found 1 ROCm devices:
| |
| Device 0: AMD Radeon Graphics, compute capability 11.0, VMM: no
| |
| llm_load_tensors: ggml ctx size = 0.22 MiB
| |
| llm_load_tensors: offloading 32 repeating layers to GPU
| |
| llm_load_tensors: offloading non-repeating layers to GPU
| |
| llm_load_tensors: offloaded 33/33 layers to GPU
| |
| llm_load_tensors: ROCm0 buffer size = 4095.05 MiB
| |
| llm_load_tensors: CPU buffer size = 70.31 MiB
| |
| ..............................................................................................
| |
| llama_new_context_with_model: n_ctx = 512
| |
| llama_new_context_with_model: n_batch = 512
| |
| llama_new_context_with_model: n_ubatch = 512
| |
| llama_new_context_with_model: freq_base = 1000000.0
| |
| llama_new_context_with_model: freq_scale = 1
| |
| llama_kv_cache_init: ROCm0 KV buffer size = 64.00 MiB
| |
| llama_new_context_with_model: KV self size = 64.00 MiB, K (f16): 32.00 MiB, V (f16): 32.00 MiB
| |
| llama_new_context_with_model: ROCm_Host output buffer size = 0.12 MiB
| |
| llama_new_context_with_model: ROCm0 compute buffer size = 81.00 MiB
| |
| llama_new_context_with_model: ROCm_Host compute buffer size = 9.01 MiB
| |
| llama_new_context_with_model: graph nodes = 1030
| |
| llama_new_context_with_model: graph splits = 2
| |
|
| |
| system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
| |
| sampling:
| |
| repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
| |
| top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
| |
| mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
| |
| sampling order:
| |
| CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
| |
| generate: n_ctx = 512, n_batch = 2048, n_predict = -1, n_keep = 1
| |
|
| |
|
| |
| simple example code for a lit Web Component that reverses a string
| |
|
| |
| ```javascript
| |
| import { LitElement, html } from 'lit';
| |
| import { customElement, property } from 'lit/decorators.js';
| |
|
| |
| @customElement('reverse-string')
| |
| class ReverseString extends LitElement {
| |
|
| |
| static styles = css`
| |
| :host {
| |
| display: block;
| |
| }
| |
| `;
| |
|
| |
| @property({ type: String }) input = '';
| |
|
| |
| render() {
| |
| return html`
| |
| <input type="text" value=${this.input} @input=${this._handleInput} />
| |
| <button @click=${this._reverse}>Reverse</button>
| |
| < p>${this._reversed}< /p>
| |
| `;
| |
| }
| |
|
| |
| _handleInput(event) {
| |
| this.input = event.target.value;
| |
| }
| |
|
| |
| _reverse() {
| |
| this._reversed = this.input.split('').reverse().join('');
| |
| }
| |
|
| |
| @property private _reversed = '';
| |
| }
| |
| ```
| |
|
| |
| ```css
| |
| :host {
| |
| display: block;
| |
| }
| |
| ```
| |
|
| |
| This is a simple example of a Lit Web Component that reverses a string. The component has an input field for the user to enter a string, and a button to reverse the string when clicked. The reversed string is displayed below the button.
| |
|
| |
| The component uses the Lit library to define the custom element, and uses the `@customElement` decorator to define the element's name as 'reverse-string'. The `@property` decorator is used to define the input property, and the `static styles` property is used to define the component's styles.
| |
|
| |
| In the `render` method, the input field and button are created using template literals, and the reversed string is displayed using a reactive property `_reversed`.
| |
|
| |
| The input field's value is updated in the `_handleInput` method when the user types in the field, and the string is reversed in the `_reverse` method when the button is clicked. The reversed string is then assigned to the `_reversed` property, which updates the displayed string. [end of text]
| |
|
| |
| llama_print_timings: load time = 2488.76 ms
| |
| llama_print_timings: sample time = 24.92 ms / 485 runs ( 0.05 ms per token, 19462.28 tokens per second)
| |
| llama_print_timings: prompt eval time = 576.45 ms / 14 tokens ( 41.18 ms per token, 24.29 tokens per second)
| |
| llama_print_timings: eval time = 38985.11 ms / 484 runs ( 80.55 ms per token, 12.41 tokens per second)
| |
| llama_print_timings: total time = 39890.17 ms / 498 tokens
| |
| Log end
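The number that matters for day-to-day use is the generation speed on the <code>eval time</code> line. As a small sketch (the function name <code>eval_tps</code> is made up, and the field positions assume the <code>llama_print_timings</code> layout shown above), it can be recomputed from the raw milliseconds and run count:

```shell
# Print generation speed (tokens per second) from llama_print_timings output on stdin.
# Matches only the "eval time" line, not "prompt eval time".
eval_tps() {
  awk '$1 == "llama_print_timings:" && $2 == "eval" && $3 == "time" {
         printf "%.2f\n", $8 / ($5 / 1000)   # runs / seconds
       }'
}

eval_tps <<'EOF'
llama_print_timings: prompt eval time = 576.45 ms / 14 tokens ( 41.18 ms per token, 24.29 tokens per second)
llama_print_timings: eval time = 38985.11 ms / 484 runs ( 80.55 ms per token, 12.41 tokens per second)
EOF
```

For the two lines from this run it prints <code>12.41</code>, matching the figure llama.cpp reports.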
| |
| | |
| | |
| At about 12 tokens per second it's definitely not going to win any speed prizes, even though this is a smaller model, but it could be OK for non-time-sensitive results, or where using a tiny, faster model is useful.
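To get a feel for what "non-time-sensitive" means here, a back-of-the-envelope estimate is enough. The helper <code>gen_seconds</code> below is purely illustrative, and it assumes the generation rate stays constant over the whole answer, which is a simplification.

```shell
# Rough wall-clock estimate: seconds to generate N tokens at a given tokens-per-second rate.
gen_seconds() {
  awk -v n="$1" -v tps="$2" 'BEGIN { printf "%.1f\n", n / tps }'
}

gen_seconds 1000 12.41   # a 1000-token answer at the rate measured above
```

At the measured rate, a 1000-token answer works out to roughly 80 seconds.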
| |
| | |
| {{Blikied|April 13, 2024}}
| |