2024 llama.cpp 7840u

I briefly had a MacBook M3 Max with 64GB. It was pretty good at running local LLMs, but I couldn't stand the ergonomics or being unable to run Linux, so I returned it.

I picked up a ThinkPad P16s with an AMD 7840U to give Linux hardware a chance to catch up with Apple silicon. It's an amazing computer for the price, and it can run LLMs. Here's how I set up llama.cpp to use ROCm.

Install ROCm, then set an environment variable so the 780M iGPU (gfx1103, which ROCm doesn't officially support) is treated as gfx1100:

export HSA_OVERRIDE_GFX_VERSION=11.0.0
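
To confirm ROCm can see the iGPU, rocminfo should list a gfx11xx agent (exact output varies by ROCm version):

rocminfo | grep -i gfx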

clone llama.cpp and compile it:
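
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

then: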

make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gfx1100
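
LLAMA_HIP_UMA=1 builds with HIP unified memory, which matters here: the 780M has no dedicated VRAM, so the model weights sit in ordinary system RAM shared with the CPU.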

run it like this:

./main -m /home/vid/jan/models/mistral-ins-7b-q4/mistral-7b-instruct-v0.2.Q4_K_M.gguf -p "example code for a lit Web Component that reverses a string" -e -ngl 33 -n -1
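
-ngl 33 offloads all 33 of the model's layers to the iGPU (the log below confirms 33/33), and -n -1 generates until the model emits its end-of-text token.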

ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon Graphics, compute capability 11.0, VMM: no
llm_load_tensors: ggml ctx size =    0.22 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors:      ROCm0 buffer size =  4095.05 MiB
llm_load_tensors:        CPU buffer size =    70.31 MiB
..............................................................................................
llama_new_context_with_model: n_ctx      = 512
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: freq_base  = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:      ROCm0 KV buffer size =    64.00 MiB
llama_new_context_with_model: KV self size  =   64.00 MiB, K (f16):   32.00 MiB, V (f16):   32.00 MiB
llama_new_context_with_model:  ROCm_Host  output buffer size =     0.12 MiB
llama_new_context_with_model:      ROCm0 compute buffer size =    81.00 MiB
llama_new_context_with_model:  ROCm_Host compute buffer size =     9.01 MiB
llama_new_context_with_model: graph nodes  = 1030
llama_new_context_with_model: graph splits = 2

system_info: n_threads = 8 / 16 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
sampling:
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 512, n_batch = 2048, n_predict = -1, n_keep = 1


 simple example code for a lit Web Component that reverses a string

```javascript
import { LitElement, html, css } from 'lit';
import { customElement, property } from 'lit/decorators.js';

@customElement('reverse-string')
class ReverseString extends LitElement {

  static styles = css`
    :host {
      display: block;
    }
  `;

  @property({ type: String }) input = '';

  render() {
    return html`
      <input type="text" value=${this.input} @input=${this._handleInput} />
      <button @click=${this._reverse}>Reverse</button>
      <p>${this._reversed}</p>
    `;
  }

  _handleInput(event) {
    this.input = event.target.value;
  }

  _reverse() {
    this._reversed = this.input.split('').reverse().join('');
  }

  @property() _reversed = '';
}
```

```css
:host {
  display: block;
}
```

This is a simple example of a Lit Web Component that reverses a string. The component has an input field for the user to enter a string, and a button to reverse the string when clicked. The reversed string is displayed below the button.

The component uses the Lit library to define the custom element, and uses the `@customElement` decorator to define the element's name as 'reverse-string'. The `@property` decorator is used to define the input property, and the `static styles` property is used to define the component's styles.

In the `render` method, the input field and button are created using template literals, and the reversed string is displayed using a reactive property `_reversed`.

The input field's value is updated in the `_handleInput` method when the user types in the field, and the string is reversed in the `_reverse` method when the button is clicked. The reversed string is then assigned to the `_reversed` property, which updates the displayed string. [end of text]
 
llama_print_timings:        load time =    2488.76 ms
llama_print_timings:      sample time =      24.92 ms /   485 runs   (    0.05 ms per token, 19462.28 tokens per second)
llama_print_timings: prompt eval time =     576.45 ms /    14 tokens (   41.18 ms per token,    24.29 tokens per second)
llama_print_timings:        eval time =   38985.11 ms /   484 runs   (   80.55 ms per token,    12.41 tokens per second)
llama_print_timings:       total time =   39890.17 ms /   498 tokens
Log end


It's definitely not going to win any speed prizes (about 12 tokens per second of generation) even though it's a smaller model, but it could be OK for non-time-sensitive results, or where using a tiny, faster model is useful.

