mirror of
http://git.nowherejezfoltodf4jiyl6r56jnzintap5vyjlia7fkirfsnfizflqd.onion/nihilist/blog-contributions.git
synced 2025-07-02 07:26:41 +00:00
[wip] add info on configuring openwebui and start downloading models section
This commit is contained in:
parent
25ee57dcad
commit
f52a71dbc9
1 changed file with 49 additions and 3 deletions
@@ -133,7 +133,7 @@ There're several such services including <a href="https://ppq.ai">ppq.ai</a>, <a
 <div class="container">
 <div class="row">
 <div class="col-lg-8 col-lg-offset-2">
-<h2><b>Open LLMs Primer</b></h2>
+<h2 id="open-llms-primer"><b>Open LLMs Primer</b></h2>
 <p>
 Another option is to self-host an LLM on your own infrastructure. This effectively prevents your data from being sent to third parties.</p>

@@ -150,7 +150,7 @@ Each open model has specified number of parameters. This can range from 0.5 billion
 <p>
 <b>Quantization (improving memory usage)</b><br>
 Usually the <b>model + context</b> needs to fit into RAM/VRAM memory. Each model parameter can be stored with a certain precision. For example, <b>FP16</b> uses 16 bits (2 bytes) of memory to store a single parameter, while <b>Q4_0</b> uses only 4 bits. This means that an FP16 model will use ~4x the memory compared to Q4_0.
-Of course using Q4_0 will introduce some rounding error in quantization step, but it's usually not a big deal. Look at the graph below to see how different quantization parameters affect model accuracy and memory usage of llama 3.1 8B.
+Of course, Q4_0 introduces some rounding error in the quantization step, but it's usually not a big deal. Look at the graph below to see how different quantization levels affect <a href="https://huggingface.co/ThomasBaruzier/Meta-Llama-3.1-8B-Instruct-GGUF">model accuracy and memory usage of llama 3.1 8B</a>:
 </p>

 <img src="2.png" class="imgRz">

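To make the quantization math above concrete, here is a back-of-the-envelope estimate of the weight memory for an 8B-parameter model (a sketch only: real GGUF files and the KV cache needed for context add some overhead on top of the raw weights):

```shell
#!/bin/sh
# Rough memory estimate for raw weights: parameters x bytes per parameter.
# (Sketch only -- GGUF file overhead and the KV cache are not counted.)
params=8000000000                          # llama 3.1 8B

fp16_gib=$(( params * 2 / 1073741824 ))    # FP16: 2 bytes per parameter
q4_gib=$((  params / 2 / 1073741824 ))     # Q4_0: ~0.5 byte per parameter

echo "FP16: ~${fp16_gib} GiB"              # ~14 GiB
echo "Q4_0: ~${q4_gib} GiB"                # ~3 GiB
```

At FP16, an 8B model already exceeds the VRAM of most consumer GPUs, while Q4_0 fits on an 8 GB card, which is why quantized variants are the usual choice for self-hosting.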
@@ -198,7 +198,7 @@ We'll show how to check prompt length and set appropriate context size in Open W
 <b>Solving Problems</b> - LLMs can be used as personal assistants to answer everyday questions and help with personal issues.<br>
 <b>Programming Aid</b> - Developers use them for code suggestions and debugging without exposing their sensitive codebases.</p>

-<p>It's crucial to stress that AI can hallucinate (make stuff up). Thus it's never to be fully trusted with anything important. <b>You should always check the information in reputable sources in case of any doubts</b>.</p>
+<p>It's crucial to stress that AI can hallucinate (make stuff up), so it should never be fully trusted with anything important. <b>When in doubt, always verify the information against reputable sources</b>.</p>
 </div>
 </div><!-- /row -->
 </div> <!-- /container -->

@@ -337,9 +337,55 @@ cat /var/lib/tor/hidden_service/hostname
 </div><!-- /white -->

+<div id="anon2">
+<div class="container">
+<div class="row">
+<div class="col-lg-8 col-lg-offset-2">
+<h2><b>Initial Open WebUI Configuration</b></h2>
+<p>
+Go to the local IP or onion address of Open WebUI and create an admin account when you're asked to. You don't need to enter any real data, but save the credentials somewhere so that you can log in later.
+</p>
+<img src="10.png" class="imgRz">
+<img src="11.png" class="imgRz">
+<p><br>
+After that, you should be greeted with the Open WebUI main interface and a changelog popup. You can close it.
+</p>
+<img src="12.png" class="imgRz">
+
+<p><br>
+Then, we'll go into the settings page and change the theme to dark mode.
+</p>
+<img src="13.png" class="imgRz">
+<img src="14.png" class="imgRz">
+
+<p><br>
+Go to the <b>Admin settings</b> and proceed with the next steps.
+</p>
+<img src="15.png" class="imgRz">
+</div>
+</div><!-- /row -->
+</div> <!-- /container -->
+</div><!-- /white -->
+
+<div id="anon1">
+<div class="container">
+<div class="row">
+<div class="col-lg-8 col-lg-offset-2">
+<h2><b>Downloading a Model</b></h2>
+<p>
+To see available models, head to the <a href="https://ollama.com/library">ollama library</a>. Sadly, they block Tor traffic, so if you have to use Tor, use their <a href="https://ollama.org.cn/library">Chinese mirror</a>.<br>
+Next, pick a model you want to download. In our case, we want <b>Gemma 3</b>. Then click on <b>Tags</b> to see all available variants.
+</p>
+<img src="30.png">
+<img src="31.png" class="imgRz">
+</div>
+</div><!-- /row -->
+</div> <!-- /container -->
+</div><!-- /white -->
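Once you've picked a variant on the Tags page, it can also be pulled from the command line on the host running ollama. The `gemma3:4b` tag below is just an example; substitute whichever variant fits your RAM/VRAM:

```shell
#!/bin/sh
# Example tag only -- use the variant you picked on the Tags page.
model="gemma3:4b"

ollama pull "$model"   # download the model into ollama's local store
ollama list            # confirm the model is available afterwards
```

Open WebUI can typically also trigger the same download from its admin settings, but the CLI is handy for scripting and for watching download progress.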
+
+<div id="anon2">
+<div class="container">
+<div class="row">
+<div class="col-lg-8 col-lg-offset-2">
+<h2><b>Troubleshooting</b></h2>
+<p>If you encounter issues with hardware acceleration on ollama, check:</p>
+<ul>