Running AI models locally with Ollama

Adrian
Developer
Posts: 9179
Joined: Wed Jul 12, 2006 1:42 am

Running AI models locally with Ollama

#1 Post by Adrian »

If you want to play with AI tools, you don't need to give up your privacy and use online services: you can run the latest Llama 3 model locally with Ollama.

1. Download and install Ollama from https://ollama.com/download (it will create a roughly half-gigabyte executable at /usr/local/bin/ollama).

2. You might need to start the server with "ollama serve & disown" (the & puts the process in the background; disown makes sure it keeps running after you close that terminal). A quick check that the server is up is sketched below the list.

3. Download the latest Llama 3 model with "ollama run llama3.1" (this downloads about 4.3 GB of model data to /usr/share/ollama/.ollama) and opens a terminal prompt where you can ask questions. It's pretty good for programming/troubleshooting tasks, like "create a bash script that goes through each line of a file, compares it to the entries in another file, and displays only the lines that don't match" and the like. Not only does it create the bash script, it also explains pretty well how it works; a sketch of such a script is shown below. Have fun!
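
To check that the server from step 2 is actually up, you can poke the port the installer reports (a minimal sketch; the root endpoint should reply with a short "Ollama is running" status line):

Code: Select all

# the installer reports the API on 127.0.0.1:11434
curl http://127.0.0.1:11434

And for reference, a hedged sketch of the kind of script the example prompt asks for, assuming two illustrative file names a.txt and b.txt:

Code: Select all

#!/bin/bash
# print each line of a.txt that has no exact match in b.txt
# (equivalent one-liner: grep -Fxvf b.txt a.txt)
while IFS= read -r line; do
    if ! grep -Fxq -- "$line" b.txt; then
        printf '%s\n' "$line"
    fi
done < a.txt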

oops
Posts: 2009
Joined: Tue Apr 10, 2018 5:07 pm

Re: Running AI models locally with Ollama

#2 Post by oops »

Thanks for sharing ... I will try it later (under both systemd and sysvinit init systems).

Code: Select all

$ curl -fsSL https://ollama.com/install.sh | sh
>>> Downloading ollama...
######################################################################## 100,0%
>>> Installing ollama to /usr/local/bin...
[sudo] Mot de passe de tof-mx : 
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> NVIDIA GPU installed.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
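
The log above also touches the systemd/sysvinit point: the installer registers an ollama systemd service, while under sysvinit the server has to be started by hand. A minimal sketch, assuming the stock install and the documented /api/generate endpoint:

Code: Select all

# under systemd, the installer's ollama service manages the server
sudo systemctl status ollama
# under sysvinit there is no unit, so start the server manually (step 2 of the first post)
ollama serve & disown
# either way, the HTTP API should then answer on the default port
curl http://127.0.0.1:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'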
Last edited by oops on Mon Jul 29, 2024 11:40 am, edited 3 times in total.
For new users: Alt+F1 for the manual, or FAQS, MX MANUAL, and MX Tips, Debian Tips - System info “quick-system-info-mx” (QSI) ... Here: System: MX-19-23_x64 & antiX23_x32 runit

Adrian
Developer
Posts: 9179
Joined: Wed Jul 12, 2006 1:42 am

Re: Running AI models locally with Ollama

#3 Post by Adrian »

I added step 2; I think you need to start the "server".

oops
Posts: 2009
Joined: Tue Apr 10, 2018 5:07 pm

Re: Running AI models locally with Ollama

#4 Post by oops »

OK, thanks ... I am at 7% of the 4.7 GB ... I will try it.
For new users: Alt+F1 for the manual, or FAQS, MX MANUAL, and MX Tips, Debian Tips - System info “quick-system-info-mx” (QSI) ... Here: System: MX-19-23_x64 & antiX23_x32 runit

MadMax
Posts: 483
Joined: Wed Aug 04, 2021 3:25 pm

Re: Running AI models locally with Ollama

#5 Post by MadMax »

This sounds interesting. Thanks for the heads-up.

I'm not much of an AI guy, but it might be a cool thing to try on my VM machine and see how far I can go with 10-year-old hardware. It has a dedicated Nvidia GPU, so it might actually run reasonably well :popcorn:
If it ain't broke, don't fix it.
Main: MX 23 | Second: Mint 22 | HTPC: Linux Lite 7 | VM Machine: Debian 12 | Testrig: Arch/FreeBSD 14 | Work: RHEL 8

oops
Posts: 2009
Joined: Tue Apr 10, 2018 5:07 pm

Re: Running AI models locally with Ollama

#6 Post by oops »

Nice, LLaMA speaks French too ;-) ... but while it answers, my CPU is at about 100%.

Code: Select all

# to check:
$ ps aux | grep ollama
user-mx   28244  5.1  4.8 29304248 594388 pts/4 Sl   16:01   1:32 ollama serve
user-mx   29642  0.5  0.2 2247280 25444 pts/5   Sl+  16:26   0:01 ollama run llama3
user-mx   29911  0.0  0.0   6708   892 pts/6    S+   16:31   0:00 grep ollama

Code: Select all

$ ollama run llama3
>>> hello
Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?

>>> Send a message (/? for help)

>>> qui es-tu?
Bonjour!

Je suis LLaMA, un modèle de langage artificiel (AI) développé par Meta AI. Je suis entraîné sur une grande quantité de données pour apprendre à 
comprendre et à générer du texte, comme si j'étais un être humain.

Mon but est de pouvoir répondre à vos questions, discuter avec vous, jouer aux jeux de langage et même créer des contenus amusants. Je suis 
programmé pour être polyglotte, c'est-à-dire que je peux communiquer dans plusieurs langues, y compris le français!

Comment puis-je vous aider aujourd'hui ?


>>> who are you ?
I'm LLaMA, a large language model developed by Meta AI. I was trained on a massive dataset to learn how to understand and generate human-like text.

My purpose is to be able to answer your questions, have conversations with you, play language games, and even create entertaining content. I'm 
designed to be polyglot, meaning I can communicate in multiple languages, including English!

I'm a machine learning model, which means I learn from the data I was trained on and improve over time. My training data includes a vast amount of 
text from various sources, allowing me to understand and respond to a wide range of topics and questions.

So, what would you like to talk about?

>>> Send a message (/? for help)

Code: Select all

>>> quels type d'information peux-tu donner ?
Excellent question !

I can provide information on a wide range of topics, including but not limited to:

1. **General knowledge**: History, science, technology, literature, arts, and more.
2. **Definitions**: I can explain the meaning of words, phrases, and concepts in various fields.
3. **Trivia**: Fun facts, interesting tidbits, and surprising statistics on diverse topics.
4. **Conversational dialogue**: I can engage in natural-sounding conversations, using context to respond to questions and statements.
5. **Language translation**: I can translate text from one language to another (including popular languages like Spanish, French, German, Chinese, 
etc.).
6. **Writing assistance**: I can help with writing tasks, such as suggesting phrases, providing grammar corrections, and offering ideas for creative 
writing.
7. **Jokes and humor**: I have a database of jokes and puns to share, so be prepared for some laughs!
8. **Storytelling**: I can generate short stories, dialogues, or even entire scripts based on prompts or themes.
9. **Personality insights**: I can provide information about personality traits, interests, and characteristics (based on popular frameworks like 
Myers-Briggs).
10. **Education and learning resources**: I can offer study tips, recommend educational content, and help with homework questions.

Remember, my capabilities are constantly evolving as I learn from user interactions and updates to my training data!

What type of information would you like me to provide?

>>> Send a message (/? for help)
Last edited by oops on Mon Jul 29, 2024 11:31 am, edited 1 time in total.
For new users: Alt+F1 for the manual, or FAQS, MX MANUAL, and MX Tips, Debian Tips - System info “quick-system-info-mx” (QSI) ... Here: System: MX-19-23_x64 & antiX23_x32 runit

Wirtualny
Posts: 124
Joined: Sat Jun 26, 2021 9:46 am

Re: Running AI models locally with Ollama

#7 Post by Wirtualny »

I guess that with an i5-3570S CPU and only an integrated GPU, installing Ollama makes no sense. Anyway, it's a very interesting topic. I'm curious about your use cases and how much time it took on particular configurations.
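
(For timing it: ollama run accepts a --verbose flag that prints statistics such as load time and eval rate in tokens per second after each answer, which makes configurations easy to compare; a quick sketch:)

Code: Select all

# print timing/throughput statistics after each response
ollama run llama3.1 --verbose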

oops
Posts: 2009
Joined: Tue Apr 10, 2018 5:07 pm

Re: Running AI models locally with Ollama

#8 Post by oops »

Wirtualny wrote: Mon Jul 29, 2024 11:26 am I guess that with an i5-3570S CPU and only an integrated GPU, installing Ollama makes no sense. Anyway, it's a very interesting topic. I'm curious about your use cases and how much time it took on particular configurations.
Yes, but it is fun.

Code: Select all

$ inxi -Fxxxraz
System:
  Kernel: 6.1.96-amd64-generic-x86-64-mx19-lru-gen-2024.06.28-18.07
    arch: x86_64 bits: 64 compiler: gcc v: 8.3.0 clocksource: tsc
    avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.1.96-amd64-generic-x86-64-mx19-lru-gen-2024.06.28-18.07
    root=UUID=69185e50-6110-4a36-bc51-5b15b551ee37 ro lang=fr_FR kbd=fr,us
    kbvar=oss tz=Europe/Paris audit=0 ipv6.disable=1 libata.noacpi=1
    apparmor=0 selinux=0 resume=UUID=13fd7aeb-aa5b-497b-9e6b-a62f362582d0
    zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=35
    zswap.zpool=z3fold quiet
  Desktop: IceWM v: 3.6.0 tools: xautolock,xfce4-screensaver
    avail: i3lock,slock,xscreensaver vt: 7 dm: LightDM v: 1.26.0
    Distro: MX-19.4_x64 patito feo October 21 2019 base: Debian GNU/Linux 10
    (buster)
Machine:
  Type: Desktop Mobo: MSI model: B75MA-E33 (MS-7808) v: 1.0
    serial: <superuser required> uuid: <superuser required>
    UEFI: American Megatrends v: 1.7 date: 09/30/2013
CPU:
  Info: model: Intel Core i5-2320 bits: 64 type: MCP arch: Sandy Bridge
    gen: core 2 level: v2 built: 2010-12 process: Intel 32nm family: 6
    model-id: 0x2A (42) stepping: 7 microcode: 0x2F
  Topology: cpus: 1x cores: 4 smt: <unsupported> cache: L1: 256 KiB
    desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB L3: 6 MiB
    desc: 1x6 MiB
  Speed (MHz): avg: 1748 high: 2061 min/max: 1600/3300 scaling:
    driver: intel_cpufreq governor: ondemand cores: 1: 1597 2: 2061 3: 1596
    4: 1738 bogomips: 23952
  Flags: avx ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
    disabled
  Type: mds mitigation: Clear CPU buffers; SMT disabled
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data status: Unknown: No mitigations
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; IBRS_FW;
    STIBP: disabled; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not
    affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA GP107 [GeForce GTX 1050 Ti] driver: nvidia v: 390.157
    non-free: 545.xx+ status: current (as of 2024-06; EOL~2026-12-xx)
    arch: Pascal code: GP10x process: TSMC 16nm built: 2016-2021 pcie: gen: 2
    speed: 5 GT/s lanes: 16 bus-ID: 01:00.0 chip-ID: 10de:1c82 class-ID: 0300
  Display: x11 server: X.Org v: 1.20.10 driver: X: loaded: nvidia
    gpu: nvidia display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1280x1024 s-dpi: 85 s-size: 382x302mm (15.04x11.89")
    s-diag: 487mm (19.17")
  Monitor-1: VGA-0 res: 1280x1024 hz: 60 dpi: 86
    size: 376x301mm (14.8x11.85") diag: 482mm (18.96") modes: N/A
  API: EGL v: 1.4 hw: drv: nvidia platforms: device: egl egl: N/A drv: N/A
    gbm: drv: N/A x11: drv: nvidia inactive: wayland
  API: OpenGL v: 4.6.0 vendor: nvidia v: 390.157 glx-v: 1.4
    direct-render: yes renderer: GeForce GTX 1050 Ti/PCIe/SSE2 memory: 3.91 GiB
  API: Vulkan Message: No Vulkan data available.
Audio:
  Device-1: Intel 7 Series/C216 Family High Definition Audio
    vendor: Micro-Star MSI 7 driver: snd_hda_intel v: kernel bus-ID: 00:1b.0
    chip-ID: 8086:1e20 class-ID: 0403
  Device-2: NVIDIA GF116 High Definition Audio driver: snd_hda_intel
    v: kernel pcie: gen: 2 speed: 5 GT/s lanes: 16 bus-ID: 01:00.1
    chip-ID: 10de:0bee class-ID: 0403
  API: ALSA v: k6.1.96-amd64-generic-x86-64-mx19-lru-gen-2024.06.28-18.07
    status: kernel-api with: 1: aoss type: oss-emulator 2: apulse
    type: pulse-emulator tools: alsactl,alsamixer,amixer
  Server-1: PulseAudio v: 13.0 status: off
    tools: pacat,pactl,pamixer,pavucontrol
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Micro-Star MSI driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 port: d000 bus-ID: 03:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: eth0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-1: bridge0 state: down mac: <filter>
  IF-ID-2: bridge1 state: down mac: <filter>
  IF-ID-3: virbr0 state: down mac: <filter>
  IF-ID-4: virbr0-nic state: down mac: <filter>
  Info: services: NetworkManager,ntpd,smbd,sshd
Drives:
  Local Storage: total: 1.82 TiB used: 1.15 TiB (63.4%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/sda maj-min: 8:0 vendor: Toshiba model: HDWD120 size: 1.82 TiB
    block-size: physical: 4096 B logical: 512 B speed: 3.0 Gb/s tech: HDD
    rpm: 7200 serial: <filter> fw-rev: ACF0 scheme: GPT
Partition:
  ID-1: / raw-size: 29.3 GiB size: 28.67 GiB (97.86%) used: 20.68 GiB (72.1%)
    fs: ext4 dev: /dev/sda4 maj-min: 8:4
  ID-2: /home raw-size: 488.28 GiB size: 479.55 GiB (98.21%)
    used: 361.86 GiB (75.5%) fs: ext4 dev: /dev/sda5 maj-min: 8:5
Swap:
  Kernel: swappiness: 5 (default 60) cache-pressure: 50 (default 100)
    zswap: yes compressor: zstd max-pool: 35%
  ID-1: swap-1 type: partition size: 7.81 GiB used: 3.88 GiB (49.6%)
    priority: -2 dev: /dev/sda6 maj-min: 8:6
Sensors:
  System Temperatures: cpu: 44.0 C mobo: N/A gpu: nvidia temp: 49 C
  Fan Speeds (rpm): N/A gpu: nvidia fan: 32%
Repos:
  Packages: pm: dpkg pkgs: 3177 libs: 1505 tools: apt, apt-get, aptitude,
    deb-get, synaptic pm: rpm pkgs: 0
  No active apt repos in: /etc/apt/sources.list
  No active apt repos in: /etc/apt/sources.list.d/antix-kernel-tmp.list
  Active apt repos in: /etc/apt/sources.list.d/debian-stable-updates.list
    1: deb http://deb.debian.org/debian buster-updates main contrib non-free
  Active apt repos in: /etc/apt/sources.list.d/debian.list
    1: deb http://deb.debian.org/debian buster main contrib non-free
    2: deb http://deb.debian.org/debian-security buster/updates main contrib non-free
  Active apt repos in: /etc/apt/sources.list.d/extended-lts.list
    1: deb http://deb.freexian.com/extended-lts buster-lts main contrib non-free
    2: deb http://deb.freexian.com/extended-lts buster main contrib non-free
  Active apt repos in: /etc/apt/sources.list.d/google-earth-pro.list
    1: deb [arch=amd64] http://dl.google.com/linux/earth/deb/ stable main
  Active apt repos in: /etc/apt/sources.list.d/mozilla.list
    1: deb [signed-by=/etc/apt/keyrings/packages.mozilla.org.asc] https://packages.mozilla.org/apt mozilla main
  Active apt repos in: /etc/apt/sources.list.d/mx.list
    1: deb https://mxrepo.com/mx/repo/ buster main non-free
    2: deb https://mxrepo.com/mx/repo/ buster ahs
  No active apt repos in: /etc/apt/sources.list.d/nvidia-drivers.list
  No active apt repos in: /etc/apt/sources.list.d/skype-stable.list
  No active apt repos in: /etc/apt/sources.list.d/thorium.list
  No active apt repos in: /etc/apt/sources.list.d/various.list
Info:
  Memory: total: 12 GiB available: 11.61 GiB used: 4.98 GiB (42.9%)
  Processes: 340 Power: uptime: 7h 5m states: freeze,mem,disk suspend: deep
    avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 4.6 GiB services: upowerd,xfce4-power-manager
    Init: SysVinit v: 2.96 runlevel: 5 default: multi-user tool: systemctl
  Compilers: gcc: 8.3.0 Shell: Bash v: 5.0.3 running-in: roxterm
    inxi: 3.3.35
For new users: Alt+F1 for the manual, or FAQS, MX MANUAL, and MX Tips, Debian Tips - System info “quick-system-info-mx” (QSI) ... Here: System: MX-19-23_x64 & antiX23_x32 runit

Adrian
Developer
Posts: 9179
Joined: Wed Jul 12, 2006 1:42 am

Re: Running AI models locally with Ollama

#9 Post by Adrian »

Seems to be working acceptably fast on my 3-year-old desktop that doesn't even have a discrete GPU.

oops
Posts: 2009
Joined: Tue Apr 10, 2018 5:07 pm

Re: Running AI models locally with Ollama

#10 Post by oops »

Adrian wrote: Mon Jul 29, 2024 12:01 pm Seems to be working acceptably fast on my 3-year-old desktop that doesn't even have a discrete GPU.
Me too. (I just have to wait a bit for the answers, and my CPU is at 100% for a few seconds.)
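
(An aside: recent ollama versions also have an "ollama ps" subcommand whose PROCESSOR column shows whether a loaded model is running on the CPU or on a GPU, which is handy for confirming why the CPU is pegged.)

Code: Select all

# list loaded models and whether they run on CPU or GPU
ollama ps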
For new users: Alt+F1 for the manual, or FAQS, MX MANUAL, and MX Tips, Debian Tips - System info “quick-system-info-mx” (QSI) ... Here: System: MX-19-23_x64 & antiX23_x32 runit
