Cyberdevs Posted Wednesday at 10:11 PM Share Posted Wednesday at 10:11 PM 5 minutes ago, engeldlgado said: Thanks for the feedback. Sure thing man. 6 minutes ago, engeldlgado said: Was the AI in the middle of generating a response when this error occurred? No I just wanted to test and see how the app handles file attachment and attached several files and the error occurred, but it was able to analyze a single somewhat short text file without any errors. I have to say that the files I've attached were pretty large files so I guess that's what cause the error. 8 minutes ago, engeldlgado said: I will note it down, but keep in mind that hardware combination might simply hit its limits when benchmarking a 4B model like Qwen3. Thanks, yeah I didn't expect much from that rig but since you've asked for a benchmark on Polaris/Vega GPUs though I share my experience. 10 minutes ago, engeldlgado said: Also you tested the experimental engine? maybe work better because it has a custom kernel for AMD. I will give it a try later and keep you posted. 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851180 Share on other sites More sharing options...
XanthraX Posted Thursday at 03:44 AM Share Posted Thursday at 03:44 AM 15 hours ago, engeldlgado said: Try to use the smallest one first to test, btw, what kind of system spec you have? Qwen 4B I have both systems in my signature. Both based on CoffeeLake CPU's. RX 560 and 580. 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851187 Share on other sites More sharing options...
engeldlgado Posted Thursday at 03:55 AM Author Share Posted Thursday at 03:55 AM (edited) 17 minutes ago, XanthraX said: I have both systems in my signature. Both based on CoffeeLake CPU's. RX 560 and 580. Sorry, i didnt notice because i was on the phone when i replied to you.. Your RX-580 is GCN/Polaris, not RDNA+... My AMD decode kernel is only instantiated for RDNA+ (RX-5000/6000 series) maybe others but needs further testing, so ToshLLM won't work atm on your RX-580 and 560 But I'm going to study integrating a GCN/Polaris-compatible patch. I'll need to rewrite the kernel to use 64-lane SIMD groups instead of RDNA's 32-lane simdgroups, which is more complex, but I'm interested in exploring it. I'll also study llama-metal old repo that i saw searching for this issue... to see if I can port it, to my patch to it and optimize it better for GCN GPUs. I'll update if I make progress on GCN support... would you be willing to test it when I get a working solution? Edited Thursday at 04:03 AM by engeldlgado 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851188 Share on other sites More sharing options...
XanthraX Posted Thursday at 06:43 AM Share Posted Thursday at 06:43 AM Don't worry, I figured out something like this. I will try if you succeed with that update. 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851196 Share on other sites More sharing options...
engeldlgado Posted Thursday at 02:15 PM Author Share Posted Thursday at 02:15 PM (edited) Hi @Cyberdevs Quick update, that i've work today: Update (v0.81.25): you can now attach files in chat — including PDFs (text is extracted automatically, and scanned PDFs are read with on-device OCR), plus more text formats. And image input for vision models is in: drop in a vision model with its mmproj (e.g. gemma-3-4b) and you can attach an image and ask about it. Vision is experimental and the image encoder runs partly on CPU on AMD GPUs (some Metal ops aren't supported), so it works but isn't fully GPU-accelerated yet. DMG is building now. Also i've add a option to change the default location for models Also may ask you for a new test on the RX Card... Update the app, and just load a model and start the server, no benchmark, anyting, just send me the logs, im researching about the VEGA/GCN Cards... Edited Thursday at 02:46 PM by engeldlgado 2 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851202 Share on other sites More sharing options...
Alpha22 Posted Thursday at 03:08 PM Share Posted Thursday at 03:08 PM 23 hours ago, engeldlgado said: Can you test the experimental engine too in settings? it has a improvements about speed, etc Settings? 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851205 Share on other sites More sharing options...
engeldlgado Posted Thursday at 04:19 PM Author Share Posted Thursday at 04:19 PM 1 hour ago, Alpha22 said: Settings? Yeah in settings... theres is an option to change the Inference Engine (llama.ccp) bundle, the experimental one, has better improvements against the normal one Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851206 Share on other sites More sharing options...
Cyberdevs Posted Thursday at 04:23 PM Share Posted Thursday at 04:23 PM @engeldlgado I will give it a try when I can and I'll post the log here. Thanks for the updates and your efforts 👍 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851207 Share on other sites More sharing options...
Alpha22 Posted Thursday at 05:49 PM Share Posted Thursday at 05:49 PM 1 hour ago, engeldlgado said: Yeah in settings... theres is an option to change the Inference Engine (llama.ccp) bundle, the experimental one, has better improvements against the normal one Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851211 Share on other sites More sharing options...
engeldlgado Posted Thursday at 06:19 PM Author Share Posted Thursday at 06:19 PM 29 minutes ago, Alpha22 said: Excellent, it improves the performance on your end... i will take notes about your build, thanks for testing! Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851213 Share on other sites More sharing options...
Cyberdevs Posted Thursday at 06:51 PM Share Posted Thursday at 06:51 PM On my AMD RX6800XT Two top benchmarks are after enabling these settings in version Version 0.81.26 (0.81.26): 2 hours ago, engeldlgado said: Yeah in settings... theres is an option to change the Inference Engine (llama.ccp) bundle, the experimental one, has better improvements against the normal one I'll test my RX580 later and post the results. 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851215 Share on other sites More sharing options...
engeldlgado Posted 14 hours ago Author Share Posted 14 hours ago (edited) @Cyberdevs That's a really great performance. I've been updating the app with new improvements. I'm more active on Reddit, but I'm still working on the issues reported here. I have a list of bugs to solve and things to improve, but I'm still checking this forum for new reports. Edited 14 hours ago by engeldlgado 2 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851253 Share on other sites More sharing options...
jsl Posted 9 hours ago Share Posted 9 hours ago (edited) 5 hours ago, engeldlgado said: @Cyberdevs That's a really great performance. I've been updating the app with new improvements. I'm more active on Reddit, but I'm still working on the issues reported here. I have a list of bugs to solve and things to improve, but I'm still checking this forum for new reports. These new versions v.81.29 & v.81.30 working slowly in my hackintoshs X299 with RX-580 and Z690 with RX-570. Even H97M-E with RX-560 also working slowly by Benchmarks. I'll test it again with RX-6600XT and hope it could be much better ! Edited 9 hours ago by jsl 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851257 Share on other sites More sharing options...
mitch_de Posted 7 hours ago Share Posted 7 hours ago will be much more better than with RX 5x0! My RX 560 only gets 1,3 /52 so even less than RX 580. RX 5600XT multi times better. only minimal diff between app versions! Normal für that GPU Type - much too old for modern GPU compute or AI. 1 Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851258 Share on other sites More sharing options...
engeldlgado Posted 2 hours ago Author Share Posted 2 hours ago (edited) 8 hours ago, jsl said: These new versions v.81.29 & v.81.30 working slowly in my hackintoshs X299 with RX-580 and Z690 with RX-570. Even H97M-E with RX-560 also working slowly by Benchmarks. I'll test it again with RX-6600XT and hope it could be much better ! Thanks for the update! It was to be expected that it would be slow. I just wanted to verify if it was possible to get coherent text instead of garbage output. Now that it's confirmed, I can look deeper into the fix. Keep an eye out for updates! Follow the issue for RX-500 cards here... https://github.com/engeldlgado/toshllm/issues/1 the RX 6600 will be much faster trust me! Edited 1 hour ago by engeldlgado Quote Link to comment https://www.insanelymac.com/forum/topic/362881-app-toshllm-%E2%80%94-local-llms-on-intel-amd-gpu-metal-amd%E2%80%91patched-llamacpp-open-source/page/2/#findComment-2851268 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.