
I'm starting to learn PyTorch and would eventually like to accelerate PyTorch tensor computations on a GPU. Does anyone have experience with PyTorch on an Intel Mac equipped with a Radeon dGPU? If so ...

 

  • How would you rate a Radeon-equipped Intel Mac for learning to program with PyTorch?
  • How would you rate a Radeon-equipped Intel Mac for running accelerated PyTorch deep learning applications?
  • Should I avoid Mac/Radeon and skip straight to Windows/Linux with an Nvidia GPU (which I don't own)?

 

Any experienced, candid feedback is welcome.  Thank you.

Edited by deeveedee
  • Like 1

@deeveedee As far as I'm aware, support for PyTorch on an Intel-based Mac ended in January 2024:

https://github.com/pytorch/pytorch/issues/114602

 

The last working version of PyTorch for an Intel-based Mac was 2.0.2, and that's currently working on my 4700G build. I did once have PyTorch working with a 5700X and 6700XT, and it was blindingly fast in Stable Diffusion, but after a fresh install of macOS and PyTorch it refused to work; at the time I didn't know support had ended, so I had installed a newer version. Apparently, following a fresh install, it's recommended to install a Python version no newer than 3.10.5 to make sure the correct dependencies are installed for PyTorch, but I haven't managed to get it working again on the 5700X/6700XT build.
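
If it helps anyone check an existing install against those version constraints, a minimal sketch along these lines (assuming a PyTorch build recent enough to expose the MPS backend flags, roughly 1.13 or newer) will report what's actually in the environment:

```python
# Quick environment check (assumes PyTorch is already installed).
# Prints the Python and PyTorch versions and whether the MPS backend
# (Metal, used for AMD GPUs on Intel Macs) is built in and usable.
import sys
import torch

print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("MPS built    :", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())
```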

Edited by Craig Hazan
  • Like 2
  • Thanks 1
36 minutes ago, deeveedee said:

Have you moved to Windows/Linux with NVidia?

No! I'm currently able to use PyTorch and Stable Diffusion in macOS Sequoia 15.4 running off a Ryzen 7 4700G with 16 GB VRAM, and the Radeon Vega 8 graphics handle it pretty well, so it's not impossible. I have done a lot of reading about the issue, though, and even DeepSeek offers a clue into how they achieved their success: it simply boils down to which versions of PyTorch and Python you use. They suggest Python 3.8.6 and PyTorch 2.2.0, the last known x86 version. I've been lucky in that I've only updated macOS, not Python or PyTorch, for the past three to four years, so Stable Diffusion has always worked. The only difference I can tell from others' failed installations of PyTorch is that they used pip; I used Anaconda, which allowed me to install a specific version of PyTorch, whereas pip will install the latest version whether you want it or not!
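
For what it's worth, a rough device-selection sketch (not copied from my actual setup, and assuming a PyTorch build that exposes `torch.mps`) looks like this: pick MPS when it's available, otherwise drop back to the CPU, and time a small matmul to confirm the chosen device is actually doing the work.

```python
# Device-selection sketch: use the Metal (MPS) backend when available,
# otherwise fall back to the CPU, then time a small matmul on it.
import time
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"
print("Using device:", device)

a = torch.randn(2048, 2048, device=device)
b = torch.randn(2048, 2048, device=device)

start = time.perf_counter()
c = a @ b
if device == "mps":
    torch.mps.synchronize()  # wait for the GPU to finish before stopping the clock
print(f"matmul on {device}: {time.perf_counter() - start:.3f}s, shape {tuple(c.shape)}")
```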

[Attached screenshot: Screenshot 2025-04-07 at 8.01.54 PM.png]

Edited by Craig Hazan
  • Like 1

This is what DeepSeek suggests...

 

Yes! If you're running **DeepSeek on an Intel-based Mac without a GPU (CPU-only)**, here are the **optimal versions** of Python and PyTorch for best compatibility and performance:

---

### **Recommended Setup for Intel Mac (CPU-Only)**
| Component       | Recommended Version | Notes |
|----------------|---------------------|-------|
| **Python**     | **3.10 or 3.11**    | Avoid Python 3.12 (some ML libs may lack support). |
| **PyTorch**    | **2.0+ (CPU-only)** | Use the official CPU build for Intel Macs. |
| **Transformers** | `>=4.37.0`         | Required for Hugging Face model loading. |
| **RAM**        | **≥16GB** (for 7B models) | Larger models (e.g., 67B) may struggle. |

---

### **Installation Steps**
#### **1. Create a clean Python environment (recommended)**
```bash
conda create -n deepseek python=3.10 -y && conda activate deepseek
```
*(or use `python -m venv deepseek && source deepseek/bin/activate`)*

#### **2. Install PyTorch for CPU-only**
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```
*(This ensures no accidental GPU/CUDA dependencies.)*

#### **3. Install Hugging Face libraries**
```bash
pip install transformers>=4.37.0 accelerate sentencepiece
```
*(Optional: Add `bitsandbytes` if quantizing models for memory efficiency.)*

---

### **Run DeepSeek on Intel Mac (CPU Example)**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-llm-7b"  # Smaller model for CPU
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cpu")  # Force CPU

input_text = "Explain AI in simple terms."
inputs = tokenizer(input_text, return_tensors="pt").to("cpu")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

### **Key Optimizations for CPU-Only**
1. **Use smaller models** (e.g., `deepseek-coder-1.3b` instead of `67b`).  
2. **Enable quantization** (4-bit/8-bit) to reduce RAM usage:
   ```python
   model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
   ```
   *(Requires `pip install bitsandbytes`)*  
3. **Limit token length** (`max_new_tokens=50-100`) to avoid slowdowns.  

---

### **Expected Performance**
- **7B model**: ~1-3 tokens/sec (depending on RAM).  
- **1.3B model**: ~5-10 tokens/sec (more usable).  
- **67B model**: Likely **too slow** without GPU (avoid unless you have 64GB+ RAM).  

---

### **Troubleshooting**
- **Out of Memory?** Try a smaller model or quantization.  
- **Slow?** Disable `device_map` and use plain `.to("cpu")`.  
- **Python 3.12 issues?** Downgrade to 3.11.  

Would you like a step-by-step guide for a specific DeepSeek model? 😊

  • Like 2
  • Thanks 1
  • 1 month later...

I am now spending most of my computer time in Ubuntu 24.04 after installing it on an old PC with the following specs:

  • Motherboard: ASUS Z170M-D3 (a socket 1151 board that still supports DDR3 and accepts Intel 6th and 7th Gen CPUs)
  • CPU: Intel i7-7700
  • RAM: 32GB DDR3-1600
  • GPU: NVidia RTX 3060 12GB

I have installed CUDA-toolkit 12.9 and Anaconda with Python virtual environments for experimentation with PyTorch.

 

I am impressed with how easy it was to install Ubuntu, NVidia Drivers and CUDA-toolkit.  Even though this system is not the latest and greatest, its performance is fantastic.
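
As a quick sanity check after the driver and toolkit install, a short sketch like the one below (nothing specific to this machine beyond assuming the RTX 3060 is GPU 0) confirms that PyTorch can actually see and use the card:

```python
# Post-install sanity check: confirm PyTorch sees the GPU and can run on it.
import torch

print("PyTorch    :", torch.__version__)
print("CUDA build :", torch.version.cuda)
print("CUDA usable:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("GPU        :", torch.cuda.get_device_name(0))
    x = torch.randn(4096, 4096, device="cuda")
    y = x @ x
    torch.cuda.synchronize()  # make sure the kernel actually completed
    print("matmul OK, result shape:", tuple(y.shape))
```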

Edited by deeveedee
  • Like 2