How to Run Phi 4 on Raspberry Pi: Step-by-Step Guide (2026)
Running Phi 4 on Raspberry Pi 5 opens a world of edge AI possibilities. With 8GB of RAM and native ARM support, you can deploy a capable language model on a credit-card-sized device—perfect for robots, IoT systems, offline devices, and distributed edge networks. This guide shows you how to set up Phi 4 on Raspberry Pi and leverage it for real-world applications.
Why Run Phi 4 on Raspberry Pi?
Phi 4 is compact and efficient—ideal for edge deployment. Running it on Raspberry Pi 5 unlocks:
- Offline-first AI: Deploy language model intelligence in remote locations without internet
- Robotics integration: Add natural language understanding to physical robots and embedded systems
- IoT applications: Build smart devices that make local decisions without cloud latency
- Privacy at the edge: Process sensitive data on the device; nothing leaves your network
- Cost-effective: Raspberry Pi 5 starts at ~$60 (~$80 for the 8GB model this guide assumes); far cheaper than ongoing cloud API subscriptions
- Distributed networks: Deploy Phi 4 on hundreds of Pi devices for edge computing clusters
Hardware Requirements
Minimum setup:
- Raspberry Pi 5 (8GB recommended; 4GB may struggle)
- 64GB microSD card or NVMe SSD
- USB-C power supply (27W recommended)
- Cooling case (Phi 4 inference is CPU-intensive; avoid thermal throttling)
Optional but recommended:
- Wired Gigabit Ethernet (built into the Pi 5) for stable networking; faster than WiFi for model downloads
- NVMe SSD via an M.2 HAT (much faster than a microSD card for model storage)
Step 1: Prepare Your Raspberry Pi
Install the latest Raspberry Pi OS (64-bit) using the official imager. Download Raspberry Pi Imager from raspberrypi.com, select OS → Raspberry Pi OS (64-bit), and write to your SD card or SSD.
Boot your Pi and open a terminal. Update the system:
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git python3-pip python3-venv
Verify your setup:
uname -a # Should show ARM64 architecture
Step 2: Increase Swap Space (Critical!)
Phi 4 is a 14-billion-parameter model; even the default quantized build, plus the OS, can exceed the Pi's 8GB of RAM. Increase swap to prevent out-of-memory crashes:
sudo dphys-swapfile swapoff
sudo nano /etc/dphys-swapfile
Find this line:
CONF_SWAPSIZE=100
Change it to:
CONF_SWAPSIZE=4096 # 4GB swap; adjust based on your drive size
Save (Ctrl+O, Enter, Ctrl+X). Then:
sudo dphys-swapfile setup
sudo dphys-swapfile swapon
free -h # Verify swap is active
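If you want to script the swap check above, a small helper can read SwapTotal from /proc/meminfo (a sketch; the 4GB threshold is this guide's recommendation, not a hard Ollama requirement):

```shell
#!/bin/sh
# Read SwapTotal (in kB) from a meminfo-style file; defaults to the
# live /proc/meminfo when no argument is given.
swap_kb() {
  awk '/^SwapTotal/ {print $2}' "${1:-/proc/meminfo}"
}

# Warn if active swap is below the 4GB this guide recommends.
min_kb=$((4 * 1024 * 1024))
current_kb=$(swap_kb)
if [ "${current_kb:-0}" -lt "$min_kb" ]; then
  echo "Swap below 4GB; raise CONF_SWAPSIZE and rerun dphys-swapfile setup"
else
  echo "Swap OK (${current_kb} kB)"
fi
```

Drop this into a provisioning script so misconfigured nodes fail loudly before you pull the model.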
Step 3: Install Ollama on Raspberry Pi
Ollama's install script works on Raspberry Pi ARM64:
curl -fsSL https://ollama.com/install.sh | sh
This installs Ollama and registers it as a systemd service. Verify:
ollama --version
Step 4: Download Phi 4 (Optimized for ARM)
Ollama's phi4 tag is a quantized build that runs on the Pi's ARM64 CPU. Pull it:
ollama pull phi4
This downloads roughly 9GB; expect 10–20 minutes over wired Gigabit Ethernet, longer on slower links. Monitor progress:
df -h / # Check remaining disk space
free -h # Watch memory usage
If downloads are slow over WiFi, switch to the Pi 5's built-in Gigabit Ethernet port.
Step 5: Run Phi 4 on Your Raspberry Pi
The install script usually registers Ollama as a systemd service, so the server may already be running. If not, start it manually:
ollama serve &
This runs in the background. In a new terminal, test Phi 4:
curl -X POST http://localhost:11434/api/generate \
-d '{
"model": "phi4",
"prompt": "Hello, what is your name?",
"stream": false
}' | jq '.response'
Expect a response in 30–120 seconds (slower than modern desktops, but functional for edge tasks). First-token latency is typically 15–30 seconds on Raspberry Pi 5.
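Because first-token latency dominates on the Pi, streaming the response makes the model feel far more responsive: tokens print as they arrive instead of after the full 30–120 seconds. A sketch using the same /api/generate endpoint with stream set to true (Ollama emits one JSON object per line):

```python
#!/usr/bin/env python3
import json

import requests

def join_stream_chunks(lines):
    """Concatenate the 'response' fields from Ollama's JSON-lines stream."""
    return "".join(json.loads(line)["response"] for line in lines if line)

def ask_phi4_streaming(prompt, host="http://localhost:11434"):
    """Print tokens as they arrive instead of waiting for the full reply."""
    with requests.post(f"{host}/api/generate",
                       json={"model": "phi4", "prompt": prompt, "stream": True},
                       stream=True, timeout=600) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines():
            if not raw:
                continue
            chunk = json.loads(raw)
            print(chunk.get("response", ""), end="", flush=True)
            if chunk.get("done"):
                print()
                break

if __name__ == "__main__":
    ask_phi4_streaming("Hello, what is your name?")
```

The generous timeout matters here: slow hardware plus a long prompt can easily exceed library defaults.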
Step 6: Integrate Phi 4 Into Your Raspberry Pi Projects
Python automation script:
#!/usr/bin/env python3
import requests
def ask_phi4(prompt):
response = requests.post('http://localhost:11434/api/generate', json={
'model': 'phi4',
'prompt': prompt,
'stream': False
})
return response.json()['response']
# Example: Device monitoring
status = ask_phi4("Summarize: Device temperature is 45C, RAM usage is 70%, CPU at 60%")
print(f"System status: {status}")
Bash automation for IoT tasks:
#!/bin/bash
sensor_data=$(cat /sys/class/thermal/thermal_zone0/temp)
response=$(curl -s -X POST http://localhost:11434/api/generate \
-d "{\"model\": \"phi4\", \"prompt\": \"Device temperature is $(($sensor_data/1000))C. Should I activate cooling?\", \"stream\": false}" \
| jq -r '.response')
echo "Decision: $response"
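To run the script above unattended, a crontab entry (added with crontab -e) can invoke it on a schedule; the script path and log location here are placeholders for wherever you saved it:

```shell
# m h dom mon dow  command — check temperature every 5 minutes
*/5 * * * * /home/pi/cooling_check.sh >> /home/pi/cooling.log 2>&1
```

Keep the interval generous: each invocation ties up the Pi's CPU for the length of one inference.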
Optimizing Phi 4 for Raspberry Pi Performance
Reduce the context window for faster responses. There is no command-line flag for this; set the num_ctx parameter inside an interactive session:
ollama run phi4
>>> /set parameter num_ctx 512
This limits the model to 512 tokens of context, which can substantially cut prompt-processing time and memory use. Ideal for quick decisions in IoT scenarios.
Limit threads to prevent CPU overload:
Thread count is a per-model parameter (num_thread) rather than an environment variable. Set it the same way:
ollama run phi4
>>> /set parameter num_thread 2
Using two of the Pi 5's four cores helps prevent thermal throttling and leaves headroom for other processes.
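Both tunings (context size and thread count) can also be sent per-request through the API's options field, which is handier for scripts than changing model defaults. A sketch of a payload builder, with values matching the suggestions above:

```python
import json

def build_generate_payload(prompt, model="phi4", num_ctx=512, num_thread=2):
    """Build an Ollama /api/generate body with per-request performance
    options instead of changing the model's defaults."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "num_thread": num_thread},
    }

# POST json=build_generate_payload(...) to http://localhost:11434/api/generate
payload = build_generate_payload("Should I activate cooling?")
print(json.dumps(payload, indent=2))
```

Per-request options only apply to that call, so a background IoT script can run lean while an interactive session keeps the defaults.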
Monitor CPU and memory in real-time:
watch -n 1 'free -h && ps aux | grep ollama'
Real-World Raspberry Pi + Phi 4 Applications
Smart home automation:
Deploy Phi 4 on Raspberry Pi to interpret natural language commands—"Turn on the lights in 5 minutes"—without sending data to cloud services.
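One hedged way to wire this up is to ask Phi 4 for a JSON action and parse whatever comes back; the device/action/delay schema here is invented for illustration, and small models often wrap JSON in prose, so the parser scans for the first object rather than trusting the raw reply:

```python
import json
import re

# Hypothetical prompt template; the keys are this sketch's invention.
COMMAND_PROMPT = (
    'Convert this home command to JSON with keys '
    '"device", "action", "delay_minutes". Reply with JSON only.\n'
    "Command: {command}"
)

def extract_action(model_output):
    """Pull the first JSON object out of a model reply, tolerating
    surrounding prose; returns None if no object is found."""
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    return json.loads(match.group(0)) if match else None
```

Pair this with the ask_phi4 helper from Step 6 and validate the parsed dict before actuating anything physical.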
Robot intelligence:
Equip a Raspberry Pi-powered robot with Phi 4 for local language understanding. The robot understands commands, asks clarifying questions, and makes decisions without internet.
Offline edge analytics:
Process sensor streams and generate insights locally. Example: Analyze weather data and generate a local weather forecast without API calls.
Distributed edge network:
Deploy Phi 4 across 10–100 Raspberry Pis to create a decentralized inference network. Aggregate results locally without central servers.
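A minimal sketch of the client side, assuming each Pi runs Ollama on the default port; the hostnames are placeholders, and requests rotate round-robin with no failover or health checks:

```python
import itertools

import requests

class PiCluster:
    """Round-robin Phi 4 queries across several Raspberry Pi nodes."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self._cycle = itertools.cycle(self.hosts)

    def next_host(self):
        # Rotate through the configured nodes in order.
        return next(self._cycle)

    def ask(self, prompt):
        host = self.next_host()
        resp = requests.post(
            f"http://{host}:11434/api/generate",
            json={"model": "phi4", "prompt": prompt, "stream": False},
            timeout=600,
        )
        resp.raise_for_status()
        return resp.json()["response"]

# cluster = PiCluster(["pi-node-1.local", "pi-node-2.local"])
# print(cluster.ask("Summarize today's sensor readings"))
```

A production version would add retries and skip unreachable nodes, but round-robin alone already spreads load across a small fleet.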
Troubleshooting Raspberry Pi + Phi 4
Issue: "Out of memory" or model fails to load
Solution: Verify swap is active (free -h) and set to 4GB+. Reduce the context window, e.g. /set parameter num_ctx 256 inside an ollama run session.
Issue: Phi 4 responses take 2+ minutes
Solution: This is normal on Raspberry Pi 5. If it's unacceptable, try a smaller model such as Phi 3 Mini (ollama pull phi3). Alternatively, spread requests across multiple Pi 5s.
Issue: Ollama service crashes or doesn't start
Solution: Check logs: journalctl -u ollama -n 50. Restart service: sudo systemctl restart ollama
Issue: WiFi downloads are extremely slow
Solution: Use the Pi 5's built-in Gigabit Ethernet port. Real-world WiFi throughput on the Pi 5 is typically in the 100–200 Mbps range; wired Ethernet reaches 1 Gbps.
Going Further: Edge AI Clusters
Once you master Phi 4 on a single Raspberry Pi, scale to edge clusters:
- Deploy Phi 4 on 10–100 Pi devices across geographic locations
- Use distributed inference to handle high query volumes
- Add a coordinator to distribute load and aggregate results
- Monitor the entire network with Prometheus + Grafana
Conclusion
Phi 4 on Raspberry Pi democratizes edge AI. For under $100 in hardware, you gain a capable language model that operates offline, respects privacy, and scales horizontally across devices. From smart home automation to robot intelligence to distributed sensor networks, Phi 4 on Raspberry Pi unlocks applications previously reserved for expensive cloud infrastructure.
Deploy local AI at the edge. Start with a single Pi today.