
Mobile Proxy for Scrapy

Scrapy's built-in HttpProxyMiddleware reads request.meta['proxy'] and sends the request through that proxy. This guide adds two custom downloader middlewares: one injects the mobile proxy URL into every request, and the other calls our switch API when responses look like a ban (403, 429, or 503).

8 min read · Python Scraping · Last updated: May 2026

Prerequisites

  • Python 3.10+ with scrapy, requests, and python-dotenv installed.
  • Mobile proxy slot and API key from mobileproxies.org.
  • An existing Scrapy project, or scaffold one with scrapy startproject myscraper.

Step-by-Step Configuration

STEP 01

Move credentials to environment variables

# .env (loaded with python-dotenv in settings.py)
MP_HOST=proxy.mobileproxies.org
MP_PORT=8000
MP_USER=u_4a9c
MP_PASS=p_2X7q...
MP_API_KEY=YOUR_API_KEY
MP_SLOT=us-mob-01

STEP 02

settings.py

# settings.py
import os
from dotenv import load_dotenv
load_dotenv()

BOT_NAME = "myscraper"

# Be polite — mobile slot is a single carrier IP
CONCURRENT_REQUESTS = 4
CONCURRENT_REQUESTS_PER_DOMAIN = 2
DOWNLOAD_DELAY = 1.5
RANDOMIZE_DOWNLOAD_DELAY = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [429, 500, 502, 503, 504, 522, 524, 408]

# Cookies persist across requests through the same proxy
COOKIES_ENABLED = True
USER_AGENT = ("Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) "
              "AppleWebKit/605.1.15 Mobile/15E148 Safari/604.1")

DOWNLOADER_MIDDLEWARES = {
    # Inject the proxy URL into every request
    "myscraper.middlewares.MobileProxyMiddleware": 350,
    # Stock HttpProxyMiddleware reads request.meta["proxy"]
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 400,
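    # React to bans and trigger rotation. RetryMiddleware (550) handles
    # responses first, so 429/503 arrive here only after its retries run out.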
    "myscraper.middlewares.RotateOnBanMiddleware": 410,
}

MP = {
    "host":  os.environ["MP_HOST"],
    "port":  int(os.environ["MP_PORT"]),
    "user":  os.environ["MP_USER"],
    "pass":  os.environ["MP_PASS"],
    "api_key": os.environ["MP_API_KEY"],
    "slot":  os.environ["MP_SLOT"],
}

STEP 03

middlewares.py — proxy injection

# myscraper/middlewares.py
from urllib.parse import quote

class MobileProxyMiddleware:
    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings.get("MP"))

    def __init__(self, mp):
        # safe="" also encodes "/" so it cannot break the URL's userinfo part
        self.proxy = (
            f"http://{quote(mp['user'], safe='')}:{quote(mp['pass'], safe='')}"
            f"@{mp['host']}:{mp['port']}"
        )

    def process_request(self, request, spider):
        request.meta["proxy"] = self.proxy
        # Default to a 30 s timeout unless one is already set on the request
        request.meta.setdefault("download_timeout", 30)
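
Percent-encoding matters as soon as a credential contains a URL-reserved character. The sketch below uses only the standard library and a made-up password. `build_proxy_url` is a hypothetical helper name, not part of Scrapy. It illustrates that `quote(..., safe="")` keeps characters such as `/`, `:`, `@`, and `#` from being read as URL structure, whereas `quote`'s default leaves `/` unescaped.

```python
from urllib.parse import quote, unquote, urlsplit


def build_proxy_url(mp):
    """Build http://user:pass@host:port with fully percent-encoded credentials."""
    user = quote(mp["user"], safe="")
    password = quote(mp["pass"], safe="")
    return f"http://{user}:{password}@{mp['host']}:{mp['port']}"


# Made-up credentials chosen to contain URL-reserved characters.
example = {
    "host": "proxy.mobileproxies.org",
    "port": 8000,
    "user": "u_4a9c",
    "pass": "p/2X:7q@#",
}

url = build_proxy_url(example)
parts = urlsplit(url)

# Host and port survive parsing, and the password decodes back unchanged.
assert parts.hostname == "proxy.mobileproxies.org"
assert parts.port == 8000
assert unquote(parts.password) == example["pass"]
```

If your password is plain alphanumeric, the encoded and raw forms are identical, so the extra argument costs nothing.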

STEP 04

middlewares.py — rotation on ban

# myscraper/middlewares.py (continued)
import threading
import time

import requests

class RotateOnBanMiddleware:
    BAN_CODES = {403, 429, 503}
    COOLDOWN = 30  # seconds — don't hammer the switch endpoint

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings.get("MP"))

    def __init__(self, mp):
        self.mp = mp
        self._last_rotate = 0
        self._lock = threading.Lock()

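    MAX_BAN_RETRIES = 3  # cap re-queues so a hard block can't loop forever

    def process_response(self, request, response, spider):
        if response.status in self.BAN_CODES:
            self._rotate(spider)
            retries = request.meta.get("ban_retries", 0)
            if retries < self.MAX_BAN_RETRIES:
                # Re-queue this request so it runs again after rotation
                retry = request.replace(dont_filter=True)
                retry.meta["ban_retries"] = retries + 1
                return retry
        return response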

    def _rotate(self, spider):
        with self._lock:
            if time.time() - self._last_rotate < self.COOLDOWN:
                return
            self._last_rotate = time.time()
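            try:
                r = requests.post(
                    f"https://buy.mobileproxies.org/api/v1/proxies/{self.mp['slot']}/switch",
                    headers={"Authorization": f"Bearer {self.mp['api_key']}"},
                    timeout=10,
                )
                spider.logger.info(f"rotate → {r.status_code}")
            except requests.RequestException as exc:
                # A failed switch call should not crash the crawl
                spider.logger.warning(f"rotate failed: {exc}")
                return
            # Let the new IP bind. Note: this blocking sleep (like the blocking
            # POST above) pauses Scrapy's event loop, so all requests wait.
            time.sleep(4)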
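
The cooldown check inside `_rotate` can be pulled out and unit-tested without sleeping. The sketch below is one way to do that under a few assumptions: `CooldownGate` is a hypothetical class name, and it swaps `time.time()` for `time.monotonic()` so that wall-clock adjustments cannot shorten or stretch the window. The clock is injectable so a test can fake the passage of time.

```python
import time


class CooldownGate:
    """Allow an action at most once per `cooldown` seconds."""

    def __init__(self, cooldown, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock  # injectable so tests can fake the passage of time
        self._last = None   # None means the gate has never fired

    def try_fire(self):
        """Return True and start a new cooldown window if none is active."""
        now = self.clock()
        if self._last is not None and now - self._last < self.cooldown:
            return False
        self._last = now
        return True


# Drive the gate with a fake clock instead of sleeping.
fake_now = [100.0]
gate = CooldownGate(cooldown=30, clock=lambda: fake_now[0])

first = gate.try_fire()    # no window active yet
fake_now[0] += 10
second = gate.try_fire()   # 10 s later, still inside the window
fake_now[0] += 25
third = gate.try_fire()    # 35 s after the first call, window has expired
```

In the middleware, a call to `gate.try_fire()` could stand in for the manual `_last_rotate` comparison, with the lock still wrapped around it.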

STEP 05

Spider — example using both

# myscraper/spiders/ip_check.py
import scrapy

class IpCheckSpider(scrapy.Spider):
    name = "ip_check"
    start_urls = ["https://api.ipify.org?format=json"] * 5

    def parse(self, response):
        yield {"egress_ip": response.json()["ip"]}

Run: scrapy crawl ip_check -o ips.jsonl, then look up each egress_ip in a WHOIS or ASN database to confirm it belongs to a mobile carrier.
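
Once ips.jsonl exists, a few lines of standard-library Python can summarise which egress IPs appeared and how often. This is a minimal sketch: `count_egress_ips` is a hypothetical helper, and the sample lines use documentation-reserved addresses rather than real crawl output.

```python
import json
from collections import Counter


def count_egress_ips(lines):
    """Count how often each egress_ip appears in an iterable of JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            counts[json.loads(line)["egress_ip"]] += 1
    return counts


# Illustrative input using documentation-reserved addresses, not real output.
sample = [
    '{"egress_ip": "203.0.113.7"}',
    '{"egress_ip": "203.0.113.7"}',
    '{"egress_ip": "198.51.100.24"}',
]
counts = count_egress_ips(sample)
```

Against a real run, pass the open file instead of the sample list, for example `with open("ips.jsonl") as fh: print(count_egress_ips(fh))`. Keep in mind that `-o` appends to an existing file, so delete ips.jsonl between runs if you want per-run counts.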

Verify It Works

The 5 requests in start_urls should all return the same mobile IP within a single spider run (sticky session). Trigger a rotation manually mid-run and the next request should show a different IP — both from carrier ASNs, never from datacenter ranges.
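
To trigger a rotation by hand, you can call the same switch endpoint that RotateOnBanMiddleware uses. The sketch below builds that POST with the standard library. `build_switch_request` and `trigger_switch` are hypothetical helper names, and because the response body format is not described in this guide, only the HTTP status is returned.

```python
import urllib.request

SWITCH_URL = "https://buy.mobileproxies.org/api/v1/proxies/{slot}/switch"


def build_switch_request(slot, api_key):
    """Prepare (but do not send) the POST that asks for a new egress IP."""
    return urllib.request.Request(
        SWITCH_URL.format(slot=slot),
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def trigger_switch(slot, api_key, timeout=10):
    """Send the switch request and return the HTTP status code."""
    req = build_switch_request(slot, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Inspect the prepared request without touching the network.
req = build_switch_request("us-mob-01", "YOUR_API_KEY")
```

Calling `trigger_switch(os.environ["MP_SLOT"], os.environ["MP_API_KEY"])` from a second terminal while a longer spider is running should make later items in the output show a different egress IP.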

Pool of Slots (Higher Throughput)

One mobile slot caps at a few requests/sec. For higher throughput, allocate several slots and round-robin them in the proxy middleware:

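# Variation: cycle through a list of slots
import itertools
from urllib.parse import quote

class PoolProxyMiddleware:
    @classmethod
    def from_crawler(cls, crawler):
        # Assumes a hypothetical MP_SLOTS setting: a list of dicts shaped like MP
        return cls(crawler.settings.getlist("MP_SLOTS"))

    def __init__(self, slots):
        self.cycle = itertools.cycle([
            f"http://{quote(s['user'], safe='')}:{quote(s['pass'], safe='')}"
            f"@{s['host']}:{s['port']}"
            for s in slots
        ])

    def process_request(self, request, spider):
        request.meta["proxy"] = next(self.cycle)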
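
RotateOnBanMiddleware switches one fixed slot (`self.mp['slot']`), so with a pool the rotation step needs to know which slot served the banned request. One possible approach, sketched here with only the standard library, is to hand out the slot name together with the proxy URL. `SlotPool`, `pick`, and the `proxy_slot` meta key mentioned afterwards are hypothetical names, and the slot dicts are made up.

```python
import itertools
from urllib.parse import quote


class SlotPool:
    """Round-robin over slots, yielding (slot_name, proxy_url) pairs."""

    def __init__(self, slots):
        self._cycle = itertools.cycle([
            (
                s["slot"],
                f"http://{quote(s['user'], safe='')}:{quote(s['pass'], safe='')}"
                f"@{s['host']}:{s['port']}",
            )
            for s in slots
        ])

    def pick(self):
        """Return the next (slot_name, proxy_url) pair in rotation."""
        return next(self._cycle)


# Two made-up slots; real values would come from your settings.
pool = SlotPool([
    {"slot": "us-mob-01", "user": "u1", "pass": "p1",
     "host": "proxy.mobileproxies.org", "port": 8000},
    {"slot": "us-mob-02", "user": "u2", "pass": "p2",
     "host": "proxy.mobileproxies.org", "port": 8001},
])
picks = [pool.pick()[0] for _ in range(5)]
```

Inside `process_request`, the pair could be stored as `request.meta["proxy"]` and `request.meta["proxy_slot"]`, letting the rotation middleware read the slot name from the request that was banned instead of from a single setting.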

Common Errors

"TunnelError: Could not open CONNECT tunnel"

Auth failure or wrong port. Run a quick requests.get(..., proxies=...) sanity check outside Scrapy to isolate the credentials issue.
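
A sanity check along those lines might look like the sketch below. `build_proxies` and `check_proxy` are hypothetical helper names, the credentials are placeholders, and the test URL is the same ipify endpoint the example spider uses.

```python
from urllib.parse import quote


def build_proxies(mp):
    """Return a requests-style proxies mapping for one mobile slot."""
    url = (
        f"http://{quote(mp['user'], safe='')}:{quote(mp['pass'], safe='')}"
        f"@{mp['host']}:{mp['port']}"
    )
    # The same HTTP proxy endpoint carries both plain and TLS traffic.
    return {"http": url, "https": url}


def check_proxy(mp, timeout=15):
    """Fetch the egress IP through the proxy; raises on auth or connect errors."""
    import requests  # third-party; imported lazily so build_proxies has no dependency

    resp = requests.get(
        "https://api.ipify.org?format=json",
        proxies=build_proxies(mp),
        timeout=timeout,
    )
    resp.raise_for_status()
    return resp.json()["ip"]


# Placeholder credentials; substitute the values from your .env file.
proxies = build_proxies(
    {"host": "proxy.mobileproxies.org", "port": 8000, "user": "u_4a9c", "pass": "secret"}
)
```

If `check_proxy(...)` fails here too, the problem most likely lies with the credentials, host, or port rather than with Scrapy. If it succeeds, compare the proxy URL your middleware builds against the one in `proxies`.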

Cookies don't persist across rotations

Expected — a new egress IP looks like a new visitor. If you need session continuity, do the work on one IP and rotate between sessions, not within them.

RotateOnBanMiddleware hammers the switch endpoint

Raising COOLDOWN isn't enough when several Scrapy processes share a slot, because each process keeps its own timer and lock. Use a cross-process rotation lock (e.g. Redis SETNX) so multiple Scrapy workers can't all rotate at once.
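
A minimal sketch of such a lock follows. It assumes a client whose `set(name, value, nx=True, ex=ttl)` behaves like redis-py's `Redis.set`, writing only when the key is absent and expiring it after `ttl` seconds. The function name, the `mp:rotate:` key prefix, and the in-memory stand-in (which exists only so the example runs without a Redis server) are all hypothetical.

```python
import uuid


def try_acquire_rotation_lock(client, slot, ttl=30):
    """Return a token if this worker won the right to rotate `slot`, else None.

    `client` is expected to behave like redis-py's Redis object, whose
    set(name, value, nx=True, ex=ttl) writes only when the key is absent
    and expires it after `ttl` seconds.
    """
    token = uuid.uuid4().hex
    won = client.set(f"mp:rotate:{slot}", token, nx=True, ex=ttl)
    return token if won else None


class InMemoryClient:
    """Stand-in for a Redis client, for demonstration only (ignores expiry)."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None  # key already held by another worker
        self._data[name] = value
        return True


client = InMemoryClient()
winner = try_acquire_rotation_lock(client, "us-mob-01")
loser = try_acquire_rotation_lock(client, "us-mob-01")
```

With a real Redis client, only the worker that receives a token would call the switch API. Letting the key expire on its own, rather than deleting it, makes the TTL double as a cooldown shared by every worker.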
