Bing, Yandex & Baidu SERP Scraping
Google handles ~73% of global search. The other ~27% — Bing in the US/EU, Yandex in Russia, Baidu in China, DuckDuckGo everywhere — is a blind spot most SEO tools ignore. Here's what changes when you scrape them.
1. Why Scrape Non-Google Engines
Non-Google traffic matters for three reasons: (1) regional intelligence: Yandex and Baidu are the primary engines in their markets; (2) diversification: Bing and DuckDuckGo together reach B2B, enterprise, and privacy-conscious audiences; (3) AI grounding: Bing results feed ChatGPT/Copilot, and Yandex/Baidu feed local LLMs. If you only track Google, you are blind to your rankings in meaningful slices of the market.
| Engine | Primary markets | Proxy region needed |
|---|---|---|
| Bing | US, UK, Germany (secondary engine) | Any; tolerant of clean US/EU IPs |
| Yandex | Russia (primary), CIS states | Russian IPs for accurate local ranking |
| Baidu | China (primary) | Chinese IPs — foreign IPs return filtered results |
| DuckDuckGo | Global (privacy audience) | Any clean IP |
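The table above can be encoded as a small routing rule so a pipeline refuses to run an engine through the wrong geography. This is a minimal sketch: `REQUIRED_REGION`, `pick_proxy`, and the shape of `pool` (a dict of region code to proxy URL) are hypothetical names chosen for illustration, not part of any provider's API.

```python
# Proxy region each engine needs, transcribed from the table above.
# None means any clean IP is acceptable. Region codes are an assumption.
REQUIRED_REGION = {
    "bing": None,
    "yandex": "RU",
    "baidu": "CN",
    "duckduckgo": None,
}


def pick_proxy(engine, pool):
    """Return a proxy URL for `engine` from `pool` (region code -> proxy URL)."""
    key = engine.lower()
    if key not in REQUIRED_REGION:
        raise ValueError(f"unknown engine: {engine}")
    region = REQUIRED_REGION[key]
    if region is None:
        if not pool:
            raise LookupError("proxy pool is empty")
        # Any region works, so take the first configured proxy.
        return next(iter(pool.values()))
    if region not in pool:
        # Fail loudly rather than scrape a filtered SERP from the wrong country.
        raise LookupError(f"{engine} needs a {region} proxy, none configured")
    return pool[region]
```

Raising on a missing region is deliberate: a silent fallback to a US IP would produce rankings that look valid but do not match what local users see.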
2. Bing — the Easiest Non-Google
Bing is the most forgiving major engine to scrape. Microsoft has also offered a paid Bing Web Search API (priced per thousand transactions), but for most SEO workflows the free scraping path is sufficient. Key selectors:
- `li.b_algo`: organic result block
- `h2 a`: title + URL
- `div.b_caption p`: snippet text
- URL pattern: `https://www.bing.com/search?q=query&first=11` (pagination uses `first=`, not `start=`)
- Market/locale: `&cc=us`, `&setlang=en`
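Because Bing paginates with a 1-based `first=` offset rather than a page number, it is easy to get the arithmetic wrong. The sketch below builds the URLs for the first few pages; `bing_page_urls` is a hypothetical helper, and the step of 10 results per page is an assumption inferred from `first=11` being the second page.

```python
from urllib.parse import urlencode


def bing_page_urls(query, pages=3, cc="us", setlang="en", per_page=10):
    """Build Bing SERP URLs for `pages` consecutive result pages."""
    urls = []
    for page in range(pages):
        params = {
            "q": query,
            "cc": cc,
            "setlang": setlang,
            # first= is the 1-based index of the first result: 1, 11, 21, ...
            "first": page * per_page + 1,
        }
        urls.append("https://www.bing.com/search?" + urlencode(params))
    return urls
```

For example, `bing_page_urls("mobile proxies", pages=2)` yields one URL ending in `first=1` and one ending in `first=11`.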
3. Bing Scraper (Python)
import requests
from bs4 import BeautifulSoup
from urllib.parse import quote_plus

# Route both HTTP and HTTPS traffic through the same proxy endpoint.
proxies = {
    "http": "http://user:pass@proxy.mobileproxies.org:8000",
    "https": "http://user:pass@proxy.mobileproxies.org:8000",
}

# Desktop Chrome fingerprint.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}


def search_bing(query, cc="us", setlang="en", first=1):
    """Fetch one Bing results page and return its organic results."""
    url = (
        f"https://www.bing.com/search?q={quote_plus(query)}"
        f"&cc={cc}&setlang={setlang}&first={first}"
    )
    r = requests.get(url, headers=HEADERS, proxies=proxies, timeout=20)
    r.raise_for_status()
    soup = BeautifulSoup(r.text, "lxml")
    out = []
    for block in soup.select("li.b_algo"):
        title_el = block.select_one("h2 a")
        snippet_el = block.select_one("div.b_caption p")
        if not title_el:
            continue  # skip blocks without a title link
        out.append({
            "title": title_el.get_text(strip=True),
            "url": title_el.get("href"),
            "snippet": snippet_el.get_text(strip=True) if snippet_el else None,
        })
    return out


if __name__ == "__main__":
    for row in search_bing("mobile proxies"):
        # href can be missing, so guard against None before slicing
        print(row["title"][:60], "-", (row["url"] or "")[:50])
Bing accepts a Chrome desktop User-Agent better than an iPhone one — its anti-bot is noticeably tuned for mobile spam, so a clean desktop fingerprint survives longer.
4. Yandex — Hardest on Foreign IPs
Yandex holds ~60% of Russian desktop search share. Its anti-bot is aggressive — SmartCaptcha (Yandex's own challenge) fires on suspicious sessions after as few as 10-20 queries from a flagged IP. Selectors:
- `li.serp-item`: organic result block
- `div.organic__url-text`: display URL
- `div.text-container`: snippet
- URL: `https://yandex.ru/search/?text=query&p=0` (pagination `p=0,1,2...`)
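Since SmartCaptcha can fire after a small number of queries from one IP, one way to stay under that threshold is to cap how many requests each proxy serves before rotating. The following is a sketch under stated assumptions: `yandex_url` and `ProxyBudget` are hypothetical helpers, and the default cap of 10 is simply the low end of the 10-20 range mentioned above, not a documented Yandex limit.

```python
from itertools import cycle
from urllib.parse import urlencode


def yandex_url(query, page=0):
    """Build a Yandex SERP URL; p= is a 0-based page index."""
    return "https://yandex.ru/search/?" + urlencode({"text": query, "p": page})


class ProxyBudget:
    """Hand out proxies round-robin, rotating after `max_queries` uses."""

    def __init__(self, proxies, max_queries=10):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = cycle(proxies)
        self._max = max_queries
        self._current = next(self._cycle)
        self._used = 0

    def next_proxy(self):
        # Rotate once the current proxy has spent its query budget.
        if self._used >= self._max:
            self._current = next(self._cycle)
            self._used = 0
        self._used += 1
        return self._current
```

Call `next_proxy()` once per request and pass the result to your HTTP client. The budget is counted per process, so a distributed crawler would need shared state instead.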
Russian IPs required for accurate local SERP. Yandex serves different results to IPs outside Russia — often a degraded/filtered set. Mobileproxies.org covers US, UK, Germany, France, Netherlands, Italy, Spain, Poland, Canada, Brazil, India, Japan, and Australia as carrier geographies; Russia is not in the current pool. For Yandex rank tracking specifically, use a Russia-native provider or the Yandex XML API (paid).
5. Baidu — China-Only, JavaScript-Heavy
Baidu commands ~60% of Chinese search. It blocks non-Chinese IPs aggressively, partly for anti-scraping reasons and partly because GFW rules shape what Chinese users can see anyway — the SERP served to a US IP is not the SERP a Beijing user sees. Selectors:
- `div.c-container`: organic result block
- `h3.t a`: title + link (links are `baidu.com/link?url=...` redirects)
- URL: `https://www.baidu.com/s?wd=query&pn=10` (pagination `pn=0,10,20...`)
- Baidu links go through a redirector. To get the real URL you need a second HEAD request, or parse the signed `mu` attribute
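The HEAD-request route for unwrapping Baidu's redirector can be sketched with the standard library alone. The helper names (`baidu_url`, `is_baidu_redirect`, `head_location`, `resolve_link`) are hypothetical, and the sketch assumes the redirector answers with a 3xx status and a `Location` header; it also omits proxy configuration, which a real run from outside China would need.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode, urlparse


def baidu_url(query, page=0):
    """Build a Baidu SERP URL; pn= is a result offset in steps of 10."""
    return "https://www.baidu.com/s?" + urlencode({"wd": query, "pn": page * 10})


def is_baidu_redirect(url):
    """True for baidu.com/link?url=... style redirector links."""
    parts = urlparse(url)
    host = parts.hostname or ""
    on_baidu = host == "baidu.com" or host.endswith(".baidu.com")
    return on_baidu and parts.path.startswith("/link")


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; urllib then raises HTTPError for the 3xx


def head_location(url, timeout=10):
    """Send one HEAD request and return the Location header, if any."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.headers.get("Location")


def resolve_link(url, fetch_location=head_location):
    """Return the destination URL, or the input if it cannot be unwrapped."""
    if not is_baidu_redirect(url):
        return url
    return fetch_location(url) or url
```

Passing `fetch_location` as a parameter keeps the unwrapping logic testable offline and lets you swap in a proxied HTTP client.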
6. DuckDuckGo — Easiest to Scrape
DuckDuckGo doesn't crawl the web itself — it federates results from 400+ sources, with the organic ranking coming mostly from Bing. That makes DDG a cheaper way to approximate Bing ranking if you hit rate limits on Bing itself. Two endpoints:
- HTML version: `https://html.duckduckgo.com/html/?q=query` renders static HTML, perfect for requests + BeautifulSoup
- Instant Answer API: `https://api.duckduckgo.com/?q=query&format=json` returns abstracts and topics, not full SERP
- Result selector: `div.result`, title `a.result__a`, snippet `a.result__snippet`
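To make the difference between the two endpoints concrete, the sketch below builds both URLs and flattens an Instant Answer payload into rows. The helper names are hypothetical, and the JSON field names used here (`AbstractText`, `AbstractURL`, `RelatedTopics`, `Text`, `FirstURL`, nested `Topics`) are an assumption about the response shape that you should verify against a live response.

```python
from urllib.parse import urlencode


def ddg_html_url(query):
    """URL for the static HTML results page."""
    return "https://html.duckduckgo.com/html/?" + urlencode({"q": query})


def ddg_api_url(query):
    """URL for the Instant Answer API (abstracts and topics, not a SERP)."""
    return "https://api.duckduckgo.com/?" + urlencode({"q": query, "format": "json"})


def _walk_topics(topics):
    for topic in topics:
        if "Topics" in topic:
            # Grouped topics nest another list under "Topics".
            yield from _walk_topics(topic["Topics"])
        elif topic.get("Text"):
            yield {"text": topic["Text"], "url": topic.get("FirstURL")}


def flatten_instant_answer(payload):
    """Turn a decoded Instant Answer JSON dict into a flat list of rows."""
    rows = []
    if payload.get("AbstractText"):
        rows.append({"text": payload["AbstractText"], "url": payload.get("AbstractURL")})
    rows.extend(_walk_topics(payload.get("RelatedTopics", [])))
    return rows
```

An empty list from `flatten_instant_answer` is common for commercial queries, which is why rank tracking should use the HTML endpoint instead.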