[{"data":1,"prerenderedAt":1681},["ShallowReactive",2],{"home-latest-posts":3},[4,687,1136],{"id":5,"title":6,"body":7,"date":666,"description":667,"draft":668,"extension":669,"meta":670,"navigation":149,"path":671,"readingTime":672,"seo":673,"stem":674,"tags":675,"takeaways":680,"updated":685,"__hash__":686},"blog\u002Fblog\u002Frotating-proxies-for-web-scraping.md","How to Integrate Rotating Proxies for Web Scraping (Without Getting Blocked)",{"type":8,"value":9,"toc":655},"minimark",[10,14,23,28,31,35,38,113,120,124,127,209,217,221,224,370,384,388,391,475,486,490,493,525,528,532,535,589,592,596,631,635,651],[11,12,6],"h1",{"id":13},"how-to-integrate-rotating-proxies-for-web-scraping-without-getting-blocked",[15,16,17,18,22],"p",{},"If your scraper works for the first hundred requests and then starts returning ",[19,20,21],"code",{},"403",", empty pages, or CAPTCHAs, you have an IP reputation problem, not a code problem. The fix is rotating proxies. This guide covers how to choose the right proxy type, integrate it into a Python scraper, and build the rotation and retry logic that keeps a job running at scale.",[24,25,27],"h2",{"id":26},"why-a-single-ip-gets-blocked","Why a single IP gets blocked",[15,29,30],{},"Every request you send carries your IP address. Anti-bot systems (Cloudflare, DataDome, Akamai, PerimeterX) track request volume, timing, and behavior per IP. A datacenter IP sending 500 requests a minute to a product page looks nothing like a human, so it gets rate-limited or banned. 
Rotating proxies spread your requests across many IPs so no single address crosses the threshold.",[24,32,34],{"id":33},"proxy-types-and-when-to-use-each","Proxy types, and when to use each",[15,36,37],{},"There are three categories, and picking the wrong one is the most common reason a scrape fails.",[39,40,41,60],"table",{},[42,43,44],"thead",{},[45,46,47,51,54,57],"tr",{},[48,49,50],"th",{},"Type",[48,52,53],{},"Cost",[48,55,56],{},"Detection risk",[48,58,59],{},"Best for",[61,62,63,81,97],"tbody",{},[45,64,65,72,75,78],{},[66,67,68],"td",{},[69,70,71],"strong",{},"Datacenter",[66,73,74],{},"Cheapest",[66,76,77],{},"High",[66,79,80],{},"Unprotected sites, internal tools, high volume where bans are cheap",[45,82,83,88,91,94],{},[66,84,85],{},[69,86,87],{},"Residential",[66,89,90],{},"Mid to high",[66,92,93],{},"Low",[66,95,96],{},"E-commerce, sites behind Cloudflare\u002FDataDome",[45,98,99,104,107,110],{},[66,100,101],{},[69,102,103],{},"Mobile (4G\u002F5G)",[66,105,106],{},"Highest",[66,108,109],{},"Lowest",[66,111,112],{},"The hardest targets like Instagram, sneaker sites, aggressive WAFs",[15,114,115,116,119],{},"Rule of thumb: ",[69,117,118],{},"start with datacenter, escalate to residential only when you see blocks."," Paying for residential on a site that doesn't need it just burns budget.",[24,121,123],{"id":122},"basic-integration-in-python-requests","Basic integration in Python (requests)",[15,125,126],{},"Most providers give you a single gateway endpoint that rotates the IP for you on every request:",[128,129,134],"pre",{"className":130,"code":131,"language":132,"meta":133,"style":133},"language-python shiki shiki-themes github-light github-dark","import requests\n\nPROXY = \"http:\u002F\u002FUSER:PASS@gateway.provider.com:7000\"\n\nproxies = {\"http\": PROXY, \"https\": PROXY}\n\nresp = requests.get(\n    \"https:\u002F\u002Fexample.com\u002Fproducts\",\n    proxies=proxies,\n    timeout=20,\n)\nprint(resp.status_code, 
resp.url)\n","python","",[19,135,136,144,151,157,162,168,173,179,185,191,197,203],{"__ignoreMap":133},[137,138,141],"span",{"class":139,"line":140},"line",1,[137,142,143],{},"import requests\n",[137,145,147],{"class":139,"line":146},2,[137,148,150],{"emptyLinePlaceholder":149},true,"\n",[137,152,154],{"class":139,"line":153},3,[137,155,156],{},"PROXY = \"http:\u002F\u002FUSER:PASS@gateway.provider.com:7000\"\n",[137,158,160],{"class":139,"line":159},4,[137,161,150],{"emptyLinePlaceholder":149},[137,163,165],{"class":139,"line":164},5,[137,166,167],{},"proxies = {\"http\": PROXY, \"https\": PROXY}\n",[137,169,171],{"class":139,"line":170},6,[137,172,150],{"emptyLinePlaceholder":149},[137,174,176],{"class":139,"line":175},7,[137,177,178],{},"resp = requests.get(\n",[137,180,182],{"class":139,"line":181},8,[137,183,184],{},"    \"https:\u002F\u002Fexample.com\u002Fproducts\",\n",[137,186,188],{"class":139,"line":187},9,[137,189,190],{},"    proxies=proxies,\n",[137,192,194],{"class":139,"line":193},10,[137,195,196],{},"    timeout=20,\n",[137,198,200],{"class":139,"line":199},11,[137,201,202],{},")\n",[137,204,206],{"class":139,"line":205},12,[137,207,208],{},"print(resp.status_code, resp.url)\n",[15,210,211,212,216],{},"This is the simplest setup: the provider's gateway hands you a fresh IP per request. 
It works, but it gives you no control over ",[213,214,215],"em",{},"when"," to rotate or how to react to a ban.",[24,218,220],{"id":219},"manual-rotation-with-a-proxy-pool","Manual rotation with a proxy pool",[15,222,223],{},"When you need control, for instance keeping the same IP across a multi-step login flow before rotating, manage the pool yourself:",[128,225,227],{"className":130,"code":226,"language":132,"meta":133,"style":133},"import random\nimport requests\n\nPROXY_POOL = [\n    \"http:\u002F\u002FUSER:PASS@p1.provider.com:8000\",\n    \"http:\u002F\u002FUSER:PASS@p2.provider.com:8000\",\n    \"http:\u002F\u002FUSER:PASS@p3.provider.com:8000\",\n]\n\ndef fetch(url: str, max_retries: int = 3) -> requests.Response | None:\n    tried = set()\n    for _ in range(max_retries):\n        proxy = random.choice([p for p in PROXY_POOL if p not in tried])\n        tried.add(proxy)\n        try:\n            resp = requests.get(\n                url,\n                proxies={\"http\": proxy, \"https\": proxy},\n                timeout=20,\n            )\n            if resp.status_code == 200:\n                return resp\n            # 403\u002F429 → this IP is burned, rotate\n        except requests.RequestException:\n            continue  # dead proxy, try the next one\n    return None\n",[19,228,229,234,238,242,247,252,257,262,267,271,276,281,286,292,298,304,310,316,322,328,334,340,346,352,358,364],{"__ignoreMap":133},[137,230,231],{"class":139,"line":140},[137,232,233],{},"import random\n",[137,235,236],{"class":139,"line":146},[137,237,143],{},[137,239,240],{"class":139,"line":153},[137,241,150],{"emptyLinePlaceholder":149},[137,243,244],{"class":139,"line":159},[137,245,246],{},"PROXY_POOL = [\n",[137,248,249],{"class":139,"line":164},[137,250,251],{},"    \"http:\u002F\u002FUSER:PASS@p1.provider.com:8000\",\n",[137,253,254],{"class":139,"line":170},[137,255,256],{},"    
\"http:\u002F\u002FUSER:PASS@p2.provider.com:8000\",\n",[137,258,259],{"class":139,"line":175},[137,260,261],{},"    \"http:\u002F\u002FUSER:PASS@p3.provider.com:8000\",\n",[137,263,264],{"class":139,"line":181},[137,265,266],{},"]\n",[137,268,269],{"class":139,"line":187},[137,270,150],{"emptyLinePlaceholder":149},[137,272,273],{"class":139,"line":193},[137,274,275],{},"def fetch(url: str, max_retries: int = 3) -> requests.Response | None:\n",[137,277,278],{"class":139,"line":199},[137,279,280],{},"    tried = set()\n",[137,282,283],{"class":139,"line":205},[137,284,285],{},"    for _ in range(max_retries):\n",[137,287,289],{"class":139,"line":288},13,[137,290,291],{},"        proxy = random.choice([p for p in PROXY_POOL if p not in tried])\n",[137,293,295],{"class":139,"line":294},14,[137,296,297],{},"        tried.add(proxy)\n",[137,299,301],{"class":139,"line":300},15,[137,302,303],{},"        try:\n",[137,305,307],{"class":139,"line":306},16,[137,308,309],{},"            resp = requests.get(\n",[137,311,313],{"class":139,"line":312},17,[137,314,315],{},"                url,\n",[137,317,319],{"class":139,"line":318},18,[137,320,321],{},"                proxies={\"http\": proxy, \"https\": proxy},\n",[137,323,325],{"class":139,"line":324},19,[137,326,327],{},"                timeout=20,\n",[137,329,331],{"class":139,"line":330},20,[137,332,333],{},"            )\n",[137,335,337],{"class":139,"line":336},21,[137,338,339],{},"            if resp.status_code == 200:\n",[137,341,343],{"class":139,"line":342},22,[137,344,345],{},"                return resp\n",[137,347,349],{"class":139,"line":348},23,[137,350,351],{},"            # 403\u002F429 → this IP is burned, rotate\n",[137,353,355],{"class":139,"line":354},24,[137,356,357],{},"        except requests.RequestException:\n",[137,359,361],{"class":139,"line":360},25,[137,362,363],{},"            continue  # dead proxy, try the next one\n",[137,365,367],{"class":139,"line":366},26,[137,368,369],{},"    return 
None\n",[15,371,372,373,383],{},"The key ideas: ",[69,374,375,376,378,379,382],{},"track which proxies you've already tried for a given request, treat ",[19,377,21],{},"\u002F",[19,380,381],{},"429"," as a signal to rotate, and silently skip dead proxies."," Without retry logic, a single bad IP fails the whole job.",[24,385,387],{"id":386},"proxies-with-a-headless-browser-playwright","Proxies with a headless browser (Playwright)",[15,389,390],{},"For JavaScript-rendered sites you need a real browser. Playwright takes a proxy per context, which lets you isolate sessions:",[128,392,394],{"className":130,"code":393,"language":132,"meta":133,"style":133},"from playwright.async_api import async_playwright\n\nasync def scrape(url: str, proxy: str):\n    async with async_playwright() as p:\n        browser = await p.chromium.launch(\n            proxy={\n                \"server\": \"http:\u002F\u002Fgateway.provider.com:7000\",\n                \"username\": \"USER\",\n                \"password\": \"PASS\",\n            },\n        )\n        page = await browser.new_page()\n        await page.goto(url, wait_until=\"networkidle\")\n        html = await page.content()\n        await browser.close()\n        return html\n",[19,395,396,401,405,410,415,420,425,430,435,440,445,450,455,460,465,470],{"__ignoreMap":133},[137,397,398],{"class":139,"line":140},[137,399,400],{},"from playwright.async_api import async_playwright\n",[137,402,403],{"class":139,"line":146},[137,404,150],{"emptyLinePlaceholder":149},[137,406,407],{"class":139,"line":153},[137,408,409],{},"async def scrape(url: str, proxy: str):\n",[137,411,412],{"class":139,"line":159},[137,413,414],{},"    async with async_playwright() as p:\n",[137,416,417],{"class":139,"line":164},[137,418,419],{},"        browser = await p.chromium.launch(\n",[137,421,422],{"class":139,"line":170},[137,423,424],{},"            proxy={\n",[137,426,427],{"class":139,"line":175},[137,428,429],{},"                \"server\": 
\"http:\u002F\u002Fgateway.provider.com:7000\",\n",[137,431,432],{"class":139,"line":181},[137,433,434],{},"                \"username\": \"USER\",\n",[137,436,437],{"class":139,"line":187},[137,438,439],{},"                \"password\": \"PASS\",\n",[137,441,442],{"class":139,"line":193},[137,443,444],{},"            },\n",[137,446,447],{"class":139,"line":199},[137,448,449],{},"        )\n",[137,451,452],{"class":139,"line":205},[137,453,454],{},"        page = await browser.new_page()\n",[137,456,457],{"class":139,"line":288},[137,458,459],{},"        await page.goto(url, wait_until=\"networkidle\")\n",[137,461,462],{"class":139,"line":294},[137,463,464],{},"        html = await page.content()\n",[137,466,467],{"class":139,"line":300},[137,468,469],{},"        await browser.close()\n",[137,471,472],{"class":139,"line":306},[137,473,474],{},"        return html\n",[15,476,477,478,481,482,485],{},"One critical detail: ",[69,479,480],{},"match your proxy's geolocation to the site's expected audience."," Scraping a US retailer through a German residential IP often triggers extra verification. Most residential providers let you pin a country (",[19,483,484],{},"gateway.provider.com:7000?country=us",").",[24,487,489],{"id":488},"combining-proxies-with-fingerprint-stealth","Combining proxies with fingerprint stealth",[15,491,492],{},"Rotating IPs alone is not enough on aggressively protected sites. A fresh residential IP paired with an obvious headless-Chrome fingerprint still gets flagged. 
The full stack looks like:",[494,495,496,503,513,519],"ol",{},[497,498,499,502],"li",{},[69,500,501],{},"Residential\u002Fmobile proxy"," for a clean IP reputation.",[497,504,505,508,509,512],{},[69,506,507],{},"Fingerprint spoofing"," with realistic ",[19,510,511],{},"navigator"," properties, WebGL, canvas, fonts.",[497,514,515,518],{},[69,516,517],{},"Human-like timing"," using randomized delays, no perfectly even request intervals.",[497,520,521,524],{},[69,522,523],{},"Session persistence"," that reuses cookies and the same IP within a logical session, rotating between sessions.",[15,526,527],{},"Skip any one layer and the others can't compensate. This is why \"just add proxies\" often fails on Cloudflare-protected targets: the IP was clean, but the fingerprint gave it away.",[24,529,531],{"id":530},"a-retry-pattern-that-survives-real-jobs","A retry pattern that survives real jobs",[15,533,534],{},"In production I wrap every request in exponential backoff with proxy rotation on hard failures:",[128,536,538],{"className":130,"code":537,"language":132,"meta":133,"style":133},"import time\n\ndef fetch_with_backoff(url: str, max_attempts: int = 5):\n    for attempt in range(max_attempts):\n        resp = fetch(url)  # rotates proxy internally\n        if resp is not None:\n            return resp\n        sleep = min(2 ** attempt, 30)  # cap backoff at 30s\n        time.sleep(sleep)\n    raise RuntimeError(f\"Failed after {max_attempts} attempts: {url}\")\n",[19,539,540,545,549,554,559,564,569,574,579,584],{"__ignoreMap":133},[137,541,542],{"class":139,"line":140},[137,543,544],{},"import time\n",[137,546,547],{"class":139,"line":146},[137,548,150],{"emptyLinePlaceholder":149},[137,550,551],{"class":139,"line":153},[137,552,553],{},"def fetch_with_backoff(url: str, max_attempts: int = 5):\n",[137,555,556],{"class":139,"line":159},[137,557,558],{},"    for attempt in range(max_attempts):\n",[137,560,561],{"class":139,"line":164},[137,562,563],{},"        resp = 
fetch(url)  # rotates proxy internally\n",[137,565,566],{"class":139,"line":170},[137,567,568],{},"        if resp is not None:\n",[137,570,571],{"class":139,"line":175},[137,572,573],{},"            return resp\n",[137,575,576],{"class":139,"line":181},[137,577,578],{},"        sleep = min(2 ** attempt, 30)  # cap backoff at 30s\n",[137,580,581],{"class":139,"line":187},[137,582,583],{},"        time.sleep(sleep)\n",[137,585,586],{"class":139,"line":193},[137,587,588],{},"    raise RuntimeError(f\"Failed after {max_attempts} attempts: {url}\")\n",[15,590,591],{},"Exponential backoff prevents you from hammering a site that's already rate-limiting you, which on some WAFs escalates a soft block into a hard ban.",[24,593,595],{"id":594},"common-mistakes-to-avoid","Common mistakes to avoid",[597,598,599,609,619,625],"ul",{},[497,600,601,604,605,608],{},[69,602,603],{},"Rotating too aggressively."," A new IP on every single request can look ",[213,606,607],{},"more"," suspicious than a stable session. Match rotation to the site's tolerance.",[497,610,611,614,615,618],{},[69,612,613],{},"Ignoring response bodies."," A ",[19,616,617],{},"200"," status with a CAPTCHA page in the body is still a block. Validate content, not just status codes.",[497,620,621,624],{},[69,622,623],{},"Leaking your real IP."," WebRTC, DNS, and direct API calls can bypass the proxy. Test with an IP-check endpoint before trusting your setup.",[497,626,627,630],{},[69,628,629],{},"Buying the cheapest residential pool."," Oversold pools have burned IPs already flagged across thousands of sites.",[24,632,634],{"id":633},"need-this-built-for-your-project","Need this built for your project?",[15,636,637,638,645,646,650],{},"I build production scraping systems with proxy integration, anti-bot bypass, and the retry infrastructure to keep them running at scale, across Cloudflare, DataDome, and Akamai-protected sites. 
If you have a scraping or automation project, ",[639,640,644],"a",{"href":641,"rel":642},"https:\u002F\u002Fwww.upwork.com\u002Ffreelancers\u002Fphanvuong2",[643],"nofollow","hire me on Upwork"," or get in touch through the ",[639,647,649],{"href":648},"\u002F#contact","contact form",". I reply within 24 hours with a scope and quote.",[652,653,654],"style",{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}",{"title":133,"searchDepth":146,"depth":146,"links":656},[657,658,659,660,661,662,663,664,665],{"id":26,"depth":146,"text":27},{"id":33,"depth":146,"text":34},{"id":122,"depth":146,"text":123},{"id":219,"depth":146,"text":220},{"id":386,"depth":146,"text":387},{"id":488,"depth":146,"text":489},{"id":530,"depth":146,"text":531},{"id":594,"depth":146,"text":595},{"id":633,"depth":146,"text":634},"2026-06-12","A practical guide to integrating residential and rotating proxies into a Python scraper: proxy types, rotation strategies, retry logic, and how to avoid IP bans on protected sites.",false,"md",{},"\u002Fblog\u002Frotating-proxies-for-web-scraping","8 min 
read",{"title":6,"description":667},"blog\u002Frotating-proxies-for-web-scraping",[676,677,132,678,679],"web scraping","proxies","anti-bot","playwright",[681,682,683,684],"Datacenter proxies are cheapest but blocked fast; residential and mobile cost more but pass protected sites.","Start with datacenter and escalate to residential only when you actually see blocks.","Treat 403 and 429 responses as a signal to rotate, and silently skip dead proxies.","Match proxy geolocation to the site's audience, and pair proxies with fingerprint stealth and human-like timing.",null,"1Ocj1ZSzLA0gcR97EZs8BvMRpVTWAsngaK8NHENlbtM",{"id":688,"title":689,"body":690,"date":1121,"description":1122,"draft":668,"extension":669,"meta":1123,"navigation":149,"path":1124,"readingTime":1125,"seo":1126,"stem":1127,"tags":1128,"takeaways":1130,"updated":685,"__hash__":1135},"blog\u002Fblog\u002Fbypass-cloudflare-web-scraping.md","How to Scrape Cloudflare-Protected Sites in 2026 (A Practical Approach)",{"type":8,"value":691,"toc":1111},[692,695,701,705,708,751,758,762,776,822,829,833,840,970,984,988,995,1002,1006,1009,1023,1026,1030,1036,1081,1088,1092,1095,1099,1109],[11,693,689],{"id":694},"how-to-scrape-cloudflare-protected-sites-in-2026-a-practical-approach",[15,696,697,698,700],{},"Cloudflare protects a large share of the web, and its bot management has gotten much harder to beat. If you've hit the \"Checking your browser\" interstitial, a Turnstile challenge, or a silent ",[19,699,21],{},", this is what's actually happening and how to get through it reliably.",[24,702,704],{"id":703},"what-cloudflare-actually-checks","What Cloudflare actually checks",[15,706,707],{},"Cloudflare doesn't rely on one signal. 
It scores you across several layers, and failing any one can flag you:",[597,709,710,723,729,739,745],{},[497,711,712,715,716,719,720,722],{},[69,713,714],{},"TLS fingerprint (JA3\u002FJA4)."," The way your HTTP client negotiates TLS reveals whether you're a real browser or a Python ",[19,717,718],{},"requests"," session. This is why plain ",[19,721,718],{}," gets blocked instantly, before any JavaScript runs.",[497,724,725,728],{},[69,726,727],{},"HTTP\u002F2 fingerprint."," Header order, pseudo-header order, and frame settings differ between real Chrome and automation libraries.",[497,730,731,734,735,738],{},[69,732,733],{},"Browser fingerprint."," JavaScript challenges probe ",[19,736,737],{},"navigator.webdriver",", WebGL, canvas, installed fonts, screen properties, and dozens of other values.",[497,740,741,744],{},[69,742,743],{},"Behavioral signals."," Mouse movement, timing, and navigation patterns.",[497,746,747,750],{},[69,748,749],{},"IP reputation."," Datacenter IPs start with a low trust score.",[15,752,753,754,757],{},"The takeaway: ",[69,755,756],{},"a scraper that fixes only one layer still fails."," Clean IP with a headless fingerprint? Blocked. Perfect fingerprint from a flagged datacenter IP? Blocked.",[24,759,761],{"id":760},"why-plain-http-clients-cant-win","Why plain HTTP clients can't win",[15,763,764,765,767,768,771,772,775],{},"A request from ",[19,766,718],{}," or ",[19,769,770],{},"httpx"," is rejected at the TLS layer before Cloudflare even serves the challenge. 
Libraries like ",[19,773,774],{},"curl_cffi"," help by impersonating a real browser's TLS fingerprint:",[128,777,779],{"className":130,"code":778,"language":132,"meta":133,"style":133},"from curl_cffi import requests\n\n# Impersonate a real Chrome TLS + HTTP2 fingerprint\nresp = requests.get(\n    \"https:\u002F\u002Fprotected-site.com\",\n    impersonate=\"chrome131\",\n    timeout=20,\n)\nprint(resp.status_code)\n",[19,780,781,786,790,795,799,804,809,813,817],{"__ignoreMap":133},[137,782,783],{"class":139,"line":140},[137,784,785],{},"from curl_cffi import requests\n",[137,787,788],{"class":139,"line":146},[137,789,150],{"emptyLinePlaceholder":149},[137,791,792],{"class":139,"line":153},[137,793,794],{},"# Impersonate a real Chrome TLS + HTTP2 fingerprint\n",[137,796,797],{"class":139,"line":159},[137,798,178],{},[137,800,801],{"class":139,"line":164},[137,802,803],{},"    \"https:\u002F\u002Fprotected-site.com\",\n",[137,805,806],{"class":139,"line":170},[137,807,808],{},"    impersonate=\"chrome131\",\n",[137,810,811],{"class":139,"line":175},[137,812,196],{},[137,814,815],{"class":139,"line":181},[137,816,202],{},[137,818,819],{"class":139,"line":187},[137,820,821],{},"print(resp.status_code)\n",[15,823,824,825,828],{},"This gets you past the TLS check and works on Cloudflare's ",[213,826,827],{},"lower"," security settings. But on sites running a managed challenge or Turnstile, you need a real browser to execute the JavaScript.",[24,830,832],{"id":831},"the-reliable-approach-a-stealth-browser","The reliable approach: a stealth browser",[15,834,835,836,839],{},"For managed challenges, run an actual browser with anti-detection patches. 
With Playwright, the base setup looks like this, but the stock launch is ",[213,837,838],{},"not"," enough:",[128,841,843],{"className":130,"code":842,"language":132,"meta":133,"style":133},"from playwright.async_api import async_playwright\n\nasync def scrape(url: str):\n    async with async_playwright() as p:\n        browser = await p.chromium.launch(\n            headless=True,\n            args=[\n                \"--disable-blink-features=AutomationControlled\",\n            ],\n            proxy={\n                \"server\": \"http:\u002F\u002Fgateway.provider.com:7000\",\n                \"username\": \"USER\",\n                \"password\": \"PASS\",\n            },\n        )\n        ctx = await browser.new_context(\n            user_agent=\"Mozilla\u002F5.0 (Windows NT 10.0; Win64; x64) \"\n                       \"AppleWebKit\u002F537.36 (KHTML, like Gecko) \"\n                       \"Chrome\u002F131.0.0.0 Safari\u002F537.36\",\n            viewport={\"width\": 1920, \"height\": 1080},\n            locale=\"en-US\",\n        )\n        page = await ctx.new_page()\n        await page.goto(url, wait_until=\"domcontentloaded\")\n        # Wait out the challenge, then read the real content\n        await page.wait_for_load_state(\"networkidle\")\n        return await page.content()\n",[19,844,845,849,853,858,862,866,871,876,881,886,890,894,898,902,906,910,915,920,925,930,935,940,944,949,954,959,964],{"__ignoreMap":133},[137,846,847],{"class":139,"line":140},[137,848,400],{},[137,850,851],{"class":139,"line":146},[137,852,150],{"emptyLinePlaceholder":149},[137,854,855],{"class":139,"line":153},[137,856,857],{},"async def scrape(url: str):\n",[137,859,860],{"class":139,"line":159},[137,861,414],{},[137,863,864],{"class":139,"line":164},[137,865,419],{},[137,867,868],{"class":139,"line":170},[137,869,870],{},"            headless=True,\n",[137,872,873],{"class":139,"line":175},[137,874,875],{},"            
args=[\n",[137,877,878],{"class":139,"line":181},[137,879,880],{},"                \"--disable-blink-features=AutomationControlled\",\n",[137,882,883],{"class":139,"line":187},[137,884,885],{},"            ],\n",[137,887,888],{"class":139,"line":193},[137,889,424],{},[137,891,892],{"class":139,"line":199},[137,893,429],{},[137,895,896],{"class":139,"line":205},[137,897,434],{},[137,899,900],{"class":139,"line":288},[137,901,439],{},[137,903,904],{"class":139,"line":294},[137,905,444],{},[137,907,908],{"class":139,"line":300},[137,909,449],{},[137,911,912],{"class":139,"line":306},[137,913,914],{},"        ctx = await browser.new_context(\n",[137,916,917],{"class":139,"line":312},[137,918,919],{},"            user_agent=\"Mozilla\u002F5.0 (Windows NT 10.0; Win64; x64) \"\n",[137,921,922],{"class":139,"line":318},[137,923,924],{},"                       \"AppleWebKit\u002F537.36 (KHTML, like Gecko) \"\n",[137,926,927],{"class":139,"line":324},[137,928,929],{},"                       \"Chrome\u002F131.0.0.0 Safari\u002F537.36\",\n",[137,931,932],{"class":139,"line":330},[137,933,934],{},"            viewport={\"width\": 1920, \"height\": 1080},\n",[137,936,937],{"class":139,"line":336},[137,938,939],{},"            locale=\"en-US\",\n",[137,941,942],{"class":139,"line":342},[137,943,449],{},[137,945,946],{"class":139,"line":348},[137,947,948],{},"        page = await ctx.new_page()\n",[137,950,951],{"class":139,"line":354},[137,952,953],{},"        await page.goto(url, wait_until=\"domcontentloaded\")\n",[137,955,956],{"class":139,"line":360},[137,957,958],{},"        # Wait out the challenge, then read the real content\n",[137,960,961],{"class":139,"line":366},[137,962,963],{},"        await page.wait_for_load_state(\"networkidle\")\n",[137,965,967],{"class":139,"line":966},27,[137,968,969],{},"        return await page.content()\n",[15,971,972,973,975,976,979,980,983],{},"The hidden work is in the patches that hide automation: removing ",[19,974,737],{},", spoofing 
the permissions API, faking plugins and WebGL vendor strings, and matching the user-agent to the actual browser build. Tools like ",[19,977,978],{},"playwright-stealth",", ",[19,981,982],{},"undetected-chromedriver",", or the Camoufox\u002Fnodriver projects automate much of this, but they need maintenance as Cloudflare updates its detection.",[24,985,987],{"id":986},"residential-proxies-are-not-optional-here","Residential proxies are not optional here",[15,989,990,991,994],{},"On Cloudflare-protected sites, datacenter IPs start with a trust deficit you usually can't overcome. Pair the stealth browser with residential or mobile proxies, and ",[69,992,993],{},"match the proxy country to the site's audience",". A US store accessed through a foreign IP often triggers extra verification even when everything else is perfect.",[15,996,997,998,1001],{},"See my detailed guide on ",[639,999,1000],{"href":671},"integrating rotating proxies"," for the rotation and retry logic.",[24,1003,1005],{"id":1004},"handling-turnstile-challenges","Handling Turnstile challenges",[15,1007,1008],{},"When a Turnstile or interactive challenge appears, you have two paths:",[494,1010,1011,1017],{},[497,1012,1013,1016],{},[69,1014,1015],{},"Let the stealth browser solve it passively."," With a clean fingerprint and good IP, Turnstile often passes without interaction.",[497,1018,1019,1022],{},[69,1020,1021],{},"Use a solver service"," (2Captcha, CapSolver) for the token when passive solving fails. The solver returns a token you inject into the form submission.",[15,1024,1025],{},"In practice, a well-configured stealth browser passes most non-interactive challenges on its own, and the solver is the fallback for the hardest cases.",[24,1027,1029],{"id":1028},"validate-the-response-not-just-the-status","Validate the response, not just the status",[15,1031,1032,1033,1035],{},"A ",[19,1034,617],{}," response can still be a block page. 
Always check the body:",[128,1037,1039],{"className":130,"code":1038,"language":132,"meta":133,"style":133},"def is_blocked(html: str) -> bool:\n    markers = [\n        \"cf-challenge\",\n        \"Checking your browser\",\n        \"Just a moment\",\n        \"cf-turnstile\",\n    ]\n    return any(m in html for m in markers)\n",[19,1040,1041,1046,1051,1056,1061,1066,1071,1076],{"__ignoreMap":133},[137,1042,1043],{"class":139,"line":140},[137,1044,1045],{},"def is_blocked(html: str) -> bool:\n",[137,1047,1048],{"class":139,"line":146},[137,1049,1050],{},"    markers = [\n",[137,1052,1053],{"class":139,"line":153},[137,1054,1055],{},"        \"cf-challenge\",\n",[137,1057,1058],{"class":139,"line":159},[137,1059,1060],{},"        \"Checking your browser\",\n",[137,1062,1063],{"class":139,"line":164},[137,1064,1065],{},"        \"Just a moment\",\n",[137,1067,1068],{"class":139,"line":170},[137,1069,1070],{},"        \"cf-turnstile\",\n",[137,1072,1073],{"class":139,"line":175},[137,1074,1075],{},"    ]\n",[137,1077,1078],{"class":139,"line":181},[137,1079,1080],{},"    return any(m in html for m in markers)\n",[15,1082,1083,1084,1087],{},"If ",[19,1085,1086],{},"is_blocked()"," returns true, rotate the proxy, back off, and retry. Do not treat it as success.",[24,1089,1091],{"id":1090},"when-this-gets-hard","When this gets hard",[15,1093,1094],{},"Cloudflare updates its detection continuously, so a setup that works today can break next month. A production scraper needs monitoring, alerting on block-rate spikes, and a maintenance plan, not a one-off script. 
That ongoing reliability is the real deliverable, and it's where most DIY scrapers fall apart.",[24,1096,1098],{"id":1097},"need-a-cloudflare-protected-site-scraped-reliably","Need a Cloudflare-protected site scraped reliably?",[15,1100,1101,1102,1105,1106,1108],{},"I build and maintain production scrapers that get through Cloudflare, DataDome, and Akamai, with the stealth, proxy, and monitoring infrastructure to keep them running. If you have a project, ",[639,1103,644],{"href":641,"rel":1104},[643]," or reach out via the ",[639,1107,649],{"href":648},". I respond within 24 hours.",[652,1110,654],{},{"title":133,"searchDepth":146,"depth":146,"links":1112},[1113,1114,1115,1116,1117,1118,1119,1120],{"id":703,"depth":146,"text":704},{"id":760,"depth":146,"text":761},{"id":831,"depth":146,"text":832},{"id":986,"depth":146,"text":987},{"id":1004,"depth":146,"text":1005},{"id":1028,"depth":146,"text":1029},{"id":1090,"depth":146,"text":1091},{"id":1097,"depth":146,"text":1098},"2026-06-10","What Cloudflare actually checks, why most scrapers fail against it, and the layered approach of stealth browsers, fingerprinting, and residential proxies that reliably gets through.",{},"\u002Fblog\u002Fbypass-cloudflare-web-scraping","7 min read",{"title":689,"description":1122},"blog\u002Fbypass-cloudflare-web-scraping",[676,1129,678,679,132],"cloudflare",[1131,1132,1133,1134],"Cloudflare scores you across TLS, HTTP\u002F2, browser fingerprint, behavior, and IP reputation.","Plain HTTP clients fail at the TLS layer; curl_cffi can impersonate a real browser.","Managed challenges need a real, patched stealth browser, not a stock headless launch.","Residential proxies matched to the site's country are required, and a 200 response can still be a block 
page.","t9vyXQhugYzXupp7mEvMSK7IHv5_iFwn9OJWAdEY8jQ",{"id":1137,"title":1138,"body":1139,"date":1666,"description":1667,"draft":668,"extension":669,"meta":1668,"navigation":149,"path":1669,"readingTime":672,"seo":1670,"stem":1671,"tags":1672,"takeaways":1675,"updated":685,"__hash__":1680},"blog\u002Fblog\u002Fsolving-captchas-2captcha-capsolver.md","Solving CAPTCHAs in Your Scraper with 2Captcha and CapSolver",{"type":8,"value":1140,"toc":1656},[1141,1144,1147,1151,1154,1157,1171,1174,1178,1181,1260,1263,1267,1274,1415,1419,1422,1470,1481,1485,1488,1597,1601,1604,1634,1638,1641,1645,1654],[11,1142,1138],{"id":1143},"solving-captchas-in-your-scraper-with-2captcha-and-capsolver",[15,1145,1146],{},"CAPTCHAs are the wall most scrapers hit once a site decides it does not trust you. The good news is that almost every common CAPTCHA can be solved programmatically through a solving service. This guide shows how to integrate 2Captcha and CapSolver, when each one fits, and how to keep costs under control.",[24,1148,1150],{"id":1149},"how-captcha-solving-services-work","How CAPTCHA solving services work",[15,1152,1153],{},"You do not solve the CAPTCHA yourself. Instead you send the challenge to a service, the service returns a token, and you inject that token into the page exactly as a real browser would after a human passed the test.",[15,1155,1156],{},"The flow is always the same:",[494,1158,1159,1162,1165,1168],{},[497,1160,1161],{},"Detect the CAPTCHA on the page and read its site key.",[497,1163,1164],{},"Send the site key and page URL to the solving service.",[497,1166,1167],{},"Poll until the service returns a solution token.",[497,1169,1170],{},"Inject the token into the hidden form field and submit.",[15,1172,1173],{},"The token, not the image, is what the target site validates. 
This is why solving services work even on invisible reCAPTCHA v3 where there is no puzzle to click.",[24,1175,1177],{"id":1176},"_2captcha-vs-capsolver-which-to-pick","2Captcha vs CapSolver: which to pick",[15,1179,1180],{},"Both services cover the major CAPTCHA types. The practical differences matter more than the feature list.",[39,1182,1183,1196],{},[42,1184,1185],{},[45,1186,1187,1190,1193],{},[48,1188,1189],{},"Factor",[48,1191,1192],{},"2Captcha",[48,1194,1195],{},"CapSolver",[61,1197,1198,1209,1219,1230,1239,1250],{},[45,1199,1200,1203,1206],{},[66,1201,1202],{},"Speed",[66,1204,1205],{},"Human powered, slower",[66,1207,1208],{},"AI powered, faster",[45,1210,1211,1214,1217],{},[66,1212,1213],{},"reCAPTCHA v2",[66,1215,1216],{},"Reliable",[66,1218,1216],{},[45,1220,1221,1224,1227],{},[66,1222,1223],{},"reCAPTCHA v3",[66,1225,1226],{},"Supported",[66,1228,1229],{},"Strong",[45,1231,1232,1235,1237],{},[66,1233,1234],{},"Cloudflare Turnstile",[66,1236,1226],{},[66,1238,1229],{},[45,1240,1241,1244,1247],{},[66,1242,1243],{},"Pricing model",[66,1245,1246],{},"Per solve",[66,1248,1249],{},"Per solve, cheaper at volume",[45,1251,1252,1254,1257],{},[66,1253,59],{},[66,1255,1256],{},"Image and token tasks",[66,1258,1259],{},"High volume token tasks",[15,1261,1262],{},"Rule of thumb: start with CapSolver for speed on token based challenges, keep 2Captcha as a fallback for odd image puzzles and broad coverage.",[24,1264,1266],{"id":1265},"solving-recaptcha-v2-with-2captcha","Solving reCAPTCHA v2 with 2Captcha",[15,1268,1269,1270,1273],{},"First find the site key in the page. It sits in the ",[19,1271,1272],{},"data-sitekey"," attribute of the reCAPTCHA element. Then send it to the service.",[128,1275,1277],{"className":130,"code":1276,"language":132,"meta":133,"style":133},"import time\nimport requests\n\nAPI_KEY = \"your_2captcha_key\"\n\ndef solve_recaptcha_v2(site_key: str, page_url: str) -> str:\n    # 1. 
Submit the task\n    r = requests.post(\"https:\u002F\u002F2captcha.com\u002Fin.php\", data={\n        \"key\": API_KEY,\n        \"method\": \"userrecaptcha\",\n        \"googlekey\": site_key,\n        \"pageurl\": page_url,\n        \"json\": 1,\n    }).json()\n    request_id = r[\"request\"]\n\n    # 2. Poll for the token\n    for _ in range(24):\n        time.sleep(5)\n        res = requests.get(\"https:\u002F\u002F2captcha.com\u002Fres.php\", params={\n            \"key\": API_KEY,\n            \"action\": \"get\",\n            \"id\": request_id,\n            \"json\": 1,\n        }).json()\n        if res[\"status\"] == 1:\n            return res[\"request\"]  # the g-recaptcha-response token\n    raise TimeoutError(\"CAPTCHA not solved in time\")\n",[19,1278,1279,1283,1287,1291,1296,1300,1305,1310,1315,1320,1325,1330,1335,1340,1345,1350,1354,1359,1364,1369,1374,1379,1384,1389,1394,1399,1404,1409],{"__ignoreMap":133},[137,1280,1281],{"class":139,"line":140},[137,1282,544],{},[137,1284,1285],{"class":139,"line":146},[137,1286,143],{},[137,1288,1289],{"class":139,"line":153},[137,1290,150],{"emptyLinePlaceholder":149},[137,1292,1293],{"class":139,"line":159},[137,1294,1295],{},"API_KEY = \"your_2captcha_key\"\n",[137,1297,1298],{"class":139,"line":164},[137,1299,150],{"emptyLinePlaceholder":149},[137,1301,1302],{"class":139,"line":170},[137,1303,1304],{},"def solve_recaptcha_v2(site_key: str, page_url: str) -> str:\n",[137,1306,1307],{"class":139,"line":175},[137,1308,1309],{},"    # 1. 
Submit the task\n",[137,1311,1312],{"class":139,"line":181},[137,1313,1314],{},"    r = requests.post(\"https:\u002F\u002F2captcha.com\u002Fin.php\", data={\n",[137,1316,1317],{"class":139,"line":187},[137,1318,1319],{},"        \"key\": API_KEY,\n",[137,1321,1322],{"class":139,"line":193},[137,1323,1324],{},"        \"method\": \"userrecaptcha\",\n",[137,1326,1327],{"class":139,"line":199},[137,1328,1329],{},"        \"googlekey\": site_key,\n",[137,1331,1332],{"class":139,"line":205},[137,1333,1334],{},"        \"pageurl\": page_url,\n",[137,1336,1337],{"class":139,"line":288},[137,1338,1339],{},"        \"json\": 1,\n",[137,1341,1342],{"class":139,"line":294},[137,1343,1344],{},"    }).json()\n",[137,1346,1347],{"class":139,"line":300},[137,1348,1349],{},"    request_id = r[\"request\"]\n",[137,1351,1352],{"class":139,"line":306},[137,1353,150],{"emptyLinePlaceholder":149},[137,1355,1356],{"class":139,"line":312},[137,1357,1358],{},"    # 2. Poll for the token\n",[137,1360,1361],{"class":139,"line":318},[137,1362,1363],{},"    for _ in range(24):\n",[137,1365,1366],{"class":139,"line":324},[137,1367,1368],{},"        time.sleep(5)\n",[137,1370,1371],{"class":139,"line":330},[137,1372,1373],{},"        res = requests.get(\"https:\u002F\u002F2captcha.com\u002Fres.php\", params={\n",[137,1375,1376],{"class":139,"line":336},[137,1377,1378],{},"            \"key\": API_KEY,\n",[137,1380,1381],{"class":139,"line":342},[137,1382,1383],{},"            \"action\": \"get\",\n",[137,1385,1386],{"class":139,"line":348},[137,1387,1388],{},"            \"id\": request_id,\n",[137,1390,1391],{"class":139,"line":354},[137,1392,1393],{},"            \"json\": 1,\n",[137,1395,1396],{"class":139,"line":360},[137,1397,1398],{},"        }).json()\n",[137,1400,1401],{"class":139,"line":366},[137,1402,1403],{},"        if res[\"status\"] == 1:\n",[137,1405,1406],{"class":139,"line":966},[137,1407,1408],{},"            return res[\"request\"]  # the g-recaptcha-response 
token\n",[137,1410,1412],{"class":139,"line":1411},28,[137,1413,1414],{},"    raise TimeoutError(\"CAPTCHA not solved in time\")\n",[24,1416,1418],{"id":1417},"injecting-the-token-with-playwright","Injecting the token with Playwright",[15,1420,1421],{},"The token is useless until it is placed in the page and the form is submitted. Inject it into the hidden textarea reCAPTCHA expects.",[128,1423,1425],{"className":130,"code":1424,"language":132,"meta":133,"style":133},"token = solve_recaptcha_v2(site_key, page_url)\n\nawait page.evaluate(\n    \"\"\"(token) => {\n        document.querySelector('#g-recaptcha-response').value = token;\n    }\"\"\",\n    token,\n)\nawait page.click(\"button[type=submit]\")\n",[19,1426,1427,1432,1436,1441,1446,1451,1456,1461,1465],{"__ignoreMap":133},[137,1428,1429],{"class":139,"line":140},[137,1430,1431],{},"token = solve_recaptcha_v2(site_key, page_url)\n",[137,1433,1434],{"class":139,"line":146},[137,1435,150],{"emptyLinePlaceholder":149},[137,1437,1438],{"class":139,"line":153},[137,1439,1440],{},"await page.evaluate(\n",[137,1442,1443],{"class":139,"line":159},[137,1444,1445],{},"    \"\"\"(token) => {\n",[137,1447,1448],{"class":139,"line":164},[137,1449,1450],{},"        document.querySelector('#g-recaptcha-response').value = token;\n",[137,1452,1453],{"class":139,"line":170},[137,1454,1455],{},"    }\"\"\",\n",[137,1457,1458],{"class":139,"line":175},[137,1459,1460],{},"    token,\n",[137,1462,1463],{"class":139,"line":181},[137,1464,202],{},[137,1466,1467],{"class":139,"line":187},[137,1468,1469],{},"await page.click(\"button[type=submit]\")\n",[15,1471,1472,1473,1476,1477,1480],{},"For reCAPTCHA v3 there is no checkbox. 
The token goes into whatever field the site reads, often a hidden input the site script populates, and you usually pass a ",[19,1474,1475],{},"min_score"," and ",[19,1478,1479],{},"action"," to the solver so the returned token matches what the site expects.",[24,1482,1484],{"id":1483},"solving-cloudflare-turnstile-with-capsolver","Solving Cloudflare Turnstile with CapSolver",[15,1486,1487],{},"Turnstile is increasingly common and CapSolver handles it well. The pattern is identical, only the task type changes.",[128,1489,1491],{"className":130,"code":1490,"language":132,"meta":133,"style":133},"import requests\n\nCAPSOLVER_KEY = \"your_capsolver_key\"\n\ndef solve_turnstile(site_key: str, page_url: str) -> str:\n    create = requests.post(\"https:\u002F\u002Fapi.capsolver.com\u002FcreateTask\", json={\n        \"clientKey\": CAPSOLVER_KEY,\n        \"task\": {\n            \"type\": \"AntiTurnstileTaskProxyLess\",\n            \"websiteURL\": page_url,\n            \"websiteKey\": site_key,\n        },\n    }).json()\n    task_id = create[\"taskId\"]\n\n    while True:\n        res = requests.post(\"https:\u002F\u002Fapi.capsolver.com\u002FgetTaskResult\", json={\n            \"clientKey\": CAPSOLVER_KEY,\n            \"taskId\": task_id,\n        }).json()\n        if res[\"status\"] == \"ready\":\n            return res[\"solution\"][\"token\"]\n",[19,1492,1493,1497,1501,1506,1510,1515,1520,1525,1530,1535,1540,1545,1550,1554,1559,1563,1568,1573,1578,1583,1587,1592],{"__ignoreMap":133},[137,1494,1495],{"class":139,"line":140},[137,1496,143],{},[137,1498,1499],{"class":139,"line":146},[137,1500,150],{"emptyLinePlaceholder":149},[137,1502,1503],{"class":139,"line":153},[137,1504,1505],{},"CAPSOLVER_KEY = \"your_capsolver_key\"\n",[137,1507,1508],{"class":139,"line":159},[137,1509,150],{"emptyLinePlaceholder":149},[137,1511,1512],{"class":139,"line":164},[137,1513,1514],{},"def solve_turnstile(site_key: str, page_url: str) -> 
str:\n",[137,1516,1517],{"class":139,"line":170},[137,1518,1519],{},"    create = requests.post(\"https:\u002F\u002Fapi.capsolver.com\u002FcreateTask\", json={\n",[137,1521,1522],{"class":139,"line":175},[137,1523,1524],{},"        \"clientKey\": CAPSOLVER_KEY,\n",[137,1526,1527],{"class":139,"line":181},[137,1528,1529],{},"        \"task\": {\n",[137,1531,1532],{"class":139,"line":187},[137,1533,1534],{},"            \"type\": \"AntiTurnstileTaskProxyLess\",\n",[137,1536,1537],{"class":139,"line":193},[137,1538,1539],{},"            \"websiteURL\": page_url,\n",[137,1541,1542],{"class":139,"line":199},[137,1543,1544],{},"            \"websiteKey\": site_key,\n",[137,1546,1547],{"class":139,"line":205},[137,1548,1549],{},"        },\n",[137,1551,1552],{"class":139,"line":288},[137,1553,1344],{},[137,1555,1556],{"class":139,"line":294},[137,1557,1558],{},"    task_id = create[\"taskId\"]\n",[137,1560,1561],{"class":139,"line":300},[137,1562,150],{"emptyLinePlaceholder":149},[137,1564,1565],{"class":139,"line":306},[137,1566,1567],{},"    while True:\n",[137,1569,1570],{"class":139,"line":312},[137,1571,1572],{},"        res = requests.post(\"https:\u002F\u002Fapi.capsolver.com\u002FgetTaskResult\", json={\n",[137,1574,1575],{"class":139,"line":318},[137,1576,1577],{},"            \"clientKey\": CAPSOLVER_KEY,\n",[137,1579,1580],{"class":139,"line":324},[137,1581,1582],{},"            \"taskId\": task_id,\n",[137,1584,1585],{"class":139,"line":330},[137,1586,1398],{},[137,1588,1589],{"class":139,"line":336},[137,1590,1591],{},"        if res[\"status\"] == \"ready\":\n",[137,1593,1594],{"class":139,"line":342},[137,1595,1596],{},"            return res[\"solution\"][\"token\"]\n",[24,1598,1600],{"id":1599},"keeping-costs-under-control","Keeping costs under control",[15,1602,1603],{},"Solving services charge per solve, and on a large job the bill adds up fast. 
The cheapest CAPTCHA is the one you never trigger.",[597,1605,1606,1616,1622,1628],{},[497,1607,1608,1611,1612,1615],{},[69,1609,1610],{},"Reduce triggers first."," A clean residential IP and a realistic browser fingerprint mean fewer CAPTCHAs in the first place. Solving is the fallback, not the strategy. See my guide on ",[639,1613,1614],{"href":1124},"bypassing Cloudflare"," for the stealth side.",[497,1617,1618,1621],{},[69,1619,1620],{},"Cache sessions."," Once you pass a challenge, reuse the cookies. Do not solve a fresh CAPTCHA on every request.",[497,1623,1624,1627],{},[69,1625,1626],{},"Solve only when blocked."," Detect the CAPTCHA and call the service only if it actually appears, rather than pre solving on every page.",[497,1629,1630,1633],{},[69,1631,1632],{},"Set a budget cap."," Track solves per run and stop the job if the count spikes, which usually means your fingerprint or proxy went bad.",[24,1635,1637],{"id":1636},"when-solving-services-are-not-enough","When solving services are not enough",[15,1639,1640],{},"Some sites layer behavioral analysis on top of the CAPTCHA. A valid token from a session that never moved a mouse or scrolled can still be rejected. In those cases you need the full stealth stack: residential proxies, a patched browser fingerprint, and human like interaction timing, with the solver as one piece rather than the whole answer.",[24,1642,1644],{"id":1643},"need-captchas-handled-in-your-scraping-project","Need CAPTCHAs handled in your scraping project?",[15,1646,1647,1648,1651,1652,650],{},"I build scraping systems that combine stealth, proxy rotation, and CAPTCHA solving so they keep running on protected sites. 
If you have a project that keeps hitting CAPTCHAs, ",[639,1649,644],{"href":641,"rel":1650},[643]," or reach out through the ",[639,1653,649],{"href":648},[652,1655,654],{},{"title":133,"searchDepth":146,"depth":146,"links":1657},[1658,1659,1660,1661,1662,1663,1664,1665],{"id":1149,"depth":146,"text":1150},{"id":1176,"depth":146,"text":1177},{"id":1265,"depth":146,"text":1266},{"id":1417,"depth":146,"text":1418},{"id":1483,"depth":146,"text":1484},{"id":1599,"depth":146,"text":1600},{"id":1636,"depth":146,"text":1637},{"id":1643,"depth":146,"text":1644},"2026-06-08","A practical guide to integrating CAPTCHA solving services into a Python scraper. Covers reCAPTCHA v2 and v3, hCaptcha, Cloudflare Turnstile, token injection, and cost control.",{},"\u002Fblog\u002Fsolving-captchas-2captcha-capsolver",{"title":1138,"description":1667},"blog\u002Fsolving-captchas-2captcha-capsolver",[1673,676,132,1674,678],"captcha","automation",[1676,1677,1678,1679],"Solving services return a token you inject; you do not solve the puzzle yourself.","CapSolver is fast for token challenges; 2Captcha gives broad coverage as a fallback.","The cheapest CAPTCHA is one you never trigger, so reduce triggers with clean IPs and fingerprints first.","Cache sessions and solve only when actually blocked to keep costs down.","vHMGfU8-0OOVYAG-0hnOxPDnNwGpE_VU2mwP_M-vp1U",1781254278065]