Release v0.8.0
v0.8.0 – Security Hardening & Deep Crawl Recovery
⚠️ Two critical security vulnerabilities fixed — review breaking changes below.
Breaking Changes
- Hooks disabled by default — set CRAWL4AI_HOOKS_ENABLED=true to re-enable; __import__ removed from allowed builtins
- Docker API blocks file://, javascript:, data: URLs — /execute_js, /screenshot, /pdf, /html endpoints now validate URL scheme (http/https/raw only)
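Callers that previously sent file://, javascript:, or data: URLs to these endpoints may want to check URLs before submitting them. The following is a minimal client-side sketch, assuming the allowed schemes are exactly the three listed above. `ALLOWED_SCHEMES` and `is_allowed_url` are hypothetical names used for illustration; they are not part of Crawl4AI, and the server's own validation may differ in detail.

```python
# Assumption: mirrors the "http/https/raw only" rule stated in the release notes.
ALLOWED_SCHEMES = {"http", "https", "raw"}


def is_allowed_url(url: str) -> bool:
    """Return True if the URL's scheme is one of ALLOWED_SCHEMES."""
    # Split on the first colon only, so "raw:<html>..." yields the scheme "raw".
    scheme, sep, _rest = url.strip().partition(":")
    if not sep:
        return False  # no scheme at all
    return scheme.lower() in ALLOWED_SCHEMES


if __name__ == "__main__":
    for candidate in (
        "https://example.com",
        "raw:<html><body>hi</body></html>",
        "file:///etc/passwd",
        "javascript:alert(1)",
        "data:text/html,<b>x</b>",
    ):
        print(f"{candidate!r}: allowed={is_allowed_url(candidate)}")
```

A pre-check like this only saves a round trip; the endpoint remains the authority on what it accepts.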
Security Fixes
- RCE via hook imports — removed __import__ from sandbox builtins
- LFI on Docker endpoints — added URL scheme validation (blocks local file access)
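To illustrate why dropping `__import__` from the builtins closes the import path, here is a small standalone sketch. It is not Crawl4AI's hook sandbox: `SAFE_BUILTINS`, `run_hook`, and `try_hook` are hypothetical names, and the allowlist is an assumption chosen for the demo. In CPython, an `import` statement looks up `__import__` in the active builtins, so code executed without it raises `ImportError`.

```python
# Hypothetical allowlist for the demo; note that __import__ is absent.
SAFE_BUILTINS = {"len": len, "range": range, "str": str}


def run_hook(source: str) -> dict:
    """Execute hook source with a restricted builtins table and return its namespace."""
    namespace = {"__builtins__": dict(SAFE_BUILTINS)}
    exec(source, namespace)
    return namespace


def try_hook(source: str) -> str:
    """Return 'ok' if the hook ran, or 'blocked' if it attempted an import."""
    try:
        run_hook(source)
        return "ok"
    except ImportError:
        # Raised because the import statement cannot find __import__.
        return "blocked"


if __name__ == "__main__":
    print(try_hook("x = len('abc')"))
    print(try_hook("import os"))
```

Restricting builtins is one layer of defense rather than a complete sandbox, which is consistent with hooks now being opt-in.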
Features
- init_scripts for BrowserConfig — inject JavaScript before page load for stealth evasion
- Crash recovery in deep crawl — resume_state and on_state_change callbacks for BFS/DFS/Best-First strategies
- Prefetch mode — two-phase crawling with fast link extraction before full processing
- CDP improvements — WebSocket URL support, proper cleanup, browser connection reuse
- PDF/MHTML/screenshots for raw:/file:// URLs — render cached HTML and export formats
- base_url parameter — proper URL resolution for raw: HTML processing
- process_in_browser parameter — new browser pipeline for file-based content
- HTTP strategy proxy support — non-browser crawler now supports proxy rotation and sticky sessions
- Sitemap seeder TTL cache — cache_ttl_hours and validate_sitemap_lastmod parameters
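The crash-recovery feature can be pictured with a toy breadth-first crawl over an in-memory link graph. This is only a sketch of the general idea and does not use Crawl4AI: the parameter names `resume_state` and `on_state_change` are borrowed from the notes above, while `bfs_crawl`, `LINKS`, `max_pages`, and the shape of the state dictionary are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

# Hypothetical link graph standing in for real pages.
LINKS: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def bfs_crawl(
    start: str,
    links: Dict[str, List[str]],
    resume_state: Optional[dict] = None,
    on_state_change: Optional[Callable[[dict], None]] = None,
    max_pages: Optional[int] = None,
) -> List[str]:
    """Visit pages breadth-first, reporting a resumable snapshot after each page."""
    if resume_state:
        visited = list(resume_state["visited"])
        queue = deque(resume_state["queue"])
    else:
        visited, queue = [], deque([start])
    seen = set(visited) | set(queue)
    processed = 0
    while queue:
        if max_pages is not None and processed >= max_pages:
            break  # simulates an interruption partway through the crawl
        url = queue.popleft()
        visited.append(url)
        processed += 1
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
        if on_state_change:
            # Hand the caller a copy so it can persist the state safely.
            on_state_change({"visited": list(visited), "queue": list(queue)})
    return visited


if __name__ == "__main__":
    saved: dict = {}
    # First run stops after two pages; the callback keeps the latest snapshot.
    bfs_crawl("a", LINKS, on_state_change=saved.update, max_pages=2)
    # Second run picks up from the snapshot instead of starting over.
    print(bfs_crawl("a", LINKS, resume_state=saved))
```

In a real crawl the callback would typically write the snapshot to disk or a database, so a restarted process can pass it back in.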
Fixes
- raw: URL parsing — fixed truncation at # (CSS color codes)
- Caching system — improved cache validation and persistence
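One plausible way this kind of truncation arises is treating a raw: payload as an ordinary URL, where everything after `#` is split off as a fragment. The sketch below shows that effect with the standard library and a prefix-stripping alternative. It is not the project's actual fix: `extract_raw_html` and `RAW_PREFIXES` are hypothetical names, and support for a `raw://` spelling is an assumption.

```python
from urllib.parse import urlsplit

# Assumption: both spellings are accepted; the longer prefix is checked first.
RAW_PREFIXES = ("raw://", "raw:")


def extract_raw_html(url: str) -> str:
    """Return the HTML payload of a raw: URL without any URL-style parsing."""
    for prefix in RAW_PREFIXES:
        if url.startswith(prefix):
            return url[len(prefix):]
    raise ValueError("not a raw: URL")


if __name__ == "__main__":
    url = "raw:<p style='color:#ff0000'>hi</p>"
    # Generic URL parsing cuts the payload at '#', losing the color code and the rest.
    print("urlsplit path:", urlsplit(url).path)
    # Stripping the prefix keeps the HTML intact.
    print("prefix strip: ", extract_raw_html(url))
```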
🎉 Crawl4AI v0.8.0 Released!
📦 Installation
PyPI:
pip install crawl4ai==0.8.0
Docker:
docker pull unclecode/crawl4ai:0.8.0
docker pull unclecode/crawl4ai:latest
Note: Docker images are being built and will be available shortly. Check the Docker Release workflow for build status.
📝 What's Changed
See CHANGELOG.md for details.