Desent Solutions – Developer Assessment
Build a tool that archives any website in its current state — including all assets and server-side API responses — so the archived version works even when the original site is completely offline, similar to the Wayback Machine.
Download the test website package. It includes a Docker-based sample site with static content, images, and API endpoints you can spin up and tear down to verify your archiver works.
Download Test Website (ZIP). Contains: Docker setup, Express server, HTML/CSS/JS, and an image generator script.
Archive format: `.warc` or `.zip`.

```bash
# 1. Unzip and start the test website
cd archiver-test-website
node create-images.js
docker-compose up -d

# 2. Archive the site while it's running
your-tool archive http://localhost:3777

# 3. Shut down the original
docker-compose down

# 4. Serve the archive and verify everything works
your-tool serve
```
| Feature | What to Verify |
|---|---|
| Static HTML + CSS | Layout and styles render correctly |
| Multiple images (SVG) | All images load from archive |
| Google Fonts (external CDN) | Fonts display correctly offline |
| Fetch API → /api/products | Product list appears without server |
| Fetch API → /api/stats | Stats bar shows data offline |
| Fetch API → /api/reviews/:id | Parameterized API responses cached |
| JavaScript interactivity | Click handlers work in archive |
| CSS background-image url() | Background pattern renders |
| Responsive design | Media queries still apply |
Free choice — but justify your decisions. Some suggestions:

- **Crawling:** Puppeteer / Playwright (headless browser), or wget/HTTrack
- **API interception:** Puppeteer request interception, mitmproxy, or a custom proxy
- **Storage:** file system or SQLite
- **Serving:** Express.js, FastAPI, or a similar lightweight server
- **Language:** Node.js, Python, or Go
- Does the archived site work fully offline?
- Are server-side responses captured and replayed?
- Clean, readable, well-structured code
- Sensible tech choices with documented trade-offs
- Bonus: SPA support, a UI, export options, etc.
1. Source code in a Git repository
2. README with setup instructions
3. Architecture document — what approach you chose and why
4. Demo — screen recording or live demo showing: archive → shutdown → offline browsing
Timeframe: 5–7 working days
- Start with a simple `wget --mirror` approach to understand the baseline, then improve from there
- Puppeteer/Playwright with request interception is probably the most powerful approach for API capture
- Don't forget relative vs. absolute URL rewriting — this is where most archivers break
- Look at how the Wayback Machine, SingleFile, and HTTrack solve similar problems
- Edge case: inline `<style>` blocks with `url()` references
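The URL-rewriting tip above is worth illustrating: archived pages must not point back at the live origin. A minimal sketch that strips the origin prefix so same-origin references (in `src=`, `href=`, and CSS `url(...)` alike) become root-relative — regex-based for brevity, where a real tool should use an HTML/CSS parser:

```javascript
// Rewrite absolute same-origin URLs to root-relative paths so the archived
// page never reaches the live origin. Regex-based for brevity; a production
// tool should parse the HTML and CSS properly.
function rewriteToRelative(html, origin) {
  const esc = origin.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape origin for use in a regex
  return html.replace(new RegExp(esc, "g"), ""); // http://localhost:3777/x.css -> /x.css
}

const page = '<link href="http://localhost:3777/style.css">' +
             '<div style="background:url(http://localhost:3777/img/bg.svg)"></div>';
console.log(rewriteToRelative(page, "http://localhost:3777"));
// -> <link href="/style.css"><div style="background:url(/img/bg.svg)"></div>
```

Third-party URLs (the Google Fonts CDN, for instance) need the opposite treatment: rewriting *into* the archive's namespace rather than stripping, which is where the host-based layout from the mapping step pays off.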
Once your archiver is complete, submit your work for review.