I find SingleFile [0] to be a much more robust version of this. It strips out al...

tamnd · 2026-06-14T17:42:05 1781458925

It seems this repo only saves one web page?

What I'm implementing here is mirroring a whole website, with all its subpages, so you can browse it all offline. For example, all essays from paulgraham.com.
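To make "a whole website, with all its subpages" concrete, here is a minimal sketch of a same-origin, breadth-first crawl. It is not how this project is actually implemented: `mirror`, `normalize`, `fetchPage`, and `maxPages` are hypothetical names, and link extraction uses a naive regex where a real mirroring tool would use an HTML parser.

```javascript
// Hypothetical sketch: breadth-first crawl limited to the start URL's origin.
// `fetchPage` is an assumed callback (url -> Promise resolving to HTML text).
async function mirror(startUrl, fetchPage, maxPages = 100) {
  const origin = new URL(startUrl).origin;
  const start = normalize(startUrl);
  const pages = new Map(); // normalized URL -> HTML
  const queue = [start];
  const seen = new Set(queue);

  while (queue.length > 0 && pages.size < maxPages) {
    const url = queue.shift();
    const html = await fetchPage(url);
    pages.set(url, html);

    // Naive href extraction; a real crawler would parse the HTML properly.
    for (const match of html.matchAll(/<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi)) {
      let next;
      try {
        next = normalize(new URL(match[1], url).href);
      } catch {
        continue; // skip malformed URLs
      }
      // Stay on the same site and never queue a page twice.
      if (new URL(next).origin === origin && !seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return pages;
}

// Drop the fragment so "/a#top" and "/a" count as one page.
function normalize(url) {
  const u = new URL(url);
  u.hash = "";
  return u.href;
}
```

Passing the fetcher in as a callback keeps the traversal logic separate from networking, so it can be exercised against an in-memory map of pages.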

maxloh · 2026-06-14T19:17:02 1781464622

Oh, I see. In that case, feature-wise, it is actually a modern alternative to HTTrack.

I think the misunderstanding stems from the description's reference to the browser's "Save As". It is misleading: you use "Save As" to save a single page, not an entire website.

Also, the description lacks a clear explanation of the project's purpose. It would be helpful to include a sentence explaining that the program downloads an entire website, not just a single page.

nikisweeting · 2026-06-15T00:34:43 1781483683

SingleFile supports scoped recursive crawls too: https://github.com/gildas-lormeau/single-file-cli#:~:text=an...

I highly recommend reading the SingleFile source or https://archiveweb.page/ to see how they handle closed shadow DOMs, cross-origin iframes, WebSockets, media URLs, deduping large assets, etc.

sillysaurusx · 2026-06-14T23:58:08 1781481488

> For example, all essays from paulgraham.com

Not the same thing, but I made a clone of pg’s website which can be used for exactly that: https://github.com/shawwn/pg

https://shawwn.github.io/pg/

If you want to read all essays, just clone the repo and open any of the .html files. Or any of the .page files which generated them.

sdevonoes · 2026-06-14T18:00:40 1781460040

[flagged]

sermah · 2026-06-14T18:26:24 1781461584

Um. Whose website are you on right now?

ivangelion · 2026-06-14T18:41:10 1781462470

I don't come here to laugh, but it's always great when it happens anyway.

wamatt · 2026-06-14T20:24:01 1781468641

Love love love SingleFile too. The FF extension works pretty well for a clean save.

That said, Kage looks promising if OP can combine SingleFile's reproduction quality with HTTrack's spidering approach. SPAs are kind of tricky to archive, and I do wonder how well Kage would handle them.

initramfs · 2026-06-14T21:12:55 1781471575

I've seen that option in IE: .mhtml.

For some reason it displays better in IE, but I don't recall seeing this option in Chrome or Firefox recently.

tamnd · 2026-06-14T17:43:12 1781458992

And thanks for the link. Let me implement this single HTML feature, it looks nice to have!

maxloh · 2026-06-14T19:29:03 1781465343

Yeah. An idea on top of that is to bundle an entire website into a single HTML page, with vendored JavaScript to enable client-side routing (all of the original pages' JS is still stripped out).

That way, the page is self-contained as it is, but requires no bundled binary code to serve the site. It is actually safer security-wise.

The vendored script can be as simple as this:

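  // Map each internal link target to that page's full HTML.
  const site = {
    "path-1": "<!DOCTYPE html><html> ... </html>",
    "path-2": "<!DOCTYPE html><html> ... </html>",
    // More paths
  }

  function attachListeners() {
    for (const [path, html] of Object.entries(site)) {
      // Quote the attribute value, and handle every matching link (or none).
      const selector = `a[href="${CSS.escape(path)}"]`
      for (const link of document.querySelectorAll(selector)) {
        link.onclick = (event) => {
          // Stop the browser from navigating to the real URL.
          event.preventDefault()
          // outerHTML cannot be set on the root element, so swap its contents.
          document.documentElement.innerHTML = html
          attachListeners()
        }
      }
    }
  }

  document.addEventListener("DOMContentLoaded", attachListeners)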
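
One way to produce such a file is sketched below, under assumptions: `buildBundle`, `toInlineJs`, and `show` are hypothetical names, and the generator takes an already-crawled `{ path: html }` map. The detail that is easy to miss is that page HTML embedded in an inline script must have `<` escaped, otherwise a stored `</script>` would end the script element early.

```javascript
// Hypothetical helper: serialize a value for use inside an inline <script>.
// Escaping "<" keeps any "</script>" in the data from closing the element.
function toInlineJs(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

// Hypothetical generator: wrap a { path: html } map into one HTML file
// that swaps pages client-side instead of navigating.
function buildBundle(pages, startPath) {
  const router = `
const site = ${toInlineJs(pages)};
function show(path) {
  document.documentElement.innerHTML = site[path];
  for (const link of document.querySelectorAll("a[href]")) {
    const target = link.getAttribute("href");
    if (Object.hasOwn(site, target)) {
      link.onclick = (event) => { event.preventDefault(); show(target); };
    }
  }
}
document.addEventListener("DOMContentLoaded", () => show(${toInlineJs(startPath)}));
`;
  return `<!DOCTYPE html><html><head><meta charset="utf-8"><script>${router}</script></head><body></body></html>`;
}
```

Because the router is already running when the page contents are swapped, its handlers keep working, while any scripts inside the stored pages stay inert: markup inserted through `innerHTML` does not execute `<script>` elements.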

HelloUsername · 2026-06-14T18:13:32 1781460812

What's the difference from File -> Save As in any desktop web browser?

nmstoker · 2026-06-14T18:19:44 1781461184

That's for a single page, this handles the whole site. Also the browser Save As options often work poorly.

dmazzoni · 2026-06-14T20:40:07 1781469607

Save As works fine for simple websites with static content.

Let's say you have a site that fetches content from a database. If you Save As, then at best you'll get a local copy of an HTML page with JS that loads the content from the same remote database. It might not work (since the local copy has a different origin), or if it does, it requires you to be online, which defeats half of the purpose.

What this project, and SingleFile, both do is save a snapshot of what the rendered page actually looks like at that moment in time. The scripts are stripped out so it runs locally and has no external dependencies.
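
As a rough illustration of "the scripts are stripped out", here is a deliberately naive sketch that makes a serialized page inert. `stripScripts` is a hypothetical name, and this is not how either tool does it: real archivers work on the live DOM and also inline stylesheets, images, and fonts. The regexes here can misfire on unusual markup, or on plain text that happens to look like an `on...=` attribute.

```javascript
// Naive sketch: remove script-related markup from serialized HTML.
function stripScripts(html) {
  return html
    // Remove <script>...</script> elements, including their contents.
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, "")
    // Remove inline event handler attributes such as onload="...".
    .replace(/\son\w+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "");
}
```

In a browser, calling `stripScripts(document.documentElement.outerHTML)` after the page has finished rendering would capture the content the scripts already produced, which is why the saved copy no longer needs the remote database.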

arikrahman · 2026-06-14T20:24:52 1781468692

This is what I first thought and it's a very elegant solution, and not needlessly overcomplicated.