
Sibling posts are accurate about hard-coding /etc/hosts, although Sci-Hub needs to move to IPFS [1].

[1] https://ipfs.io/



I'd like to point out that Sci-Hub not only stores scientific papers; when necessary, it also retrieves the official PDFs through an institution's proxy if it can. IPFS could perhaps store the papers, but it could not provide that retrieval service (yet).

Until then, torrents of all the files that Sci-Hub has retrieved (and all files uploaded to LibGen's scimag section) can be downloaded here: http://libgen.io/scimag/repository_torrent_notforall/

Anyone can torrent these and then host them on IPFS without Sci-Hub's involvement. However, I suspect you'll have a hard time getting any replication for all of this. It's over 50 million files!


Thank you so much. My hope is that IPFS can serve as the underlying distributed object storage, and that we can then work up from there to an indexed, distributed search system on top of it (Elasticsearch within a Docker container, using versioned ES index backups in IPFS? With documents referenced by their IPFS content hash for de-duplication?).
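
To make the de-duplication idea concrete, here is a minimal sketch of content-addressed storage. It is only an illustration: the `ContentStore` class and its methods are hypothetical, and a plain SHA-256 hex digest stands in for a real IPFS content identifier (actual IPFS hashes are multihash-based and depend on how the file is chunked, so they will not match these keys).

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: identical bytes are kept only once."""

    def __init__(self):
        self.blobs = {}  # content key -> raw bytes
        self.index = {}  # human-readable name -> content key

    def add(self, name: str, data: bytes) -> str:
        # Assumption: SHA-256 hex digest as a stand-in for an IPFS hash.
        key = hashlib.sha256(data).hexdigest()
        # setdefault stores the bytes only if this key is new.
        self.blobs.setdefault(key, data)
        self.index[name] = key
        return key

    def get(self, name: str) -> bytes:
        return self.blobs[self.index[name]]


store = ContentStore()
k1 = store.add("paper-from-mirror-a.pdf", b"%PDF-1.4 same bytes")
k2 = store.add("paper-from-mirror-b.pdf", b"%PDF-1.4 same bytes")
```

Both names end up pointing at the same key, so the bytes are stored once. A search index would then only need to carry the key for each document, not the document itself.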


Does anybody know the total size of the repository?


Not AFAIK, but you could get a good idea by downloading all the torrent files and extracting the torrent size from the metadata. Or maybe just download one torrent (there are over 500 of them, one per 100k files), work out the average size per file, and multiply by the total number of files.
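
A rough sketch of that approach in Python, assuming the torrents are standard bencoded metainfo files (an `info` dictionary with either a single `length` or a `files` list whose entries each carry a `length`). The function names `bdecode`, `torrent_size`, and `estimate_total` are made up for this example, and the decoder does no validation of malformed input.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            value, i = bdecode(data, i)
            items.append(value)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", i)
    n = int(data[i:colon])
    start = colon + 1
    return data[start:start + n], start + n


def torrent_size(raw: bytes) -> int:
    """Total payload size in bytes described by one .torrent file."""
    meta, _ = bdecode(raw)
    info = meta[b"info"]
    if b"files" in info:  # multi-file torrent
        return sum(f[b"length"] for f in info[b"files"])
    return info[b"length"]  # single-file torrent


def estimate_total(sample_bytes: int, sample_files: int, total_files: int) -> int:
    """Scale one torrent's average file size up to the whole collection."""
    return sample_bytes * total_files // sample_files
```

To use it, read each downloaded `.torrent` in binary mode and pass the bytes to `torrent_size`, then sum the results. For the one-torrent shortcut, pass that torrent's size and file count to `estimate_total` along with the total number of files in the collection.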


~50TB


They should partition the papers into fields and sub-fields so that each person could easily host and mirror their field of interest.

This has the additional benefit of having the papers locally for easy indexing and searching.


I agree such metadata should be captured, but if the papers are converted to IPFS, it'll be easier to copy, ship, and then re-serve the data at endpoints.

Think of it as a [sneaker|dark]net CDN enabled by content-addressable storage.



