**UPDATE (3:58 AM CST)**: I've mirrored everything to R2 and updated the script to point to it. Note that the download command has changed (it uses a new version of the bash script), so you'll need to re-copy it from this README. The safety guarantees are the same for you in the end, though, and the bandwidth is still around 36 MB/s, which isn't too bad. I'm honestly too tired to update the rest of the README to reflect this slowdown; I'll just leave it the way it was for tonight. Please tweet on the [announcement thread](https://twitter.com/theshawwn/status/1632238214529400832) if anything breaks again, and I'll fix it again. </passes out>

**UPDATE (9:51 AM CST)**: HN user MacsHeadroom left a [valuable comment](https://news.ycombinator.com/item?id=35029766):

> I'm running LLaMA-65B on a single A100 80GB with 8-bit quantization. $1.5/hr on vast.ai
>
> The output is at least as good as davinci.
>
> I think some early results are using bad repetition penalty and/or temperature settings. I had to set both fairly high to get the best results. (Some people are also incorrectly comparing it to ChatGPT / the ChatGPT API, which is not a good comparison. But that's a different problem.)
>
> I've had it translate, write poems, tell jokes, banter, write executable code. It does it all-- and all on a single card.
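
For context on why that fits: at 8-bit precision, 65B parameters take roughly 65 GB of weight memory, which squeezes under an A100's 80 GB with room left over for activations and the KV cache. Below is a minimal sketch of such a setup using the Hugging Face transformers + bitsandbytes stack; the model path, the conversion of the torrent weights to HF format, and the sampling values are illustrative assumptions, not the commenter's exact configuration.

```python
# Sketch only: LLaMA-65B with int8 quantization on one 80 GB GPU.
# Assumes the weights were already converted to Hugging Face format
# under ./llama-65b-hf (the path and the conversion step are assumptions).
# Requires: pip install transformers accelerate bitsandbytes sentencepiece
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("./llama-65b-hf")
model = LlamaForCausalLM.from_pretrained(
    "./llama-65b-hf",
    load_in_8bit=True,   # bitsandbytes int8 weights: ~65 GB for 65B params
    device_map="auto",   # let accelerate place layers on the available GPU
)

inputs = tokenizer("Write a short poem about the sea.", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.9,         # "fairly high", per the comment above;
    repetition_penalty=1.2,  # the exact values here are illustrative
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```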

---

This repository contains a high-speed download of LLaMA, Facebook's 65B-parameter model that was recently made available via torrent. (Discussion: [Facebook LLAMA is being openly distributed via torrents](https://news.ycombinator.com/item?id=35007978))

It downloads all model weights (7B, 13B, 30B, 65B) at around 200 MB/s.
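
As a concrete illustration of the "safety guarantees" mentioned in the update above (every fetched file can be checked against a known digest), here is a minimal Python sketch of a download-plus-verify step. The URL, filename, and hash are placeholders, and SHA-256 is an assumed digest; the bash script in this repository is the supported way to download.

```python
# Sketch only: stream a file to disk while hashing it, then compare
# against a known digest. All values below are placeholders, not the
# repository's actual mirror URLs or checksums.
import hashlib
import urllib.request

URL = "https://example.com/llama/7B/consolidated.00.pth"  # placeholder URL
EXPECTED_SHA256 = "0" * 64                                # placeholder digest

def download_and_verify(url: str, dest: str, expected_sha256: str) -> None:
    sha = hashlib.sha256()
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as f:
        while chunk := resp.read(1 << 20):  # 1 MiB chunks
            f.write(chunk)
            sha.update(chunk)
    if sha.hexdigest() != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}")

download_and_verify(URL, "consolidated.00.pth", EXPECTED_SHA256)
```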
