**UPDATE (3:58 AM CST)**: I've mirrored everything to R2 and updated the script to point to it. Note that the download command has changed (it uses a new version of the bash script), so you'll need to re-copy it from this README. The safety guarantees are the same for you in the end, though, and the bandwidth is still around 36 MB/s, which isn't too bad. I'm honestly too tired to update the rest of the README to reflect this slowdown; I'll just leave it the way it was for tonight. Please tweet on the [announcement thread](https://twitter.com/theshawwn/status/1632238214529400832) if anything breaks again, and I'll fix it again. </passes out>

**UPDATE (9:51 AM CST)**: HN user MacsHeadroom left a [valuable comment](https://news.ycombinator.com/item?id=35029766):

> I'm running LLaMA-65B on a single A100 80GB with 8-bit quantization. $1.5/hr on vast.ai
>
> The output is at least as good as davinci.
>
> I think some early results are using bad repetition penalty and/or temperature settings. I had to set both fairly high to get the best results. (Some people are also incorrectly comparing it to ChatGPT / the ChatGPT API, which is not a good comparison. But that's a different problem.)
>
> I've had it translate, write poems, tell jokes, banter, write executable code. It does it all-- and all on a single card.
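
For context on why that fits: at 8-bit precision, 65B parameters take roughly 65 GB of weight memory, which squeezes under an A100's 80 GB with room left over for activations and the KV cache. Below is a minimal sketch of such a setup using the Hugging Face transformers + bitsandbytes stack; the model path, the conversion of the torrent weights to HF format, and the sampling values are illustrative assumptions, not the commenter's exact configuration.

```python
# Sketch only: LLaMA-65B with int8 quantization on one 80 GB GPU.
# Assumes the weights were already converted to Hugging Face format
# under ./llama-65b-hf (the path and the conversion step are assumptions).
# Requires: pip install transformers accelerate bitsandbytes sentencepiece
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("./llama-65b-hf")
model = LlamaForCausalLM.from_pretrained(
    "./llama-65b-hf",
    load_in_8bit=True,   # bitsandbytes int8 weights: ~65 GB for 65B params
    device_map="auto",   # let accelerate place layers on the available GPU
)

inputs = tokenizer("Write a short poem about the sea.", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.9,         # "fairly high", per the comment above;
    repetition_penalty=1.2,  # the exact values here are illustrative
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```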

---

This repository contains a high-speed download of LLaMA, Facebook's 65B-parameter model that was recently made available via torrent. (Discussion: [Facebook LLAMA is being openly distributed via torrents](https://news.ycombinator.com/item?id=35007978))

It downloads all model weights (7B, 13B, 30B, 65B) at around 200 MB/s.
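
As a concrete illustration of the "safety guarantees" mentioned in the update above (every fetched file can be checked against a known digest), here is a minimal Python sketch of a download-plus-verify step. The URL, filename, and hash are placeholders, and SHA-256 is an assumed digest; the bash script in this repository is the supported way to download.

```python
# Sketch only: stream a file to disk while hashing it, then compare
# against a known digest. All values below are placeholders, not the
# repository's actual mirror URLs or checksums.
import hashlib
import urllib.request

URL = "https://example.com/llama/7B/consolidated.00.pth"  # placeholder URL
EXPECTED_SHA256 = "0" * 64                                # placeholder digest

def download_and_verify(url: str, dest: str, expected_sha256: str) -> None:
    sha = hashlib.sha256()
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as f:
        while chunk := resp.read(1 << 20):  # 1 MiB chunks
            f.write(chunk)
            sha.update(chunk)
    if sha.hexdigest() != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}")

download_and_verify(URL, "consolidated.00.pth", EXPECTED_SHA256)
```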
