Commit Graph

174 Commits (main)
 

Author SHA1 Message Date
cassanof a8e13b1b0f fix simple 10 months ago
Federico Cassano 0e45c6a115 Merge pull request #15 from noahshinn024/starchat 10 months ago
    Merge Starchat into main
Noah Shinn 807a06578c todo file 11 months ago
Noah Shinn 1c7367fb1c Start code parsing and instruction 11 months ago
    TODO:
    - remove func signature during evaluation
    - edit prompts for rust
    - add parse_rust_code
cassanof b9d2c54114 temp fix 12 months ago
cassanof a9d34708ad use right dtype 12 months ago
cassanof 42bcfe7c23 change name 12 months ago
cassanof 98bd65153a rem dangling import 12 months ago
cassanof 020e32f7bf fix ciruclar 12 months ago
cassanof e60072c524 move gen into class 12 months ago
cassanof 97d5190a7c reqs for starchat? 12 months ago
cassanof af90f4444d added model class 12 months ago
cassanof dbfc7c6a4f runner 12 months ago
Shunyu Yao f27481d8a3 Update README.md 1 year ago
Noah Shinn 9a9d7d8b2f demos 1 year ago
Noah Shinn 0f7a737015 add runscripts 1 year ago
Noah Shinn 6a8b75ccdd update citation 1 year ago
Beck LaBash d876c4cdb4 Put HotPotQA on top 1 year ago
Beck LaBash c2159d4b93 NBs and README 1 year ago
Beck LaBash e531a5c0d6 Organize notebooks 1 year ago
Noah Shinn 4924ce40f2 add logs 1 year ago
elleven11 245fd11901 move benchmarks to their place 1 year ago
Noah Shinn b6a324f78a start run instructions 1 year ago
Noah Shinn 34ab94a3b3 start run instructions 1 year ago
Beck LaBash 5942b44c41 HotPotQA runs 1 year ago
Noah Shinn 5269ef4ae0 start v2 1 year ago
elleven11 970c487d97 reinit submodules 1 year ago
elleven11 a98e92b20a reset submodule 1 year ago
Noah Shinn 4e42b24dab start v2 1 year ago
Noah Shinn 878a144a66 alfworld and webshop 1 year ago
Noah Shinn 3148695707 note about paper 1 year ago
Noah Shinn a0162a065d update leetcode hard gym link 1 year ago
Noah Shinn d2cdf66bc2 leetcode-hard gym repo 1 year ago
Noah Shinn 9a71c64882 leetcode-hard gym repo 1 year ago
Beck LaBash 5b6a1bd990 Merge branch 'py-prompts' 1 year ago
Beck LaBash 1eb65193d9 Lazy imports for leetcode 1 year ago
Noah Shinn 1a8a569211 prompts 1 year ago
Beck LaBash c272801db6 Log implementations and test case results 1 year ago
Noah Shinn 1dce1f7a90 rs hardest 50 results 1 year ago
Beck LaBash 94e7bf7d46 Prompts 1 year ago
Noah Shinn d92b66deb1 hardest 50 py results 1 year ago
elleven11 17cf55fa12 humaneval rs hard50 1 year ago
elleven11 56303a3f78 script 1 year ago
Noah Shinn 8a2fad33b1 humaneval py hardest 50 benchmark 1 year ago
elleven11 10ae3e53b2 fix rate 1 year ago
Beck LaBash 818fc53c89 Add back dynamic imports 1 year ago
Beck LaBash a774fb783f Merge branch 'main' of https://github.com/GammaTauAI/reflexion-human-eval-private 1 year ago
Beck LaBash 7572abfa8f Change timeout handling to error propagating thread 1 year ago
Beck LaBash 59e0e30942 Change timeout handling to error propagating thread 1 year ago
Beck LaBash 8ea84f3c49 fixes to leetexec 1 year ago