reflexion-human-eval/programming_runs/benchmarks
Noah Shinn e5c64d96f0 better messages 10 months ago
.DS_Store move benchmarks to their place 1 year ago
humaneval-py.jsonl better messages 10 months ago
humaneval-py_hardest50.jsonl move benchmarks to their place 1 year ago
humaneval-rs-hardest50.jsonl move benchmarks to their place 1 year ago
humaneval-rs-sorted.jsonl move benchmarks to their place 1 year ago
humaneval-rs.jsonl move benchmarks to their place 1 year ago
leetcode-hard-py.jsonl move benchmarks to their place 1 year ago
mbpp-py.jsonl move benchmarks to their place 1 year ago
mbpp-rs.jsonl move benchmarks to their place 1 year ago