Commit Graph

35 Commits (c6229218943002ec8fdfd6f443b3715db889f690)

Jared Van Bortel c622921894
improve mixpanel usage statistics (#2238)
Other changes:
- Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them)
- LocalDocs scanQueue is now always deferred
- Fix a potential crash in magic_match
- LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
1 month ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
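A minimal sketch of the dynamic limiting described above, with hypothetical names (the real settings fields and model-metadata accessors differ):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical metadata; in practice these values come from the model
// file (e.g. the layer count and the trained context length).
struct ModelLimits {
    int32_t maxGpuLayers;     // e.g. n_layer + 1 for the output layer
    int32_t maxContextLength; // e.g. n_ctx_train
};

// Clamp the user-facing settings to what the loaded model supports.
int32_t clampGpuLayers(int32_t requested, const ModelLimits &m) {
    return std::clamp<int32_t>(requested, 0, m.maxGpuLayers);
}

int32_t clampContextLength(int32_t requested, const ModelLimits &m) {
    return std::clamp<int32_t>(requested, 8, m.maxContextLength); // 8 is an arbitrary floor
}
```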
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 6 months ago
Jared Van Bortel dfd8ef0186
backend: use ggml_new_graph for GGML backend v2 (#1719) 6 months ago
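For context, the migration this commit title refers to, in minimal form (a sketch, not the repo's actual diff): newer GGML allocates the compute graph from the context instead of value-initializing a `ggml_cgraph` on the stack.

```cpp
// Old pattern (stack-allocated graph), roughly:
//   struct ggml_cgraph gf = {};
//   ggml_build_forward_expand(&gf, result);
// Newer pattern:
struct ggml_cgraph *gf = ggml_new_graph(ctx0); // graph allocated from the ggml context
ggml_build_forward_expand(gf, result);
```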
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 6 months ago
cebtenzzre fd0c501d68
backend: support GGUFv3 (#1582) 7 months ago
Cebtenzzre 3c2aa299d8 gptj: remove unused variables 8 months ago
Cebtenzzre d5d72f0361 gpt-j: update inference to match latest llama.cpp insights
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
8 months ago
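Those bullets compress a fair amount of mechanism. A hedged sketch of the store-to-cache step in the llama.cpp style of that era (names and offsets illustrative, not the repo's exact code):

```cpp
// Destination views into the F16 cache for layer il, N new tokens at
// position n_past. Because kv.k/kv.v are GGML_TYPE_F16 tensors, the
// ggml_cpy below also converts the freshly computed F32 K/V on store.
struct ggml_tensor *k = ggml_view_1d(ctx0, kv.k, N*n_embd,
        ggml_element_size(kv.k)*n_embd*(il*n_ctx + n_past));
struct ggml_tensor *v = ggml_view_2d(ctx0, kv.v, N, n_embd,
        n_ctx*ggml_element_size(kv.v),                       // one n_ctx-wide row per embedding dim
        (il*n_ctx*n_embd + n_past)*ggml_element_size(kv.v)); // skip earlier layers and tokens

// K is stored as-is; V is stored transposed so the later V*softmax(QK^T)
// matmul reads the cache contiguously.
ggml_build_forward_expand(gf, ggml_cpy(ctx0, Kcur, k));
ggml_build_forward_expand(gf, ggml_cpy(ctx0, ggml_transpose(ctx0, Vcur), v));
```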
Cebtenzzre 050e7f076e backend: port GPT-J to GGUF 8 months ago
Adam Treat 6d03b3e500 Add starcoder support. 10 months ago
Aaron Miller 40a3faeb05
Use ggml scratch bufs for mpt and gptj models (#1104)
* backend/gptj: use scratch buffers

reduces total memory required and makes eval buf not grow with n_past

* backend/mpt: use scratch bufs

* fix format-related compile warnings
11 months ago
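The scratch-buffer pattern the bullets above refer to, sketched against the ggml API of that era (`ggml_set_scratch` has since been removed upstream; the buffer size is an illustrative assumption):

```cpp
#include <cstdlib>

// One persistent scratch buffer, allocated once at model load.
static const size_t scr0_size = 256u * 1024 * 1024; // illustrative size
static void *scr0 = std::malloc(scr0_size);

// Inside the eval function: route per-layer temporaries into the scratch
// buffer so the same memory is reused every layer instead of piling up
// in the context's eval buffer.
ggml_set_scratch(ctx0, { 0, scr0_size, scr0 });
// ... build this layer's intermediate tensors here ...
ggml_set_scratch(ctx0, { 0, 0, nullptr }); // restore normal allocation
```

Because the temporaries live in a fixed-size buffer that is overwritten each layer, the eval buffer no longer grows with n_past.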
Aaron Miller 8d19ef3909
backend: factor out common elements in model code (#1089)
* backend: factor out common structs in model code

prepping to hack on these; hopefully there will be fewer places to fix the same bug

* use common buffer wrapper instead of manual malloc

* fix replit compile warnings
11 months ago
Aaron Miller b19a3e5b2c add requiredMem method to llmodel impls
Most of these can just shortcut out of the model loading logic. llama is a bit worse to deal with because we submodule it, so I have to at least parse the hparams; then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway).
11 months ago
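A hedged sketch of the size-on-disk heuristic described here (the actual llmodel interface is more involved; `requiredMem` below is simplified):

```cpp
#include <cstddef>
#include <filesystem>
#include <string>
#include <system_error>

// Since the weights are mmap()ed, resident memory roughly tracks the
// file size, so the size on disk is a cheap, reasonable estimate.
size_t requiredMem(const std::string &modelPath) {
    std::error_code ec;
    auto sz = std::filesystem::file_size(modelPath, ec);
    return ec ? 0 : static_cast<size_t>(sz);
}
```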
Adam Treat bd58c46da0 Initialize these to nullptr to prevent double deletion when a model fails to load. 12 months ago
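The hazard this guards against, reduced to a sketch with illustrative types (not the actual classes):

```cpp
struct Tokenizer { /* ... */ };
struct Weights   { /* ... */ };

struct ModelState {
    Tokenizer *tokenizer = nullptr; // deleting nullptr is a no-op;
    Weights   *weights   = nullptr; // deleting an uninitialized pointer is UB
    ~ModelState() {
        delete tokenizer; // safe even if loading failed before these
        delete weights;   // were ever assigned
    }
};
```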
niansa/tuxifan 68f9786ed9
Use operator ""_MiB (#991) 12 months ago
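For reference, a user-defined literal of this shape looks roughly like the following; the repo's exact definition may differ:

```cpp
#include <cstddef>

constexpr std::size_t operator""_MiB(unsigned long long n) {
    return static_cast<std::size_t>(n) * 1024 * 1024;
}

static const std::size_t kEvalBufSize = 256_MiB; // clearer than 268435456
```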
Aaron Miller 88616fde7f
llmodel: change tokenToString to not use string_view (#968)
fixes a definite use-after-free and likely avoids some other
potential ones. std::string will convert to a std::string_view
automatically, but as soon as the std::string in question goes out of
scope it is freed and the string_view is left pointing at freed memory.
This is *mostly* fine if it's returning a reference to the tokenizer's
internal vocab table, but it's, imo, too easy to return a reference to a
dynamically constructed string with this API, as replit is doing (and
unfortunately needs to do, to convert the internal whitespace
replacement symbol back to a space)
12 months ago
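The dangling-view pattern described above, boiled down to a sketch with hypothetical function names:

```cpp
#include <string>
#include <string_view>

// BAD: the local std::string converts to std::string_view implicitly,
// then is destroyed when the function returns, leaving the view dangling.
std::string_view tokenToStringBad(int /*token*/) {
    std::string s = " "; // e.g. the whitespace symbol converted back to a space
    return s;            // caller receives a pointer into freed memory
}

// FIXED: return by value so a dynamically constructed string stays alive.
std::string tokenToStringGood(int /*token*/) {
    std::string s = " ";
    return s;
}
```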
Adam Treat 301d2fdbea Fix up for newer models on reset context. This prevents the model from failing completely after a context reset. 1 year ago
AT bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
1 year ago
niansa/tuxifan f3564ac6b9
Fixed tons of warnings and clazy findings (#811) 1 year ago
niansa/tuxifan d6a70ddb5f
Fixed model type for GPT-J (#815)
Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Adam Treat a41bd6ac0a Trying to shrink the copy+paste code and do more code sharing between backend model implementations. 1 year ago
niansa a3d08cdcd5 Dlopen better implementation management (Version 2) 1 year ago
AT 48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.
1 year ago
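A minimal sketch of the pluggable-backend idea (the `create_model` entry-point name is hypothetical; the real loader resolves versioned implementations and falls back between them):

```cpp
#include <dlfcn.h>
#include <cstdio>

typedef void *(*create_model_t)(const char *model_path);

void *loadPluggableBackend(const char *libPath, const char *modelPath) {
    void *handle = dlopen(libPath, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return nullptr;
    }
    // resolve the backend's entry point by name
    auto create = reinterpret_cast<create_model_t>(dlsym(handle, "create_model"));
    if (!create) {
        dlclose(handle);
        return nullptr;
    }
    return create(modelPath); // backend-specific model object
}
```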
Adam Treat 7f9f91ad94 Revert "New tokenizer implementation for MPT and GPT-J"
This reverts commit bbcee1ced5.
1 year ago
Aaron Miller bbcee1ced5 New tokenizer implementation for MPT and GPT-J
Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers`-based BPE tokenizers that these models were trained with.

Featuring:
 * Fixed unicode handling (via ICU)
 * Fixed BPE token merge handling
 * Complete added vocabulary handling
1 year ago
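For intuition, the core merge loop of such a BPE tokenizer looks roughly like this (a simplified sketch; the real implementation also handles unicode via ICU and the added vocabulary):

```cpp
#include <climits>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Repeatedly merge the adjacent pair with the lowest (highest-priority)
// learned merge rank until no learned merge applies.
std::vector<std::string> bpe(std::vector<std::string> pieces,
        const std::map<std::pair<std::string, std::string>, int> &ranks) {
    while (pieces.size() > 1) {
        int bestIdx = -1, bestRank = INT_MAX;
        for (size_t i = 0; i + 1 < pieces.size(); ++i) {
            auto it = ranks.find({pieces[i], pieces[i + 1]});
            if (it != ranks.end() && it->second < bestRank) {
                bestIdx = static_cast<int>(i);
                bestRank = it->second;
            }
        }
        if (bestIdx < 0)
            break; // no learned merge applies
        pieces[bestIdx] += pieces[bestIdx + 1];
        pieces.erase(pieces.begin() + bestIdx + 1);
    }
    return pieces;
}
```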
Adam Treat 9bfff8bfcb Add new reverse prompt for new localdocs context feature. 1 year ago
Juuso Alasuutari 81fdc28e58 llmodel: constify LLModel::threadCount() 1 year ago
aaron miller e6fd0a240d backend: fix buffer overrun in repeat penalty code
Caught with AddressSanitizer running a basic prompt test against llmodel
standalone. This fix allows ASan builds to complete a simple prompt
without illegal accesses, but notably there are still several leaks.
1 year ago
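The shape of such a fix, sketched (illustrative code, not the actual patch): the penalty loop must respect both the history window and the bounds of the logits array.

```cpp
#include <cstddef>
#include <vector>

void applyRepeatPenalty(std::vector<float> &logits,
        const std::vector<int> &history, size_t lastN, float penalty) {
    // Only look at the most recent lastN tokens, never past the start.
    size_t start = history.size() > lastN ? history.size() - lastN : 0;
    for (size_t i = start; i < history.size(); ++i) {
        int id = history[i];
        if (id < 0 || static_cast<size_t>(id) >= logits.size())
            continue; // bounds check: the token id must index into logits
        // Standard repetition penalty: push the logit toward "less likely".
        logits[id] = logits[id] > 0.0f ? logits[id] / penalty
                                       : logits[id] * penalty;
    }
}
```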
kuvaus 507e913faf
gpt4all-backend: Add MSVC support to backend (#595)
* Add MSVC compatibility

* Add _MSC_VER macro

---------

Co-authored-by: kuvaus <kuvaus@users.noreply.github.com>
1 year ago
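Typical of the `_MSC_VER` gating this adds (illustrative, not the exact change): compiler-specific intrinsics get an MSVC branch alongside the GCC/Clang one.

```cpp
#ifdef _MSC_VER
#include <intrin.h>
static int trailing_zeros(unsigned x) {
    unsigned long idx;
    _BitScanForward(&idx, x); // MSVC intrinsic
    return static_cast<int>(idx);
}
#else
static int trailing_zeros(unsigned x) {
    return __builtin_ctz(x); // GCC/Clang builtin
}
#endif
```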
Aaron Miller d14936bfd6 backend: dedupe tokenizing code in mpt/gptj 1 year ago
Aaron Miller 4cd8bdf9a1 backend: make initial buf_size const in model impls
more unification of the mpt and gptj code - this one is never written
to, so also changing the name to be clearer
1 year ago
Adam Treat d918b02c29 Move the llmodel C API to new top-level directory and version it. 1 year ago