Commit Graph

257 Commits (main)

Author SHA1 Message Date
Jared Van Bortel 55d709862f Revert "typescript bindings maintenance (#2363)"
As discussed on Discord, this PR was not ready to be merged. CI fails on
it.

This reverts commit a602f7fde7.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
10 hours ago
Andreas Obersteiner a602f7fde7
typescript bindings maintenance (#2363)
* remove outdated comments

Signed-off-by: limez <limez@protonmail.com>

* simpler build from source

Signed-off-by: limez <limez@protonmail.com>

* update unix build script to create .so runtimes correctly

Signed-off-by: limez <limez@protonmail.com>

* configure ci build type, use RelWithDebInfo for dev build script

Signed-off-by: limez <limez@protonmail.com>

* add clean script

Signed-off-by: limez <limez@protonmail.com>

* fix streamed token decoding / emoji

Signed-off-by: limez <limez@protonmail.com>

* remove deprecated nCtx

Signed-off-by: limez <limez@protonmail.com>

* update typings

Signed-off-by: jacob <jacoobes@sern.dev>

update typings

Signed-off-by: jacob <jacoobes@sern.dev>

* readme,mspell

Signed-off-by: jacob <jacoobes@sern.dev>

* cuda/backend logic changes + name napi methods like their js counterparts

Signed-off-by: limez <limez@protonmail.com>

* convert llmodel example into a test, separate test suite that can run in ci

Signed-off-by: limez <limez@protonmail.com>

* update examples / naming

Signed-off-by: limez <limez@protonmail.com>

* update deps, remove the need for binding.ci.gyp, make node-gyp-build fallback easier testable

Signed-off-by: limez <limez@protonmail.com>

* make sure the assert-backend-sources.js script is published, but not the others

Signed-off-by: limez <limez@protonmail.com>

* build correctly on windows (regression on node-gyp-build)

Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* codespell

Signed-off-by: limez <limez@protonmail.com>

* make sure dlhandle.cpp gets linked correctly

Signed-off-by: limez <limez@protonmail.com>

* add include for check_cxx_compiler_flag call during aarch64 builds

Signed-off-by: limez <limez@protonmail.com>

* x86 > arm64 cross compilation of runtimes and bindings

Signed-off-by: limez <limez@protonmail.com>

* default to cpu instead of kompute on arm64

Signed-off-by: limez <limez@protonmail.com>

* formatting, more minimal example

Signed-off-by: limez <limez@protonmail.com>

---------

Signed-off-by: limez <limez@protonmail.com>
Signed-off-by: jacob <jacoobes@sern.dev>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
Co-authored-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
Co-authored-by: jacob <jacoobes@sern.dev>
15 hours ago
Jared Van Bortel 636307160e
backend: fix #includes with include-what-you-use (#2371)
Also fix a PARENT_SCOPE warning when building the backend.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 days ago
Jared Van Bortel 8ba7ef4832
dlhandle: suppress DLL errors on Windows (#2389)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 days ago
Jared Van Bortel 4e89a9c44f
backend: support non-ASCII characters in path to llmodel libs on Windows (#2388)
* backend: refactor dlhandle.h into oscompat.{cpp,h}

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* llmodel: alias std::filesystem

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* llmodel: use wide strings for paths on Windows

Using the native path representation allows us to manipulate paths and
call LoadLibraryEx without mangling non-ASCII characters.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* llmodel: prefer built-in std::filesystem functionality

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* oscompat: fix string type error

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* backend: rename oscompat back to dlhandle

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* dlhandle: fix #includes

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* dlhandle: remove another #include

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* dlhandle: move dlhandle #include

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* dlhandle: remove #includes that are covered by dlhandle.h

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* llmodel: fix #include order

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 days ago
Jared Van Bortel e94177ee9a
llamamodel: fix embedding crash for >512 tokens after #2310 (#2383)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 days ago
Jared Van Bortel f047f383d0
llama.cpp: update submodule for "code" model crash workaround (#2382)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 days ago
Jared Van Bortel f1b4092ca6
llamamodel: fix BERT tokenization after llama.cpp update (#2381)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 days ago
Jared Van Bortel c779d8a32d
python: init_gpu fixes (#2368)
* python: tweak GPU init failure message
* llama.cpp: update submodule for use-after-free fix

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 weeks ago
Jared Van Bortel 2025d2d15b
llmodel: add CUDA to the DLL search path if CUDA_PATH is set (#2357)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 weeks ago
Jared Van Bortel a92d266cea
cmake: fix Metal build after #2310 (#2350)
I don't understand why this is needed, but it works.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 weeks ago
Jared Van Bortel d2a99d9bc6
support the llama.cpp CUDA backend (#2310)
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 weeks ago
Jared Van Bortel 9f9d8e636f
backend: do not crash if GGUF lacks general.architecture (#2346)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 weeks ago
Jared Van Bortel 6d8888b267
llamamodel: free the batch in embedInternal (#2348)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 weeks ago
Jared Van Bortel 577ebd4826
mixpanel: report cpu_supports_avx2 on startup (#2299)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
1 month ago
Jared Van Bortel adaecb7a72
mixpanel: improved GPU device statistics (plus GPU sort order fix) (#2297)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
1 month ago
Jared Van Bortel c622921894
improve mixpanel usage statistics (#2238)
Other changes:
- Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them)
- LocalDocs scanQueue is now always deferred
- Fix a potential crash in magic_match
- LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
1 month ago
Jared Van Bortel ba53ab5da0
python: do not print GPU name with verbose=False, expose this info via properties (#2222)
* llamamodel: only print device used in verbose mode

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: expose backend and device via GPT4All properties

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* backend: const correctness fixes

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: bump version

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: typing fixups

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: fix segfault with closed GPT4All

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel ac498f79ac
fix regressions in system prompt handling (#2219)
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 3f8257c563
llamamodel: fix semantic typo in nomic client dynamic mode (#2216)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 459289b94c
embed4all: small fixes related to nomic client local embeddings (#2213)
* actually submit larger batches with increased n_ctx
* fix crash when llama_tokenize returns no tokens

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 1b84a48c47
python: add list_gpus to the GPT4All API (#2194)
Other changes:
* fix memory leak in llmodel_available_gpu_devices
* drop model argument from llmodel_available_gpu_devices
* breaking: make GPT4All/Embed4All arguments past model_name keyword-only

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 67843edc7c
backend: update llama.cpp submodule for wpm locale fix (#2163)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 83ada4ca89
backend: update llama.cpp submodule for Unicode paths fix (#2162)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 0455b80b7f
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:
* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel a1bb6084ed
python: documentation update and typing improvements (#2129)
Key changes:
* revert "python: tweak constructor docstrings"
* docs: update python GPT4All and Embed4All documentation
* breaking: require keyword args to GPT4All.generate

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 699410014a
fix non-AVX CPU detection (#2141)
* chat: fix non-AVX CPU detection on Windows
* bindings: throw exception instead of logging to console

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 255568fb9a
python: various fixes for GPT4All and Embed4All (#2130)
Key changes:
* honor empty system prompt argument
* current_chat_session is now read-only and defaults to None
* deprecate fallback prompt template for unknown models
* fix mistakes from #2086

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 53f109f519
llamamodel: fix macOS build (#2125)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 5c248dbec9
models: new MPT model file without duplicated token_embd.weight (#2006)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel c19b763e03
llmodel_c: expose fakeReply to the bindings (#2061)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel f500bcf6e5
llmodel: default to a blank line between reply and next prompt (#1996)
Also make some related adjustments to the provided Alpaca-style prompt templates
and system prompts.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 007d469034
bert: fix layer norm epsilon value (#1946)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Adam Treat f720261d46 Fix another vulnerable spot for crashes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
3 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
3 months ago
Jared Van Bortel e7f2ff189f fix some compilation warnings on macOS
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 88e330ef0e
llama.cpp: enable Kompute support for 10 more model arches (#2005)
These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM,
MiniCPM, Orion, Qwen, and StarCoder.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel fc6c5ea0c7
llama.cpp: gemma: allow offloading the output tensor (#1997)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 7810b757c9 llamamodel: add gemma model support
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Adam Treat d948a4f2ee Complete revamp of model loading to allow for more discreet control by
the user of the models loading behavior.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
3 months ago
Jared Van Bortel 6fdec808b2 backend: update llama.cpp for faster state serialization
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel a1471becf3 backend: update llama.cpp for Intel GPU blacklist
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel eb1081d37e cmake: fix LLAMA_DIR use before set
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel e60b388a2e cmake: fix backwards LLAMA_KOMPUTE default
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 92c025a7f6
llamamodel: add 12 new architectures for CPU inference (#1914)
Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2,
Plamo, Qwen, Qwen2, Refact, StableLM

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago