langchain

Commit Graph

Author	SHA1	Message	Date
Harrison Chase	db2e9c2b0d	partial variables (#1308 )	1 year ago
Tim Asp	d22651d82a	Add new iFixit document loader (#1333 ) iFixit is a wikipedia-like site that has a huge amount of open content on how to fix things, questions/answers for common troubleshooting and "things" related content that is more technical in nature. All content is licensed under CC-BY-SA-NC 3.0. Adding docs from iFixit as context for user questions like "I dropped my phone in water, what do I do?" or "My macbook pro is making a whining noise, what's wrong with it?" can yield significantly better responses than context-free responses from LLMs.	1 year ago
Matt Robinson	c46478d70e	feat: document loader for image files (#1330 ) ### Summary Adds a document loader for image files such as `.jpg` and `.png` files. ### Testing Run the following using the example document from the [`unstructured` repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs). ```python from langchain.document_loaders.image import UnstructuredImageLoader loader = UnstructuredImageLoader("layout-parser-paper-fast.jpg") loader.load() ```	1 year ago
blob42	2fdb1d842b	refactoring into submodules	1 year ago
blob42	c30ef7dbc4	drop network capabilities by default, example on using networking	1 year ago
blob42	8a7871ece3	add exec_attached: attach to running container and exec cmd	1 year ago
blob42	201ecdc9ee	fix run and exec_run default commands, actually use gVisor - run and exec_run need a separate default command. Run usually executes a script while exec_run simulates an interactive session. The image templates and run funcs have been upgraded to handle both types of commands. - test: make docker tests run when docker is installed and docker lib is available. - test that runsc runtime is used by default when gVisor is installed. (manually removing gVisor skips the test)	1 year ago
blob42	149fe0055e	exec_run fixes to keep stdin open	1 year ago
blob42	87b5a84cfb	update tests and docstrings	1 year ago
blob42	ed97aa65af	exec_run: add timeout and delay params - use `delay` to wait for sent payload to finish - use `timeout` to control how long to wait for output	1 year ago
blob42	c9e6baf60d	image templates, enhanced wrapper building with custom parameters - quickly run or exec_run commands with sane defaults - wip image templates with parameters for common docker images - shell escaping logic - capture stdout+stderr for exec commands - added minimal testing	1 year ago
blob42	7cde1cbfc3	docker: attach to container's stdin - wip image helper for optimized params with common images - gVisor runtime checker - make tests skipped if docker installed	1 year ago
blob42	17213209e0	stream stdin and stdout to container through docker API's socket	1 year ago
blob42	895f862662	docker wrapper tool for untrusted execution	1 year ago
Harrison Chase	f61858163d	bump version to 0.0.95 (#1324 )	1 year ago
Harrison Chase	0824d65a5c	Harrison/indexing pipeline (#1317 )	1 year ago
Harrison Chase	166cda2cc6	Harrison/deeplake (#1316 ) Co-authored-by: Davit Buniatyan <d@activeloop.ai>	1 year ago
Harrison Chase	aaad6cc954	Harrison/atlas db (#1315 ) Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>	1 year ago
Marc Puig	3989c793fd	Making it possible to use "certainty" as a parameter for the weaviate similarity_search (#1218 ) Checking if weaviate similarity_search kwargs contains "certainty" and use it accordingly. The minimal level of certainty must be a float, and it is computed by normalized distance.	1 year ago
Alexander Hoyle	42b892c21b	Avoid IntegrityError for SQLiteCache updates (#1286 ) While using a `SQLiteCache`, if there are duplicate `(prompt, llm, idx)` tuples passed to [`update_cache()`](`c5dd491a21/langchain/llms/base.py (L39)`), then an `IntegrityError` is thrown. This can happen when there are duplicated prompts within the same batch. This PR changes the SQLAlchemy `session.add()` to a `session.merge()` in `cache.py`, [following the solution from this SO thread](https://stackoverflow.com/questions/10322514/dealing-with-duplicate-primary-keys-on-insert-in-sqlalchemy-declarative-style). I believe this fixes #983, but not entirely sure since that also involves async Here's a minimal example of the error: ```python from pathlib import Path import langchain from langchain.cache import SQLiteCache llm = langchain.OpenAI(model_name="text-ada-001", openai_api_key=Path("/.openai_api_key").read_text().strip()) langchain.llm_cache = SQLiteCache("test_cache.db") llm.generate(['a'] * 5) ``` ``` > IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: full_llm_cache.prompt, full_llm_cache.llm, full_llm_cache.idx [SQL: INSERT INTO full_llm_cache (prompt, llm, idx, response) VALUES (?, ?, ?, ?)] [parameters: ('a', "[('_type', 'openai'), ('best_of', 1), ('frequency_penalty', 0), ('logit_bias', {}), ('max_tokens', 256), ('model_name', 'text-ada-001'), ('n', 1), ('presence_penalty', 0), ('request_timeout', None), ('stop', None), ('temperature', 0.7), ('top_p', 1)]", 0, '\n\nA is for air.\n\nA is for atmosphere.')] (Background on this error at: https://sqlalche.me/e/14/gkpj) ``` After the change, we now have the following ```python class Output: def __init__(self, text): self.text = text # make dummy data cache = SQLiteCache("test_cache_2.db") cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0")]) cache.engine.execute("SELECT * FROM full_llm_cache").fetchall() # output > [('prompt_0', 'llm_0', 0, 'text_0')] ``` ```python # update data, before change this would have thrown an `IntegrityError` cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0_new")]) cache.engine.execute("SELECT * FROM full_llm_cache").fetchall() # output > [('prompt_0', 'llm_0', 0, 'text_0_new')] ```	1 year ago
Harrison Chase	81abcae91a	Harrison/banana fix (#1311 ) Co-authored-by: Erik Dunteman <44653944+erik-dunteman@users.noreply.github.com>	1 year ago
Ingo Kleiber	fd9975dad7	add CoNLL-U document loader (#1297 ) I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.	1 year ago
Harrison Chase	002da6edc0	ruff ruff (#1203 )	1 year ago
Harrison Chase	0963096491	fix imports (#1288 )	1 year ago
Matt Robinson	2f15c11b87	feat: document loader for MS Word documents (#1282 ) ### Summary Adds a document loader for MS Word Documents. Works with both `.docx` and `.doc` files as long as the user has installed `unstructured>=0.4.11`. ### Testing The following workflow tests the loader for both `.doc` and `.docx` files using example docs from the `unstructured` repo. #### `.docx` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.docx" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` #### `.doc` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.doc" loader = UnstructuredWordDocumentLoader(filename) loader.load() ```	1 year ago
Harrison Chase	96db6ed073	cleanup (#1274 )	1 year ago
Harrison Chase	7e8f832cd6	Harrison/cohere params (#1278 ) Co-authored-by: Stefano Faraggi <40745694+stepp1@users.noreply.github.com>	1 year ago
Harrison Chase	a8e88e1874	Harrison/logprobs (#1279 ) Co-authored-by: Prateek Shah <97124740+prateekspanning@users.noreply.github.com>	1 year ago
Harrison Chase	42167a1e24	Harrison/fb loader (#1277 ) Co-authored-by: Vairo Di Pasquale <vairo.dp@gmail.com>	1 year ago
Harrison Chase	bb53d9722d	Harrison/errors (#1276 ) Co-authored-by: Kevin Huo <5000881+kwhuo68@users.noreply.github.com>	1 year ago
Klein Tahiraj	8a0751dadd	adding .ipynb loader and documentation Fixes #1248 (#1252 ) `NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object. Parameters: * `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False). * `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10). * `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False). * `traceback` (bool): whether to include full traceback (default is False).	1 year ago
Harrison Chase	4b5d427421	Harrison/source docs (#1275 ) Co-authored-by: Tushar Dhadiwal <tushardhadiwal@users.noreply.github.com>	1 year ago
Enrico Shippole	9becdeaadf	Add Writer, Banana, Modal, StochasticAI (#1270 ) Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI Added rigid json format for Banana and Modal	1 year ago
blob42	5457d48416	searx: add `query_suffix` parameter (#1259 ) - allows building tools that dynamically inject an extra search suffix into the query. example: `search.run("python library", query_suffix="site:github.com")` resulting query: `python library site:github.com` Co-authored-by: blob42 <spike@w530>	1 year ago
Harrison Chase	9381005098	fix bug with length function (#1257 )	1 year ago
Iskren Ivov Chernev	8e3cd3e0dd	Add DeepInfra LLM support (#1232 ) DeepInfra is an Inference-as-a-Service provider. Add a simple wrapper using HTTPS requests.	1 year ago
Satoru Sakamoto	d480330fae	fix to specific language transcript (#1231 ) Currently youtube loader only seems to support English audio. Changed to load videos in the specified language.	1 year ago
Harrison Chase	6085fe18d4	add ifttt tool (#1244 )	1 year ago
Jon Luo	8a35811556	Don't instruct LLM to use the LIMIT clause, which is incompatible with SQL Server (#1242 ) The current prompt specifically instructs the LLM to use the `LIMIT` clause. This will cause issues with MS SQL Server, which uses `SELECT TOP` instead of `LIMIT`. The generated SQL will use `LIMIT`; the instruction to "always limit... using the LIMIT clause" seems to override the "create a syntactically correct mssql query to run" portion. Reported here: https://github.com/hwchase17/langchain/issues/1103#issuecomment-1441144224 I don't have access to a SQL Server instance to test, but removing that part of the prompt in OpenAI Playground results in the correct `SELECT TOP` syntax, whereas keeping it in results in the `LIMIT` clause, even when instructing it to generate syntactically correct mssql. It's also still correctly using `LIMIT` in my MariaDB database. I think in this case we can assume that the model will select the appropriate method based on the dialect specified. In general, it would be nice to be able to test a suite of SQL dialects for things like dialect-specific syntax and other issues we've run into in the past, but I'm not quite sure how to best approach that yet.	1 year ago
Dennis Antela Martinez	53c67e04d4	add aleph alpha llm (#1207 ) Integrate Aleph Alpha's client into Langchain to provide access to the luminous models - more info on latest benchmarks here: https://www.aleph-alpha.com/luminous-performance-benchmarks	1 year ago
Klein Tahiraj	c6ab1bb3cb	Fixing typo in loading.py (#1235 ) Just fixing a typo I found in loading.py	1 year ago
Jon Luo	ac1320aae8	fix sqlite internal tables breaking table_info (#1224 ) With the current method used to get the SQL table info, sqlite internal schema tables are being included and are not being handled correctly by sqlalchemy because the columns have no types. This is easy to see with the Chinook database: ```python db = SQLDatabase.from_uri("sqlite:///Chinook.db") print(db.table_info) ``` ```python ... sqlalchemy.exc.CompileError: (in table 'sqlite_sequence', column 'name'): Can't generate DDL for NullType(); did you forget to specify a type on this Column? ``` SQLAlchemy 2.0 [ignores these by default](`63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L856-L880)`): `63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L2096-L2123)`	1 year ago
djacobs7	4e28982d2b	Fix typo in constitutional_ai base.py (#1216 ) Found a typo in the documentation code for the constitutional_ai module	1 year ago
blob42	424e71705d	searx: remove duplicate param (#1219 ) Co-authored-by: blob42 <spike@w530>	1 year ago
Harrison Chase	5bdb8dd6fe	Harrison/unstructured io (#1200 )	1 year ago
Harrison Chase	d90a287d8f	Harrison/updating docs (#1196 )	1 year ago
Harrison Chase	b7708bbec6	rfc: callback changes (#1165 ) conceptually, no reason a tool should know what an "agent action" is unless any objections, can change in all callback handlers	1 year ago
Harrison Chase	fb83cd4ff4	catch networkx error (#1201 )	1 year ago
Harrison Chase	44c8d8a9ac	move serpapi wrapper (#1199 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	1 year ago
Konstantin Hebenstreit	af94f1dd97	HuggingFaceEndpoint: Correct Example for ImportError (#1176 ) When I try to import the Class HuggingFaceEndpoint I get an Import Error: cannot import name 'HuggingFaceEndpoint' from 'langchain'. (langchain version 0.0.88) These two imports work fine: from langchain import HuggingFacePipeline and from langchain import HuggingFaceHub. So I corrected the import statement in the example. There is probably a better solution to this, but this fixes the Error for me.	1 year ago
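
Commit `ac1320aae8` above describes SQLite's internal schema tables (such as `sqlite_sequence`) leaking into the table info. The following is a minimal sketch of the general idea of skipping those tables, using only the standard-library `sqlite3` module. It is not the code from that commit; the helper name `user_table_names` and the `artist` table are hypothetical and chosen for illustration.

```python
import sqlite3
from typing import List


def user_table_names(conn: sqlite3.Connection) -> List[str]:
    """Return table names, skipping SQLite's internal `sqlite_*` tables."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Names beginning with "sqlite_" are reserved for SQLite's own bookkeeping.
    return [name for (name,) in rows if not name.startswith("sqlite_")]


conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite create the internal `sqlite_sequence` table.
conn.execute(
    "CREATE TABLE artist (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

# Unfiltered listing, which includes the internal table.
all_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]
# Filtered listing, which keeps only the user-defined table.
names = user_table_names(conn)
```

Filtering by the reserved `sqlite_` prefix in Python avoids relying on SQL `LIKE`, where `_` is itself a single-character wildcard.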


483 Commits (main)
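
Commit `fd9975dad7` in the list above adds a loader that reads one file in standard CoNLL-U format and returns a document. As a rough illustration of what reading that format involves, here is a sketch that joins the FORM column (the second of the ten tab-separated fields) of each token line into plain text. This is an assumption about one reasonable approach, not the loader's actual implementation; the function name `conllu_to_text` and the sample sentence are hypothetical.

```python
from typing import List


def conllu_to_text(raw: str) -> str:
    """Join the FORM column of every token line into one space-separated string."""
    forms: List[str] = []
    for line in raw.splitlines():
        line = line.strip()
        # Blank lines separate sentences; lines starting with "#" are comments.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        token_id, form = columns[0], columns[1]
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in token_id or "." in token_id:
            continue
        forms.append(form)
    return " ".join(forms)


sample = (
    "# text = Hello world\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t1\tvocative\t_\t_\n"
    "\n"
)
text = conllu_to_text(sample)
```

Joining tokens with single spaces ignores the `SpaceAfter=No` hint that CoNLL-U can carry in its last column, so the result approximates the original text rather than reproducing it exactly.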