{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "683953b3",
   "metadata": {},
   "source": [
    "# Chroma\n",
    "\n",
    ">[Chroma](https://docs.trychroma.com/getting-started) is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.\n",
    "\n",
    "\n",
    "Install Chroma with:\n",
    "\n",
    "```sh\n",
    "pip install langchain-chroma\n",
    "```\n",
    "\n",
    "Chroma runs in various modes. See below for examples of each integrated with LangChain.\n",
    "- `in-memory` - in a Python script or Jupyter notebook\n",
    "- `in-memory with persistence` - in a script or notebook, with save/load to disk\n",
    "- `in a docker container` - as a server running on your local machine or in the cloud\n",
    "\n",
    "Like any other database, Chroma supports the following operations: \n",
    "- `.add` \n",
    "- `.get` \n",
    "- `.update`\n",
    "- `.upsert`\n",
    "- `.delete`\n",
    "- `.peek`\n",
    "- `.query`, which runs the similarity search.\n",
    "\n",
    "View the full docs at [docs](https://docs.trychroma.com/reference/Collection). To access these methods directly, you can do `._collection.method()`\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b5ffbf8",
   "metadata": {},
   "source": [
    "## Basic Example\n",
    "\n",
    "In this basic example, we take the most recent State of the Union Address, split it into chunks, embed it using an open-source embedding model, load it into Chroma, and then query it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "ae9fcf3e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
     ]
    }
   ],
   "source": [
    "# import\n",
    "from langchain_chroma import Chroma\n",
    "from langchain_community.document_loaders import TextLoader\n",
    "from langchain_community.embeddings.sentence_transformer import (\n",
    "    SentenceTransformerEmbeddings,\n",
    ")\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "# load the document\n",
    "loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
    "documents = loader.load()\n",
    "\n",
    "# split it into chunks\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)\n",
    "\n",
    "# create the open-source embedding function\n",
    "embedding_function = SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
    "\n",
    "# load it into Chroma\n",
    "db = Chroma.from_documents(docs, embedding_function)\n",
    "\n",
    "# query it\n",
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "docs = db.similarity_search(query)\n",
    "\n",
    "# print results\n",
    "print(docs[0].page_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c9a11cc",
   "metadata": {},
   "source": [
    "## Basic Example (including saving to disk)\n",
    "\n",
    "Extending the previous example, if you want to save to disk, initialize the Chroma client and pass the directory where you want the data to be saved as `persist_directory`. \n",
    "\n",
    "`Caution`: Chroma makes a best-effort attempt to automatically save data to disk; however, multiple in-memory clients can interfere with each other's work. As a best practice, have only one client per path running at any given time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "49f9bd49",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
     ]
    }
   ],
   "source": [
    "# save to disk\n",
    "db2 = Chroma.from_documents(docs, embedding_function, persist_directory=\"./chroma_db\")\n",
    "docs = db2.similarity_search(query)\n",
    "\n",
    "# load from disk\n",
    "db3 = Chroma(persist_directory=\"./chroma_db\", embedding_function=embedding_function)\n",
    "docs = db3.similarity_search(query)\n",
    "print(docs[0].page_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "63318cc9",
   "metadata": {},
   "source": [
    "## Passing a Chroma Client into LangChain\n",
    "\n",
    "You can also create a Chroma Client and pass it to LangChain. This is particularly useful if you want easier access to the underlying database.\n",
    "\n",
    "You can also specify the collection name that you want LangChain to use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "22f4a0ce",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Add of existing embedding ID: 1\n",
      "Add of existing embedding ID: 2\n",
      "Add of existing embedding ID: 3\n",
      "Add of existing embedding ID: 1\n",
      "Add of existing embedding ID: 2\n",
      "Add of existing embedding ID: 3\n",
      "Add of existing embedding ID: 1\n",
      "Insert of existing embedding ID: 1\n",
      "Add of existing embedding ID: 2\n",
      "Insert of existing embedding ID: 2\n",
      "Add of existing embedding ID: 3\n",
      "Insert of existing embedding ID: 3\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "There are 3 in the collection\n"
     ]
    }
   ],
   "source": [
    "import chromadb\n",
    "\n",
    "persistent_client = chromadb.PersistentClient()\n",
    "collection = persistent_client.get_or_create_collection(\"collection_name\")\n",
    "collection.add(ids=[\"1\", \"2\", \"3\"], documents=[\"a\", \"b\", \"c\"])\n",
    "\n",
    "langchain_chroma = Chroma(\n",
    "    client=persistent_client,\n",
    "    collection_name=\"collection_name\",\n",
    "    embedding_function=embedding_function,\n",
    ")\n",
    "\n",
    "print(\"There are\", langchain_chroma._collection.count(), \"in the collection\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9cf6d70",
   "metadata": {},
   "source": [
    "## Basic Example (using the Docker Container)\n",
    "\n",
    "You can also run the Chroma Server in a Docker container separately, create a Client to connect to it, and then pass that to LangChain. \n",
    "\n",
    "Chroma has the ability to handle multiple `Collections` of documents, but the LangChain interface expects one, so we need to specify the collection name. The default collection name used by LangChain is \"langchain\".\n",
    "\n",
    "Here is how to clone, build, and run the Docker Image:\n",
    "```sh\n",
    "git clone git@github.com:chroma-core/chroma.git\n",
    "```\n",
    "\n",
    "Edit the `docker-compose.yml` file and add `ALLOW_RESET=TRUE` under `environment`\n",
    "```yaml\n",
    "    ...\n",
    "    command: uvicorn chromadb.app:app --reload --workers 1 --host 0.0.0.0 --port 8000 --log-config log_config.yml\n",
    "    environment:\n",
    "      - IS_PERSISTENT=TRUE\n",
    "      - ALLOW_RESET=TRUE\n",
    "    ports:\n",
    "      - 8000:8000\n",
    "    ...\n",
    "```\n",
    "\n",
    "Then run `docker-compose up -d --build`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "74aee70e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
     ]
    }
   ],
   "source": [
    "# create the chroma client\n",
    "import uuid\n",
    "\n",
    "import chromadb\n",
    "from chromadb.config import Settings\n",
    "\n",
    "client = chromadb.HttpClient(settings=Settings(allow_reset=True))\n",
    "client.reset()  # resets the database\n",
    "collection = client.create_collection(\"my_collection\")\n",
    "for doc in docs:\n",
    "    collection.add(\n",
    "        ids=[str(uuid.uuid1())], metadatas=doc.metadata, documents=doc.page_content\n",
    "    )\n",
    "\n",
    "# tell LangChain to use our client and collection name\n",
    "db4 = Chroma(\n",
    "    client=client,\n",
    "    collection_name=\"my_collection\",\n",
    "    embedding_function=embedding_function,\n",
    ")\n",
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "docs = db4.similarity_search(query)\n",
    "print(docs[0].page_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ed3ec50",
   "metadata": {},
   "source": [
    "## Update and Delete\n",
    "\n",
    "While building toward a real application, you want to go beyond adding data, and also update and delete data. \n",
    "\n",
    "Chroma has users provide `ids` to simplify the bookkeeping here. `ids` can be the name of the file, or a combined hash like `filename_paragraphNumber`, etc.\n",
    "\n",
    "Chroma supports all these operations - though some of them are still being integrated all the way through the LangChain interface. Additional workflow improvements will be added soon.\n",
    "\n",
    "Here is a basic example showing how to do various operations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "81a02810",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'source': '../../../state_of_the_union.txt'}\n",
      "{'ids': ['1'], 'embeddings': None, 'metadatas': [{'new_value': 'hello world', 'source': '../../../state_of_the_union.txt'}], 'documents': ['Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.']}\n",
      "count before 46\n",
      "count after 45\n"
     ]
    }
   ],
   "source": [
    "# create simple ids\n",
    "ids = [str(i) for i in range(1, len(docs) + 1)]\n",
    "\n",
    "# add data\n",
    "example_db = Chroma.from_documents(docs, embedding_function, ids=ids)\n",
    "docs = example_db.similarity_search(query)\n",
    "print(docs[0].metadata)\n",
    "\n",
    "# update the metadata for a document\n",
    "docs[0].metadata = {\n",
    "    \"source\": \"../../modules/state_of_the_union.txt\",\n",
    "    \"new_value\": \"hello world\",\n",
    "}\n",
    "example_db.update_document(ids[0], docs[0])\n",
    "print(example_db._collection.get(ids=[ids[0]]))\n",
    "\n",
    "# delete the last document\n",
    "print(\"count before\", example_db._collection.count())\n",
    "example_db._collection.delete(ids=[ids[-1]])\n",
    "print(\"count after\", example_db._collection.count())"
   ]
  },
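  {
   "cell_type": "markdown",
   "id": "f3a91c20",
   "metadata": {},
   "source": [
    "The `filename_paragraphNumber` idea mentioned above could be sketched as follows. This is a minimal illustration only: `make_chunk_ids` is a hypothetical helper, not part of Chroma or LangChain, and it assumes one id per chunk, numbered from 1.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def make_chunk_ids(source_path, num_chunks):\n",
    "    # Hypothetical helper: combine the file stem with a 1-based chunk number.\n",
    "    stem = Path(source_path).stem\n",
    "    return [f\"{stem}_{i}\" for i in range(1, num_chunks + 1)]\n",
    "\n",
    "\n",
    "chunk_ids = make_chunk_ids(\"../../modules/state_of_the_union.txt\", 3)\n",
    "```\n",
    "\n",
    "A list built this way could be passed as `ids=` to `Chroma.from_documents`, in place of the plain counters used in the cell above."
   ]
  },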
  {
   "cell_type": "markdown",
   "id": "ac6bc71a",
   "metadata": {},
   "source": [
    "## Use OpenAI Embeddings\n",
    "\n",
    "Many people like to use OpenAIEmbeddings. Here is how to set that up."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "42080f37-8fd1-4cec-acd9-15d2b03b2f4d",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# get an API key: https://platform.openai.com/account/api-keys\n",
    "\n",
    "from getpass import getpass\n",
    "\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "\n",
    "OPENAI_API_KEY = getpass()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "c7a94d6c-b4d4-4498-9bdd-eb50c92b85c5",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "5eabdb75",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
     ]
    }
   ],
   "source": [
    "embeddings = OpenAIEmbeddings()\n",
    "new_client = chromadb.EphemeralClient()\n",
    "openai_lc_client = Chroma.from_documents(\n",
    "    docs, embeddings, client=new_client, collection_name=\"openai_collection\"\n",
    ")\n",
    "\n",
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "docs = openai_lc_client.similarity_search(query)\n",
    "print(docs[0].page_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d9c28ad",
   "metadata": {},
   "source": [
    "***\n",
    "\n",
    "## Other Information"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18152965",
   "metadata": {},
   "source": [
    "### Similarity search with score"
   ]
  },
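  {
   "cell_type": "markdown",
   "id": "346347d7",
   "metadata": {},
   "source": [
    "The returned score is a distance, so a lower score is better. By default, Chroma collections use squared L2 distance; cosine distance applies only if the collection is configured for it."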
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "72aaa9c8",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "docs = db.similarity_search_with_score(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "d88e958e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
       " 1.1972057819366455)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "docs[0]"
   ]
  },
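  {
   "cell_type": "markdown",
   "id": "c4e7a9d1",
   "metadata": {},
   "source": [
    "To make the lower-is-better reading concrete, here is a minimal pure-Python sketch of two common distance measures, squared L2 and cosine distance, on toy 2-D vectors. The helper names and vectors are hypothetical and are not part of Chroma or LangChain; which metric a collection actually uses depends on how it was configured.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def squared_l2(a, b):\n",
    "    # Sum of squared coordinate differences.\n",
    "    return sum((x - y) ** 2 for x, y in zip(a, b))\n",
    "\n",
    "\n",
    "def cosine_distance(a, b):\n",
    "    # One minus the cosine of the angle between the vectors.\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(x * x for x in a))\n",
    "    norm_b = math.sqrt(sum(y * y for y in b))\n",
    "    return 1.0 - dot / (norm_a * norm_b)\n",
    "\n",
    "\n",
    "# Toy unit vectors (made up for illustration).\n",
    "query_vec = [1.0, 0.0]\n",
    "close_vec = [0.8, 0.6]\n",
    "far_vec = [0.0, 1.0]\n",
    "\n",
    "close_scores = (squared_l2(query_vec, close_vec), cosine_distance(query_vec, close_vec))\n",
    "far_scores = (squared_l2(query_vec, far_vec), cosine_distance(query_vec, far_vec))\n",
    "```\n",
    "\n",
    "Under either measure, the vector pointing closer to the query gets the smaller number, which is why the best match carries the lowest score."
   ]
  },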
  {
   "cell_type": "markdown",
   "id": "794a7552",
   "metadata": {},
   "source": [
    "### Retriever options\n",
    "\n",
    "This section goes over different options for how to use Chroma as a retriever.\n",
    "\n",
    "#### MMR\n",
    "\n",
    "In addition to using similarity search in the retriever object, you can also use `mmr`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "96ff911a",
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever = db.as_retriever(search_type=\"mmr\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "f00be6d0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "retriever.invoke(query)[0]"
   ]
  },
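  {
   "cell_type": "markdown",
   "id": "b7d4e1a0",
   "metadata": {},
   "source": [
    "MMR (maximal marginal relevance) balances relevance to the query against redundancy with results already chosen. The following is a generic pure-Python sketch of that idea on toy vectors, not the code that LangChain or Chroma runs; `mmr_select`, its parameters, and the vectors are hypothetical.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def cosine_similarity(a, b):\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(x * x for x in a))\n",
    "    norm_b = math.sqrt(sum(y * y for y in b))\n",
    "    return dot / (norm_a * norm_b)\n",
    "\n",
    "\n",
    "def mmr_select(query_vec, candidate_vecs, k=2, lambda_mult=0.5):\n",
    "    # Greedily pick k candidate indices, trading relevance against redundancy.\n",
    "    selected = []\n",
    "    remaining = list(range(len(candidate_vecs)))\n",
    "    while remaining and len(selected) < k:\n",
    "        best_index, best_score = None, -math.inf\n",
    "        for i in remaining:\n",
    "            relevance = cosine_similarity(query_vec, candidate_vecs[i])\n",
    "            # Redundancy: highest similarity to anything already selected.\n",
    "            redundancy = max(\n",
    "                (cosine_similarity(candidate_vecs[i], candidate_vecs[j]) for j in selected),\n",
    "                default=0.0,\n",
    "            )\n",
    "            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy\n",
    "            if score > best_score:\n",
    "                best_index, best_score = i, score\n",
    "        selected.append(best_index)\n",
    "        remaining.remove(best_index)\n",
    "    return selected\n",
    "\n",
    "\n",
    "# Candidates 0 and 1 are near-duplicates; candidate 2 points elsewhere.\n",
    "query_vec = [1.0, 0.0]\n",
    "candidates = [[1.0, 0.1], [1.0, 0.12], [0.6, -0.8]]\n",
    "diverse_pick = mmr_select(query_vec, candidates, k=2, lambda_mult=0.3)\n",
    "similar_pick = mmr_select(query_vec, candidates, k=2, lambda_mult=1.0)\n",
    "```\n",
    "\n",
    "With `lambda_mult=0.3` the near-duplicate second vector is skipped in favor of the more different third one, while `lambda_mult=1.0` reduces the selection to plain similarity ranking."
   ]
  },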
  {
   "cell_type": "markdown",
   "id": "275dbd0a",
   "metadata": {},
   "source": [
    "### Filtering on metadata\n",
    "\n",
    "It can be helpful to narrow down the collection before working with it.\n",
    "\n",
    "For example, collections can be filtered on metadata using the `get` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "81600dc1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'ids': [], 'embeddings': None, 'metadatas': [], 'documents': []}"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# filter collection for updated source\n",
    "example_db.get(where={\"source\": \"some_other_source\"})"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								{
 								 "cells": [
 								  {
 								   "cell_type": "markdown",
 								   "id": "683953b3",
 								   "metadata": {},
 								   "source": [
 								    "# Chroma\n",
 								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    ">[Chroma](https://docs.trychroma.com/getting-started) is a AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.\n",
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "\n",
 								    "Install Chroma with:\n",
 								    "\n",
 								    "```sh\n",
-												docs: langchain-chroma package (#20394)


											
										
										
											2 months ago
+								    "pip install langchain-chroma\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "```\n",
 								    "\n",
 								    "Chroma runs in various modes. See below for examples of each integrated with LangChain.\n",
 								    "- `in-memory` - in a python script or jupyter notebook\n",
 								    "- `in-memory with persistance` - in a script or notebook and save/load to disk\n",
 								    "- `in a docker container` - as a server running your local machine or in the cloud\n",
 								    "\n",
 								    "Like any other database, you can: \n",
 								    "- `.add` \n",
 								    "- `.get` \n",
 								    "- `.update`\n",
 								    "- `.upsert`\n",
 								    "- `.delete`\n",
 								    "- `.peek`\n",
 								    "- and `.query` runs the similarity search.\n",
 								    "\n",
-												Fixed documentation (#10451)

It's ._collection, not ._collection_
											
										
										
											9 months ago
+								    "View full docs at [docs](https://docs.trychroma.com/reference/Collection). To access these methods directly, you can do `._collection.method()`\n"
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								   ]
 								  },
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								  {
 								   "cell_type": "markdown",
 								   "id": "2b5ffbf8",
 								   "metadata": {},
 								   "source": [
 								    "## Basic Example\n",
 								    "\n",
 								    "In this basic example, we take the most recent State of the Union Address, split it into chunks, embed it using an open-source embedding model, load it into Chroma, and then query it."
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								   "execution_count": 1,
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "id": "ae9fcf3e",
 								   "metadata": {},
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								   "outputs": [
 								    {
-												Fix update_document function, add test and documentation. (#5359)

# Fix for `update_document` Function in Chroma

## Summary
This pull request addresses an issue with the `update_document` function
in the Chroma class, as described in
[#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947).
The issue was identified as an `AttributeError` raised when calling
`update_document` due to a missing corresponding method in the
`Collection` object. This fix refactors the `update_document` method in
`Chroma` to correctly interact with the `Collection` object.

## Changes
1. Fixed the `update_document` method in the `Chroma` class to correctly
call methods on the `Collection` object.
2. Added the corresponding test `test_chroma_update_document` in
`tests/integration_tests/vectorstores/test_chroma.py` to reflect the
updated method call.
3. Added an example and explanation of how to use the `update_document`
function in the Jupyter notebook tutorial for Chroma.

## Test Plan
All existing tests pass after this change. In addition, the
`test_chroma_update_document` test case now correctly checks the
functionality of `update_document`, ensuring that the function works as
expected and updates the content of documents correctly.

## Reviewers
@dev2049

This fix will ensure that users are able to use the `update_document`
function as expected, without encountering the previous
`AttributeError`. This will enhance the usability and reliability of the
Chroma class for all users.

Thank you for considering this pull request. I look forward to your
feedback and suggestions.
											
										
										
											1 year ago
+								     "name": "stdout",
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								     "output_type": "stream",
 								     "text": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
 								      "\n",
 								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								     ]
 								    }
 								   ],
 								   "source": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# import\n",
-												docs: langchain-chroma package (#20394)


											
										
										
											2 months ago
+								    "from langchain_chroma import Chroma\n",
-												docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429)

ran 
```bash
g grep -l "langchain.vectorstores" | xargs -L 1 sed -i '' "s/langchain\.vectorstores/langchain_community.vectorstores/g"
g grep -l "langchain.document_loaders" | xargs -L 1 sed -i '' "s/langchain\.document_loaders/langchain_community.document_loaders/g"
g grep -l "langchain.chat_loaders" | xargs -L 1 sed -i '' "s/langchain\.chat_loaders/langchain_community.chat_loaders/g"
g grep -l "langchain.document_transformers" | xargs -L 1 sed -i '' "s/langchain\.document_transformers/langchain_community.document_transformers/g"
g grep -l "langchain\.graphs" | xargs -L 1 sed -i '' "s/langchain\.graphs/langchain_community.graphs/g"
g grep -l "langchain\.memory\.chat_message_histories" | xargs -L 1 sed -i '' "s/langchain\.memory\.chat_message_histories/langchain_community.chat_message_histories/g"
gco master libs/langchain/tests/unit_tests/*/test_imports.py
gco master libs/langchain/tests/unit_tests/**/test_public_api.py
```
											
										
										
											5 months ago
+								    "from langchain_community.document_loaders import TextLoader\n",
-												docs, community[patch], experimental[patch], langchain[patch], cli[pa… (#15412)

…tch]: import models from community

ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
											
										
										
											5 months ago
+								    "from langchain_community.embeddings.sentence_transformer import (\n",
 								    "    SentenceTransformerEmbeddings,\n",
 								    ")\n",
-												text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346)


											
										
										
											3 months ago
+								    "from langchain_text_splitters import CharacterTextSplitter\n",
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChain Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# load the document and split it into chunks\n",
-												Revised notebook and add delete to MyScale vector store (#11848)

- **Description:** 
  - Add `.delete` to myscale vector store. 
  - Revised vector store notebooks
- **Tag maintainer:** @baskaryan 
- **Twitter handle:** @myscaledb @mpsk_liu
											
										
										
											8 months ago
+								    "loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "documents = loader.load()\n",
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChain Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# split it into chunks\n",
 								    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
 								    "docs = text_splitter.split_documents(documents)\n",
-												docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChain Ecosystem` page to get installation instructions.)
											
										
										
											1 year ago
+								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# create the open-source embedding function\n",
 								    "embedding_function = SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
 								    "\n",
 								    "# load it into Chroma\n",
 								    "db = Chroma.from_documents(docs, embedding_function)\n",
 								    "\n",
 								    "# query it\n",
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
 								    "docs = db.similarity_search(query)\n",
 								    "\n",
 								    "# print results\n",
 								    "print(docs[0].page_content)"
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   ]
 								  },
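
The query step in the cell above can be pictured with a small, library-free sketch. It is only an illustration under stated assumptions: the three-dimensional vectors are made-up stand-ins for real embeddings, `cosine_similarity` and `top_k` are hypothetical helper names, and cosine similarity is chosen here for simplicity; the metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=4):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (assumed values, not produced by any model)
stored = [
    ("chunk about voting rights", [0.9, 0.1, 0.0]),
    ("chunk about the Supreme Court", [0.1, 0.9, 0.2]),
    ("chunk about the economy", [0.0, 0.2, 0.9]),
]
query_vec = [0.2, 0.8, 0.1]

results = top_k(query_vec, stored, k=2)
print(results[0][0])
```

In the notebook itself, the embedding function turns each chunk and the query into vectors, and `similarity_search` returns the closest chunks; the sketch only mirrors the ranking idea.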
 								  {
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "cell_type": "markdown",
 								   "id": "5c9a11cc",
 								   "metadata": {},
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   "source": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "## Basic Example (including saving to disk)\n",
 								    "\n",
 								    "Extending the previous example, if you want to save to disk, simply initialize the Chroma client and pass the directory where you want the data to be saved to. \n",
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								    "\n",
-												docs: typo (#17710)


											
										
										
											3 months ago
+								    "`Caution`: Chroma makes a best effort to automatically save data to disk; however, multiple in-memory clients can interfere with each other's work. As a best practice, run only one client per path at any given time."
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								   "execution_count": 2,
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "id": "49f9bd49",
 								   "metadata": {},
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   "outputs": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
 								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
 								      "\n",
 								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								     ]
 								    }
 								   ],
 								   "source": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# save to disk\n",
 								    "db2 = Chroma.from_documents(docs, embedding_function, persist_directory=\"./chroma_db\")\n",
-												docs(vectorstores/integrations/chroma): Fix loading and saving (#7437)

- Description: Fix loading and saving code about Chroma
- Issue: the issue #7436 
- Dependencies: -
- Twitter handle: https://twitter.com/ftnext
											
										
										
											11 months ago
+								    "docs = db2.similarity_search(query)\n",
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								    "\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# load from disk\n",
-												docs(vectorstores/integrations/chroma): Fix loading and saving (#7437)

- Description: Fix loading and saving code about Chroma
- Issue: the issue #7436 
- Dependencies: -
- Twitter handle: https://twitter.com/ftnext
											
										
										
											11 months ago
+								    "db3 = Chroma(persist_directory=\"./chroma_db\", embedding_function=embedding_function)\n",
 								    "docs = db3.similarity_search(query)\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "print(docs[0].page_content)"
 								   ]
 								  },
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								  {
 								   "cell_type": "markdown",
 								   "id": "63318cc9",
 								   "metadata": {},
 								   "source": [
 								    "## Passing a Chroma Client into LangChain\n",
 								    "\n",
 								    "You can also create a Chroma Client and pass it to LangChain. This is particularly useful if you want easier access to the underlying database.\n",
 								    "\n",
 								    "You can also specify the collection name that you want LangChain to use."
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": 3,
 								   "id": "22f4a0ce",
 								   "metadata": {},
 								   "outputs": [
 								    {
 								     "name": "stderr",
 								     "output_type": "stream",
 								     "text": [
 								      "Add of existing embedding ID: 1\n",
 								      "Add of existing embedding ID: 2\n",
 								      "Add of existing embedding ID: 3\n",
 								      "Add of existing embedding ID: 1\n",
 								      "Add of existing embedding ID: 2\n",
 								      "Add of existing embedding ID: 3\n",
 								      "Add of existing embedding ID: 1\n",
 								      "Insert of existing embedding ID: 1\n",
 								      "Add of existing embedding ID: 2\n",
 								      "Insert of existing embedding ID: 2\n",
 								      "Add of existing embedding ID: 3\n",
 								      "Insert of existing embedding ID: 3\n"
 								     ]
 								    },
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
 								      "There are 3 in the collection\n"
 								     ]
 								    }
 								   ],
 								   "source": [
 								    "import chromadb\n",
 								    "\n",
 								    "persistent_client = chromadb.PersistentClient()\n",
 								    "collection = persistent_client.get_or_create_collection(\"collection_name\")\n",
 								    "collection.add(ids=[\"1\", \"2\", \"3\"], documents=[\"a\", \"b\", \"c\"])\n",
 								    "\n",
 								    "langchain_chroma = Chroma(\n",
 								    "    client=persistent_client,\n",
 								    "    collection_name=\"collection_name\",\n",
 								    "    embedding_function=embedding_function,\n",
 								    ")\n",
 								    "\n",
 								    "print(\"There are\", langchain_chroma._collection.count(), \"in the collection\")"
 								   ]
 								  },
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								  {
 								   "cell_type": "markdown",
 								   "id": "e9cf6d70",
 								   "metadata": {},
 								   "source": [
 								    "## Basic Example (using the Docker Container)\n",
 								    "\n",
 								    "You can also run the Chroma Server in a Docker container separately, create a Client to connect to it, and then pass that to LangChain. \n",
 								    "\n",
 								    "Chroma has the ability to handle multiple `Collections` of documents, but the LangChain interface expects one, so we need to specify the collection name. The default collection name used by LangChain is \"langchain\".\n",
 								    "\n",
 								    "Here is how to clone, build, and run the Docker Image:\n",
-												Fix ChromaDB integration -> docker container instructions (#8447)

## Description
This PR handles modifying the Chroma DB integration's documentation.
It modifies the **Docker container** example to fix the instructions
mentioned in the documentation.
In the current documentation, the below `client.reset()` line causes a
runtime error:

```py
...
client = chromadb.HttpClient(settings=Settings(allow_reset=True))
client.reset()  # resets the database
collection = client.create_collection("my_collection")
...
```

`Exception: {"error":"ValueError('Resetting is not allowed by this
configuration')"}`

This is due to the Chroma DB server needing to have the `allow_reset`
flag set to `true` there as well.
This is fixed by adding `ALLOW_RESET=TRUE` to the environment variables
in the `docker-compose` file before spinning up the container.

## Issue
This fixes the runtime error that occurs when running the docker
container example code

## Tag Maintainer
@rlancemartin, @eyurtsev
											
										
										
											10 months ago
+								    "```sh\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "git clone git@github.com:chroma-core/chroma.git\n",
-												Fix ChromaDB integration -> docker container instructions (#8447)

											
										
										
											10 months ago
+								    "```\n",
 								    "\n",
 								    "Edit the `docker-compose.yml` file and add `ALLOW_RESET=TRUE` under `environment`\n",
 								    "```yaml\n",
 								    "    ...\n",
 								    "    command: uvicorn chromadb.app:app --reload --workers 1 --host 0.0.0.0 --port 8000 --log-config log_config.yml\n",
 								    "    environment:\n",
 								    "      - IS_PERSISTENT=TRUE\n",
 								    "      - ALLOW_RESET=TRUE\n",
 								    "    ports:\n",
 								    "      - 8000:8000\n",
 								    "    ...\n",
 								    "```\n",
 								    "\n",
 								    "Then run `docker-compose up -d --build`"
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								   "execution_count": 4,
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "id": "74aee70e",
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   "metadata": {},
 								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
 								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
 								      "\n",
 								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
 								     ]
 								    }
 								   ],
 								   "source": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "# create the chroma client\n",
 								    "import uuid\n",
-												DOCS: format notebooks (#13371)


											
										
										
											7 months ago
+								    "\n",
 								    "import chromadb\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "from chromadb.config import Settings\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
the API Reference internally, I removed that process. I also added some
Bash options to `.local_build.sh` so that it stops on error. Furthermore,
I added `cd "${SCRIPT_DIR}"` at the beginning so that the script works no
matter which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed because it is
generated by `docs/api_reference/create_api_rst.py`, and it was added to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											11 months ago
+								    "\n",
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								    "client = chromadb.HttpClient(settings=Settings(allow_reset=True))\n",
-												Fix `make docs_build` and related scripts (#7276)

											
										
										
											11 months ago
+								    "client.reset()  # resets the database\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "collection = client.create_collection(\"my_collection\")\n",
 								    "for doc in docs:\n",
-												Fix `make docs_build` and related scripts (#7276)

											
										
										
											11 months ago
+								    "    collection.add(\n",
 								    "        ids=[str(uuid.uuid1())], metadatas=doc.metadata, documents=doc.page_content\n",
 								    "    )\n",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "\n",
 								    "# tell LangChain to use our client and collection name\n",
-												notebook fmt (#12498)


											
										
										
											7 months ago
+								    "db4 = Chroma(\n",
 								    "    client=client,\n",
 								    "    collection_name=\"my_collection\",\n",
 								    "    embedding_function=embedding_function,\n",
 								    ")\n",
-												fix error in chroma docker instructions (#8533)

This makes the Chroma instructions for Docker work! 


https://python.langchain.com/docs/integrations/vectorstores/chroma#basic-example-using-the-docker-container
											
										
										
											10 months ago
+								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
 								    "docs = db4.similarity_search(query)\n",
-												Fix `make docs_build` and related scripts (#7276)

											
										
										
											11 months ago
+								    "print(docs[0].page_content)"
-												improve docs for indexes (#1146)


											
										
										
											1 year ago
+								   ]
 								  },
-												Harrison/similarity search chroma (#1434)

Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
											
										
										
											1 year ago
+								  {
 								   "cell_type": "markdown",
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "id": "9ed3ec50",
-												Harrison/similarity search chroma (#1434)

Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "source": [
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								    "## Update and Delete\n",
 								    "\n",
 								    "While building toward a real application, you will want to go beyond adding data and also update and delete it. \n",
 								    "\n",
 								    "Chroma has users provide `ids` to simplify the bookkeeping here. `ids` can be the name of the file, or a combined hash like `filename_paragraphNumber`, etc.\n",
 								    "\n",
 								    "Chroma supports all these operations - though some of them are still being integrated all the way through the LangChain interface. Additional workflow improvements will be added soon.\n",
 								    "\n",
 								    "Here is a basic example showing how to do various operations:"
 								   ]
 								  },
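
As a rough illustration of why stable `ids` simplify bookkeeping, the sketch below builds ids in the `filename_paragraphNumber` style mentioned above and uses a plain dictionary as a stand-in for a collection. The helper `make_ids`, the 1-based numbering, and the dictionary are assumptions made for illustration; they are not part of Chroma or LangChain.

```python
def make_ids(filename, paragraphs):
    """Build ids of the form filename_paragraphNumber (1-based, assumed convention)."""
    return [f"{filename}_{i}" for i in range(1, len(paragraphs) + 1)]


paragraphs = ["first paragraph", "second paragraph", "third paragraph"]
ids = make_ids("state_of_the_union.txt", paragraphs)

# A plain dict stands in for the collection: id -> document text
store = dict(zip(ids, paragraphs))

# Update: the id is stable, so the new text overwrites the old entry
store[ids[0]] = "first paragraph (revised)"

# Delete: remove the last paragraph by its id
del store[ids[-1]]

print(ids)
print(len(store))
```

Because each id is derived from the source file and position, re-processing the same file yields the same ids, which makes it straightforward to tell which entries to update or delete.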
 								  {
 								   "cell_type": "code",
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								   "execution_count": 5,
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								   "id": "81a02810",
 								   "metadata": {},
 								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
-												Update chroma notebook (#7978)

Fix up the Chroma notebook
- remove `.persist()` -- this is no longer in Chroma as of `0.4.0`
- update output to match `0.4.0`
- other cleanup work
											
										
										
											11 months ago
+								      "{'source': '../../../state_of_the_union.txt'}\n",
 								      "{'ids': ['1'], 'embeddings': None, 'metadatas': [{'new_value': 'hello world', 'source': '../../../state_of_the_union.txt'}], 'documents': ['Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.']}\n",
 								      "count before 46\n",
 								      "count after 45\n"
-												update chroma notebook (#6664)

@rlancemartin I updated the notebook for Chroma to hopefully be a lot
easier for users.

											
										
										
											12 months ago
+								     ]
 								    }
 								   ],
 								   "source": [
 								    "# create simple ids\n",
-												Fix `make docs_build` and related scripts (#7276)

											
										
										
											11 months ago
+								    "ids = [str(i) for i in range(1, len(docs) + 1)]\n",
+								    "\n",
 								    "# add data\n",
 								    "example_db = Chroma.from_documents(docs, embedding_function, ids=ids)\n",
 								    "docs = example_db.similarity_search(query)\n",
 								    "print(docs[0].metadata)\n",
 								    "\n",
 								    "# update the metadata for a document\n",
+								    "docs[0].metadata = {\n",
+								    "    \"source\": \"../../modules/state_of_the_union.txt\",\n",
+								    "    \"new_value\": \"hello world\",\n",
 								    "}\n",
+								    "example_db.update_document(ids[0], docs[0])\n",
 								    "print(example_db._collection.get(ids=[ids[0]]))\n",
 								    "\n",
 								    "# delete the last document\n",
 								    "print(\"count before\", example_db._collection.count())\n",
 								    "example_db._collection.delete(ids=[ids[-1]])\n",
+								    "print(\"count after\", example_db._collection.count())"
+								   ]
 								  },
+								  {
 								   "cell_type": "markdown",
+								   "id": "ac6bc71a",
+								   "metadata": {},
 								   "source": [
+								    "## Use OpenAI Embeddings\n",
 								    "\n",
 								    "Many people like to use `OpenAIEmbeddings`; here is how to set that up."
+								   ]
 								  },
+								  {
 								   "cell_type": "code",
+								   "execution_count": 6,
+								   "id": "42080f37-8fd1-4cec-acd9-15d2b03b2f4d",
+								   "metadata": {
 								    "tags": []
 								   },
+								   "outputs": [],
 								   "source": [
+								    "# get an API key: https://platform.openai.com/account/api-keys\n",
 								    "\n",
 								    "from getpass import getpass\n",
+								    "\n",
+								    "from langchain_openai import OpenAIEmbeddings\n",
+								    "\n",
 								    "OPENAI_API_KEY = getpass()"
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 7,
+								   "id": "c7a94d6c-b4d4-4498-9bdd-eb50c92b85c5",
+								   "metadata": {
 								    "tags": []
 								   },
+								   "outputs": [],
+								   "source": [
+								    "import os\n",
+								    "\n",
+								    "os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 8,
+								   "id": "5eabdb75",
 								   "metadata": {
 								    "tags": []
 								   },
+								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
+								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
 								      "\n",
 								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
+								     ]
 								    }
 								   ],
 								   "source": [
+								    "embeddings = OpenAIEmbeddings()\n",
+								    "new_client = chromadb.EphemeralClient()\n",
 								    "openai_lc_client = Chroma.from_documents(\n",
 								    "    docs, embeddings, client=new_client, collection_name=\"openai_collection\"\n",
 								    ")\n",
+								    "\n",
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+								    "docs = openai_lc_client.similarity_search(query)\n",
+								    "print(docs[0].page_content)"
+								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
+								   "id": "6d9c28ad",
+								   "metadata": {},
 								   "source": [
+								    "***\n",
 								    "\n",
 								    "## Other Information"
+								   ]
 								  },
 								  {
+								   "cell_type": "markdown",
 								   "id": "18152965",
+								   "metadata": {},
 								   "source": [
+								    "### Similarity search with score"
+								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
+								   "id": "346347d7",
+								   "metadata": {},
 								   "source": [
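+								    "The returned score is a distance, not a similarity, so a lower score is better. With Chroma's default collection settings the metric is squared L2 distance; it is cosine distance only when the collection is configured with `hnsw:space` set to `cosine`."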
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 9,
+								   "id": "72aaa9c8",
 								   "metadata": {
 								    "tags": []
 								   },
 								   "outputs": [],
 								   "source": [
 								    "docs = db.similarity_search_with_score(query)"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 10,
+								   "id": "d88e958e",
 								   "metadata": {
 								    "tags": []
 								   },
+								   "outputs": [
 								    {
+								     "data": {
 								      "text/plain": [
 								       "(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
+								       " 1.1972057819366455)"
+								      ]
 								     },
+								     "execution_count": 10,
+								     "metadata": {},
 								     "output_type": "execute_result"
+								    }
 								   ],
 								   "source": [
+								    "docs[0]"
+								   ]
 								  },
+								  {
 								   "cell_type": "markdown",
 								   "id": "794a7552",
 								   "metadata": {},
 								   "source": [
+								    "### Retriever options\n",
+								    "\n",
 								    "This section covers the options for using Chroma as a retriever.\n",
 								    "\n",
+								    "#### MMR\n",
+								    "\n",
 								    "Besides plain similarity search, the retriever object also supports maximal marginal relevance search via `search_type=\"mmr\"`."
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 11,
+								   "id": "96ff911a",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "retriever = db.as_retriever(search_type=\"mmr\")"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 12,
+								   "id": "f00be6d0",
 								   "metadata": {},
 								   "outputs": [
 								    {
 								     "data": {
 								      "text/plain": [
 								       "Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
 								      ]
 								     },
+								     "execution_count": 12,
+								     "metadata": {},
 								     "output_type": "execute_result"
 								    }
 								   ],
 								   "source": [
+								    "retriever.invoke(query)[0]"
+								   ]
+								  },
 								  {
 								   "cell_type": "markdown",
 								   "id": "275dbd0a",
 								   "metadata": {},
 								   "source": [
 								    "### Filtering on metadata\n",
 								    "\n",
 								    "It can be helpful to narrow down the collection before working with it.\n",
 								    "\n",
 								    "For example, collections can be filtered on metadata by passing a `where` clause to the `get` method."
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
+								   "execution_count": 13,
+								   "id": "81600dc1",
 								   "metadata": {},
 								   "outputs": [
 								    {
 								     "data": {
 								      "text/plain": [
+								       "{'ids': [], 'embeddings': None, 'metadatas': [], 'documents': []}"
+								      ]
 								     },
+								     "execution_count": 13,
+								     "metadata": {},
 								     "output_type": "execute_result"
 								    }
 								   ],
 								   "source": [
 								    "# filter collection for updated source\n",
 								    "example_db.get(where={\"source\": \"some_other_source\"})"
 								   ]
+								  }
 								 ],
 								 "metadata": {
 								  "kernelspec": {
 								   "display_name": "Python 3 (ipykernel)",
 								   "language": "python",
 								   "name": "python3"
 								  },
 								  "language_info": {
 								   "codemirror_mode": {
 								    "name": "ipython",
 								    "version": 3
 								   },
 								   "file_extension": ".py",
 								   "mimetype": "text/x-python",
 								   "name": "python",
 								   "nbconvert_exporter": "python",
 								   "pygments_lexer": "ipython3",
+								   "version": "3.10.10"
+								  }
 								 },
 								 "nbformat": 4,
 								 "nbformat_minor": 5
 								}