Introduction
The rchroma package provides an R interface to ChromaDB, a vector database for storing and querying embeddings. This vignette demonstrates the basic usage of the package.
Installation
You can install the development version of rchroma from GitHub:
# install.packages("remotes")
remotes::install_github("cynkra/rchroma")
Installing ChromaDB
Before using rchroma, you need a running ChromaDB instance. The easiest way to get started is to run ChromaDB with Docker, which starts a ChromaDB server on http://localhost:8000.
For other installation methods and configuration options, please refer to the ChromaDB documentation.
Basic Usage
Connecting to ChromaDB
First, we need to establish a connection to ChromaDB:
library(rchroma)
# Connect to a local ChromaDB instance
client <- chroma_connect()
# Check the connection
heartbeat(client)
version(client)
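If the server is not running, the calls above cannot succeed. The following is a minimal sketch of a guard around them, assuming that a failed connection or heartbeat is signalled as a regular R error. The helper name chroma_is_reachable() is hypothetical and not part of rchroma; its defaults point at the chroma_connect() and heartbeat() functions shown above.

```r
# Hypothetical helper, not part of rchroma: returns TRUE if a connection can
# be opened and the heartbeat call succeeds, FALSE otherwise.
# Assumption: a failed connection or heartbeat raises an R error.
# The defaults are evaluated lazily, so rchroma is only looked up when no
# replacement function is supplied.
chroma_is_reachable <- function(connect = rchroma::chroma_connect,
                                ping = rchroma::heartbeat) {
  tryCatch(
    {
      client <- connect()
      ping(client)
      TRUE
    },
    error = function(e) {
      message("ChromaDB not reachable: ", conditionMessage(e))
      FALSE
    }
  )
}
```

Calling chroma_is_reachable() at the top of a script gives a clear message early instead of a failure in a later step.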
Managing Collections
Collections are the main way to organize your data in ChromaDB:
# Create a new collection
create_collection(client, "my_collection")
# List all collections
list_collections(client)
# Get a specific collection
get_collection(client, "my_collection")
Working with Documents
Documents are the basic unit of data in ChromaDB. Each document consists of text content, a unique ID, and an associated embedding:
# Define the documents and their embeddings
docs <- c(
  "apple fruit",
  "banana fruit",
  "carrot vegetable"
)
embeddings <- list(
  c(1.0, 0.0, 0.0), # apple
  c(0.8, 0.2, 0.0), # banana (similar to apple)
  c(0.0, 0.0, 1.0)  # carrot (different)
)
# Add documents to the collection
add_documents(
  client,
  "my_collection",
  documents = docs,
  ids = c("doc1", "doc2", "doc3"),
  embeddings = embeddings
)
# Query similar documents using embeddings
results <- query(
  client,
  "my_collection",
  query_embeddings = list(c(1.0, 0.0, 0.0)), # should match apple best
  n_results = 2
)
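To see why the query should rank apple first, the distances can be worked out by hand. The sketch below does this in base R without a server. It assumes the collection uses squared Euclidean distance, which is ChromaDB's default "l2" setting to our knowledge. The helper squared_l2() is illustrative and not part of rchroma.

```r
# Squared Euclidean distance between two numeric vectors.
# Assumption: the collection uses squared L2 distance.
squared_l2 <- function(a, b) sum((a - b)^2)

# Same toy embeddings as above, keyed by document ID
embeddings <- list(
  doc1 = c(1.0, 0.0, 0.0), # apple
  doc2 = c(0.8, 0.2, 0.0), # banana
  doc3 = c(0.0, 0.0, 1.0)  # carrot
)
query_embedding <- c(1.0, 0.0, 0.0)

# One distance per stored document
distances <- vapply(embeddings, squared_l2, numeric(1), b = query_embedding)

# Smallest distance first; keep the two nearest, mirroring n_results = 2
nearest <- names(sort(distances))[1:2]

print(round(distances, 2))
print(nearest)
```

Under this assumption the distances are 0 for apple, 0.08 for banana, and 2 for carrot, so doc1 and doc2 are the two nearest documents.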
Updating and Deleting
You can update or delete documents as needed:
# Update only the embedding of an existing document
update_documents(
  client,
  "my_collection",
  ids = "doc1",
  embeddings = list(c(0.9, 0.1, 0.0)) # slightly different from original apple
)
# Delete documents
delete_documents(client, "my_collection", ids = "doc2") # removes banana
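The effect of these two calls on a later query can be checked locally. The sketch below mirrors the update and the delete on a plain R list and re-ranks the remaining documents against the query vector c(1.0, 0.0, 0.0). As an assumption, it uses squared Euclidean distance. The list named store is only a stand-in for the collection and is not an rchroma object.

```r
# Squared Euclidean distance (assumed distance metric for this sketch)
squared_l2 <- function(a, b) sum((a - b)^2)

# Local stand-in for the collection contents before the calls above
store <- list(
  doc1 = c(1.0, 0.0, 0.0), # apple
  doc2 = c(0.8, 0.2, 0.0), # banana
  doc3 = c(0.0, 0.0, 1.0)  # carrot
)

store[["doc1"]] <- c(0.9, 0.1, 0.0) # mirrors update_documents()
store[["doc2"]] <- NULL             # mirrors delete_documents()

# Distances of the remaining documents to the query vector
remaining <- vapply(store, squared_l2, numeric(1), b = c(1.0, 0.0, 0.0))
ranked <- names(sort(remaining))

print(round(remaining, 2))
print(ranked)
```

The updated apple embedding is still the closest match at a distance of 0.02, and banana no longer appears because it was removed.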