
Ollama

Ollama allows you to run open-source large language models, such as Llama 3.1, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage. For a complete list of supported models and model variants, see the Ollama model library.

See this guide for more details on how to use Ollama with LangChain.

Installation and Setup

Ollama installation

Follow these instructions to set up and run a local Ollama instance.

Ollama starts as a background service automatically. If the background service is disabled, start the server manually:

# export OLLAMA_HOST=127.0.0.1:11434 # environment variable to set the ollama bind address (host:port)
ollama serve
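
To confirm that the server started, one option is to send a plain HTTP request to its root URL. The sketch below uses only the Python standard library and assumes the default address 127.0.0.1:11434 shown in the commented export above; the helper names ollama_base_url and is_ollama_running are hypothetical and not part of Ollama or LangChain.

```python
import urllib.request


def ollama_base_url(host: str = "127.0.0.1", port: int = 11434) -> str:
    """Build the HTTP base URL for a local Ollama server (assumed defaults)."""
    return f"http://{host}:{port}"


def is_ollama_running(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the server root answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
        return False
```

Calling is_ollama_running(ollama_base_url()) should return True once ollama serve is up, and False if nothing is listening on that address.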

After starting Ollama, run ollama pull <model_checkpoint> to download a model from the Ollama model library. For example:

ollama pull llama3.1

We're now ready to install the langchain-ollama partner package and run a model.

Ollama LangChain partner package install

Install the integration package with:

pip install langchain-ollama

LLM

from langchain_ollama.llms import OllamaLLM

See the notebook example here.
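
A minimal sketch of a text-completion call with OllamaLLM follows. It assumes a local Ollama server is running and that the llama3.1 model has been pulled; the helpers build_prompt and ask_ollama, and the prompt wording, are illustrative only.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in a short instruction (illustrative wording)."""
    return f"Answer concisely.\n\nQuestion: {question}\nAnswer:"


def ask_ollama(question: str, model: str = "llama3.1") -> str:
    """Send one prompt to a local Ollama model and return the completion text."""
    # Imported here so build_prompt stays usable before langchain-ollama is installed.
    from langchain_ollama.llms import OllamaLLM

    llm = OllamaLLM(model=model)  # assumes this model was pulled beforehand
    return llm.invoke(build_prompt(question))
```

With the server running, ask_ollama("Why is the sky blue?") returns the model's answer as a plain string.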

Chat Models

Chat Ollama

from langchain_ollama.chat_models import ChatOllama

See the notebook example here.
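
As a rough illustration of the chat interface, the sketch below sends a system message and a human message to ChatOllama and returns the reply text. It assumes a running local server with llama3.1 pulled; build_messages and translate are hypothetical helper names, and the translation task is only an example.

```python
def build_messages(text: str, target_language: str = "French") -> list[tuple[str, str]]:
    """Build a (role, content) message list for a simple translation request."""
    return [
        ("system", f"You are a translator. Translate the user's text into {target_language}."),
        ("human", text),
    ]


def translate(text: str, target_language: str = "French", model: str = "llama3.1") -> str:
    """Ask a local Ollama chat model to translate text and return the reply content."""
    # Imported here so build_messages stays usable before langchain-ollama is installed.
    from langchain_ollama.chat_models import ChatOllama

    chat = ChatOllama(model=model, temperature=0)
    response = chat.invoke(build_messages(text, target_language))
    return response.content
```

Unlike OllamaLLM, the chat model returns a message object, so the text is read from its content attribute.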

Ollama tool calling

Ollama tool calling uses the OpenAI-compatible web server specification and works with the default BaseChatModel.bind_tools() method, as described here. Make sure to select an Ollama model that supports tool calling.
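
The sketch below shows one way the bind_tools() flow might look: a plain Python function with type hints and a docstring is bound as a tool, and any tool calls the model returns are executed locally. It assumes llama3.1 is pulled and supports tool calling; multiply, run_tool_calls, and ask_with_tools are hypothetical names chosen for this example.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the product."""
    return a * b


def run_tool_calls(tool_calls: list[dict]) -> list[int]:
    """Execute each requested tool call using the model-supplied arguments."""
    tools = {"multiply": multiply}
    return [tools[call["name"]](**call["args"]) for call in tool_calls]


def ask_with_tools(question: str, model: str = "llama3.1") -> list[int]:
    """Let the model decide whether to call multiply, then run the calls it requests."""
    # Imported here so the helpers above stay usable before langchain-ollama is installed.
    from langchain_ollama.chat_models import ChatOllama

    chat = ChatOllama(model=model, temperature=0).bind_tools([multiply])
    response = chat.invoke(question)
    # Each entry in response.tool_calls carries the tool "name" and its "args".
    return run_tool_calls(response.tool_calls)
```

The model only proposes the call; running the function and, if needed, sending the result back to the model remains the application's job. If the model answers directly without requesting a tool, the returned list is empty.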

Embedding models

from langchain_ollama.embeddings import OllamaEmbeddings

See the notebook example here.
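
As an illustration of what the embeddings can be used for, the sketch below embeds a query and a few documents and ranks the documents by cosine similarity. It imports OllamaEmbeddings from the langchain-ollama partner package installed earlier and assumes a running local server with llama3.1 pulled; cosine_similarity and rank_documents are hypothetical helpers written for this example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine similarity of two vectors, or 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query: str, documents: list[str], model: str = "llama3.1") -> list[tuple[float, str]]:
    """Embed the query and documents with Ollama and sort documents by similarity."""
    # Imported here so cosine_similarity stays usable before langchain-ollama is installed.
    from langchain_ollama.embeddings import OllamaEmbeddings

    embedder = OllamaEmbeddings(model=model)
    query_vector = embedder.embed_query(query)
    doc_vectors = embedder.embed_documents(documents)
    scored = [(cosine_similarity(query_vector, vec), doc) for vec, doc in zip(doc_vectors, documents)]
    return sorted(scored, reverse=True)  # highest similarity first
```

In practice a vector store would usually handle the similarity search; the manual ranking here only shows what embed_query and embed_documents return.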

