Getting Started
Installing Ollama
Go to the official Ollama website and download the latest version.
Ollama CLI
Once the tool is installed, you use the Ollama CLI from the terminal. It lets you download and run models.
ollama

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
Some Examples
ollama serve # Starts the Ollama server
ollama create model_name -f Modelfile # Creates a model from a Modelfile
ollama show model_name # Shows information for a specific model
ollama run model_name # Runs a model
ollama run gemma:2b
ollama run llama3
ollama pull model_name:tag # Pulls a model from the registry
ollama pull gemma:2b
ollama push username/model_name:tag # Pushes a model to a registry
ollama list # Lists all locally downloaded models
ollama ps # Lists all running models
ollama cp source_model_name destination_model_name # Copies a model
ollama rm model_name # Removes a model
Running in the Terminal
Once you start a model in the terminal, you can chat with it and it will respond to your messages. The interactive session also provides a set of slash commands for managing it; type /? to see a list of the available commands.
ollama run llama3
>>> /?
Available Commands:
  /set            Set session variables
  /show           Show model information
  /load <model>   Load a session or model
  /save <model>   Save your current session
  /clear          Clear session context
  /bye            Exit
  /?, /help       Help for a command
  /? shortcuts    Help for keyboard shortcuts

Use """ to begin a multi-line message.
>>>
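For example, you can wrap a prompt in triple quotes to send a multi-line message (a sketch of what such a session looks like; the prompt text is made up):

>>> """
... Summarize the following in one sentence:
... Ollama lets you run large language models locally.
... """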
Use Ctrl + d or /bye to exit.
Models
On the website you can browse the available models in the Ollama Models Library.
- mixtral:latest - 8x7B - 26 GB
- llama3:latest - 8B - 4.7 GB
- mistral:latest - 7B - 4.1 GB
- llama2:latest - 7B - 3.8 GB
- phi3:latest - 3.8B - 2.4 GB
Modelfiles
A Modelfile is the blueprint for creating and sharing models with Ollama. It contains everything needed to run a model, such as the base model to build on, parameters, the prompt template, and a system prompt; its syntax is inspired by the Dockerfile format. You can create a Modelfile from scratch or use an existing one as a template. Modelfiles let you customize a model for a specific purpose, similar to OpenAI's GPTs.
Many Modelfiles can be found on the OpenWebUI website.
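As a minimal sketch, a Modelfile could look like this (the base model, parameter value, and system prompt are just example choices):

# Modelfile
FROM llama3

# Sampling temperature (higher values are more creative)
PARAMETER temperature 0.7

# System prompt baked into the customized model
SYSTEM You are a concise assistant that answers in short bullet points.

You would then build and run the customized model with ollama create my-assistant -f Modelfile followed by ollama run my-assistant (my-assistant is a placeholder name).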
Interface API
The Ollama API is a RESTful API that allows you to interact with Ollama programmatically. You can use the API to create, run, and manage models, as well as to view results and logs. The API is designed to be easy to use and well-documented, making it simple to integrate Ollama into your existing workflows. The API can be accessed through the URL: http://localhost:11434
There are different endpoints available:
- http://localhost:11434/api/generate
- http://localhost:11434/api/chat
Simple examples of how to use them with curl. Note that curl's -d flag already implies a POST request, so the explicit -X POST in the second example is equivalent to the first:
curl http://localhost:11434/api/generate -d '{
"model": "mistral",
"prompt":"Why is the sky blue?"
}'
curl -X POST http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt": "Why is the sky blue?"
}'
curl http://localhost:11434/api/chat -d '{
"model": "mistral",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
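By default these endpoints stream the answer back as a series of JSON objects. If you want a single JSON response instead, set the stream field to false (shown here with the same generate endpoint):

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'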
Info
The difference between generate and chat is that generate produces a single response to a prompt, while chat supports a multi-turn conversation. The chat endpoint takes a messages array that holds the conversation history and can be used to simulate a dialogue.
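For example, a follow-up question can be asked by replaying the earlier turns in the messages array (the assistant's answer here is just a placeholder):

curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" },
    { "role": "assistant", "content": "Mainly because of Rayleigh scattering of sunlight." },
    { "role": "user", "content": "And why does it turn red at sunset?" }
  ]
}'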