Configuration
heyllm has no config files. Everything it needs comes from a couple of flags
and one environment variable — which is the whole point of a tiny, auditable
tool. There’s nothing to discover on disk, nothing to migrate, and nothing
hidden.
Authentication
Section titled “Authentication”heyllm needs an API key. It looks in two places, in order:
- The
--api-key <key>flag. - The
OPENAI_API_KEYenvironment variable.
The usual setup is to export the variable once (e.g. in your shell profile):
export OPENAI_API_KEY="sk-..."heyllm "hello"Pass --api-key to override it for a single call — handy when talking to a
provider other than OpenAI without changing your environment:
heyllm --api-key "$OPENROUTER_API_KEY" "hello"If no key is found by either route, heyllm prints a clear error to stderr and
exits with code 2. The key is sent as a standard Authorization: Bearer <key> header.
Choosing a model
Section titled “Choosing a model”The default model is gpt-4o-mini. Use -m / --model to pick another:
heyllm -m gpt-4o "summarize REST in 3 bullets"The model name is passed straight through to the API, so the valid values are
whatever your chosen provider accepts. With OpenRouter, for example, names are
namespaced like openai/gpt-4o-mini or anthropic/claude-3.5-sonnet; with a
local Ollama they’re whatever you’ve pulled (llama3, mistral, …).
OpenAI-compatible providers
Section titled “OpenAI-compatible providers”heyllm speaks the OpenAI Chat Completions wire format and posts to
<base-url>/chat/completions. The default base URL is
https://api.openai.com/v1, but --base-url lets it talk to anything that
implements the same API. Any trailing slashes on the base URL are normalized
before /chat/completions is appended.
OpenRouter
Section titled “OpenRouter”A single gateway in front of many providers — pass its base URL, your OpenRouter key, and a namespaced model name:
heyllm --base-url https://openrouter.ai/api/v1 \ --api-key "$OPENROUTER_API_KEY" \ -m openai/gpt-4o-mini "hi"Local Ollama
Section titled “Local Ollama”A local Ollama server exposes an OpenAI-compatible endpoint on /v1. It doesn’t
check the key, but heyllm still requires some value to be present — so set a
placeholder if you don’t already have OPENAI_API_KEY exported:
export OPENAI_API_KEY="ollama" # any non-empty valueheyllm --base-url http://localhost:11434/v1 -m llama3 "hi"Anything else
Section titled “Anything else”The same pattern works for any OpenAI-compatible endpoint — Azure OpenAI,
Together, Groq, a self-hosted vLLM, and so on. Point --base-url at the
provider’s /v1 root, supply the matching --api-key, and pick a model name it
recognizes.
Streaming behavior
Section titled “Streaming behavior”By default heyllm streams tokens as they arrive, parsed from the provider’s
server-sent events (data: lines, ignoring the final [DONE] sentinel). If a
provider doesn’t support streaming — or you simply prefer to wait for the whole
answer — pass --no-stream:
heyllm --no-stream --base-url https://some-provider/v1 "hello"Raw output for scripting
Section titled “Raw output for scripting”When you need the full API payload — for example to read token usage — use
--json to print the raw response instead of just the text. --json implies a
non-streaming request:
heyllm --json "ping" | jq '.usage'Configuration at a glance
Section titled “Configuration at a glance”| What | How | Default |
|---|---|---|
| API key | --api-key or $OPENAI_API_KEY | (none — required) |
| Model | -m / --model | gpt-4o-mini |
| Endpoint | --base-url | https://api.openai.com/v1 |
| System prompt | --system | (none) |
| Streaming | on by default; --no-stream to disable | on |