
The WeCareRemote AI assistant gives you control over which AI model powers your conversations and how you authenticate API requests. This page explains the configuration options available to you as a user or administrator.

Choosing your AI model

WeCareRemote supports a wide range of AI models from multiple providers. Your administrator sets the platform default, but you can override the model on individual API requests by passing the model field in your request body.

{
  "message": "What housing support is available?",
  "model": "claude-3-5-sonnet-20241022"
}

If no model is specified, the assistant uses the platform default configured for your organization. To see all available models, visit AI models supported by WeCareRemote.
For most use cases, GPT-4o and Claude 3.5 Sonnet deliver the best balance of quality and speed. If your organization requires data to remain on-premises, ask your administrator about enabling a locally hosted model via Ollama.
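
If you call the API from code, the override is a single extra field in the request body. The sketch below uses Python with the requests library; the endpoint URL is a placeholder assumption (check your instance's API reference for the real path), and the token is read from an environment variable rather than hardcoded.

import os

import requests

# Placeholder endpoint path -- an assumption, not a documented URL; check
# your instance's API reference for the real chat endpoint.
API_URL = "https://your-instance.example.com/api/v1/chat"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['WCR_API_TOKEN']}"},
    json={
        "message": "What housing support is available?",
        # Omit "model" to fall back to your organization's platform default.
        "model": "claude-3-5-sonnet-20241022",
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())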

API authentication

When accessing the AI assistant through the REST API, every request must include a Bearer token in the Authorization header.

Authorization: Bearer <your-token>

You obtain this token by logging in to WeCareRemote. See Authenticate API requests for a step-by-step guide to getting and using your token.
Keep your Bearer token secure. Never share it publicly, commit it to version control, or expose it in client-side code. Contact your WeCareRemote administrator if you believe your token has been compromised.
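
In practice, read the token from an environment variable or a secrets manager and attach it to your HTTP client once, so it never appears in source code. A minimal sketch in Python; the variable name WCR_API_TOKEN is an assumption, not a documented convention.

import os

import requests

# The variable name WCR_API_TOKEN is an assumption; use whatever your
# deployment's secrets conventions dictate.
token = os.environ["WCR_API_TOKEN"]

session = requests.Session()
session.headers["Authorization"] = f"Bearer {token}"

# Every request made through this session now carries the Bearer token.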

Supported LLM providers

WeCareRemote connects to the following AI providers. Your organization’s administrator controls which providers are active on your instance.
OpenAI
Provides GPT-4o, GPT-4o-mini, GPT-4 Turbo, and GPT-3.5 Turbo. Well-suited for general conversation, summarization, and document Q&A.

Anthropic
Provides Claude 3.5 Sonnet, Claude 3 Haiku, and Claude 3 Opus. Known for nuanced, instruction-following responses.

Google
Provides Gemini 1.5 Pro and Gemini 1.5 Flash via Google AI Studio or Vertex AI. Strong multimodal and long-context performance.

Groq
Provides fast inference for open-weight models including Llama 3.1 and Mixtral. Ideal for low-latency interactions.

Ollama
Runs open-weight models locally on your organization’s own hardware. No data leaves your environment — the recommended option for privacy-sensitive use cases.

Amazon Bedrock
Provides managed access to Anthropic Claude, Meta Llama, Amazon Titan, and other models through Amazon’s infrastructure.

Azure OpenAI
Provides GPT models through your organization’s Azure OpenAI resource, with enterprise-grade compliance and data residency controls.

OpenRouter
Aggregates models from many providers — including OpenAI, Anthropic, Google, Meta, and Mistral — through a single API.

DeepSeek
Provides capable reasoning and coding-focused models via the DeepSeek API.

OpenAI-compatible endpoints
Connect any API that follows the OpenAI API format — including self-hosted models, fine-tuned endpoints, or third-party providers not listed above. A quick compatibility check is sketched after this list.
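
As an illustration of the OpenAI-compatible option, the sketch below points the official openai Python package at a local Ollama instance, which serves an OpenAI-format API under /v1. The base URL and the llama3.1 model name are assumptions about your local setup; any endpoint that answers this call should also work as a custom provider.

from openai import OpenAI

# The localhost URL assumes a local Ollama instance (Ollama exposes an
# OpenAI-format API under /v1); substitute your own base URL and key.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

reply = client.chat.completions.create(
    model="llama3.1",  # assumes this model was pulled with `ollama pull llama3.1`
    messages=[{"role": "user", "content": "Reply with OK if you can read this."}],
)
print(reply.choices[0].message.content)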

Conversation history

The assistant stores your conversation history so you can pick up where you left off across sessions. Your organization’s administrator determines how long history is retained and which storage backend is used. If you have questions about data retention, contact your WeCareRemote administrator. For more details on how memory and history work, see Conversation memory and chat history.
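
The exact shape of the history API depends on your instance, but picking up a conversation typically means resending its identifier on follow-up requests. The sketch below is hypothetical: the endpoint path and the conversation_id field are assumptions for illustration, not confirmed API details.

import os

import requests

API_URL = "https://your-instance.example.com/api/v1/chat"  # placeholder URL
HEADERS = {"Authorization": f"Bearer {os.environ['WCR_API_TOKEN']}"}

# Open a conversation; assume the response includes an identifier for it.
# Both the field name and this flow are hypothetical.
first = requests.post(
    API_URL,
    headers=HEADERS,
    json={"message": "What housing support is available?"},
    timeout=30,
).json()

# Reuse the identifier so the assistant keeps the earlier context.
followup = requests.post(
    API_URL,
    headers=HEADERS,
    json={
        "conversation_id": first.get("conversation_id"),  # hypothetical field
        "message": "How do I apply for the first option you mentioned?",
    },
    timeout=30,
)
print(followup.json())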