
glm-ocr - ollama.com
GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. The model integrates the CogViT visual encoder pre-trained on large …
deepseek-ocr - ollama.com
Nov 19, 2025 · DeepSeek-OCR requires Ollama v0.13.0 or later. DeepSeek-OCR is a vision-language model that can perform token-efficient optical character recognition (OCR). Example inputs …
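Vision models like this one are typically called through Ollama's HTTP API, where images are sent to the `/api/generate` endpoint as base64-encoded strings. A minimal sketch of building such a request payload, assuming the documented `model`/`prompt`/`images`/`stream` fields; the prompt text is illustrative, and `deepseek-ocr` is just one of the models on this page:

```python
import base64


def build_ocr_request(image_bytes: bytes, model: str = "deepseek-ocr") -> dict:
    """Build a JSON payload for Ollama's /api/generate endpoint.

    Images are passed as a list of base64-encoded strings in the
    `images` field; substitute any vision model you have pulled
    locally (e.g. glm-ocr) for the default model tag.
    """
    return {
        "model": model,
        "prompt": "Transcribe all text in this image.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON response, not a stream
    }
```

The payload would then be POSTed to `http://localhost:11434/api/generate` on a machine running Ollama.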
glm-ocr:q8_0 - ollama.com
glm-ocr:bf16 - ollama.com
yasserrmd/Nanonets-OCR-s - ollama.com
📄 Nanonets-OCR-s: a compact (3B-parameter) vision-language OCR model that turns document images into semantically rich Markdown, recognizing tables, LaTeX, checkboxes, signatures, watermarks, …
deepseek-ocr:3b - ollama.com
MedAIBase/PaddleOCR-VL - ollama.com
Jan 17, 2026 · PaddleOCR-VL is a state-of-the-art (SOTA) and resource-efficient model tailored for document parsing. Its core component is PaddleOCR-VL-0.9B, a compact yet powerful vision-language model (VLM).
qwen3.5 - ollama.com
Qwen 3.5 is a family of open-source multimodal models that delivers exceptional utility and performance.
Vision models · Ollama
deepseek-ocr: DeepSeek-OCR is a vision-language model that can perform token-efficient OCR.
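However a vision model is invoked, a non-streaming `/api/generate` call returns a JSON body whose `response` field carries the generated (here, recognized) text. A sketch of extracting it; the response body below is hand-written for illustration, not real model output, and real responses also include timing and token-count fields:

```python
import json

# Hypothetical /api/generate response body (not real model output).
raw = '{"model": "deepseek-ocr", "response": "Invoice #1234\\nTotal: $56.78", "done": true}'


def extract_text(body: str) -> str:
    """Pull the recognized text out of an Ollama generate response."""
    return json.loads(body)["response"]
```

Streaming responses (`"stream": true`) instead deliver one such JSON object per line, each with a partial `response` fragment.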