Models

Playto reads on-screen game text with OCR (or image recognition for stylized fonts), then translates it with the model you choose. You pick the model during setup and can switch or download additional models from Settings at any time.

Reading: OCR

| Engine | Default | Availability | Notes |
|---|---|---|---|
| WinRT OCR | Yes | Windows 10 / 11, built into the OS | No extra download. Handles standard game text reliably. |
| OneOCR (Snipping Tool) | No | Windows 11; uses the OCR engine bundled with the Snipping Tool app | Alternative engine for cases where WinRT OCR struggles. Switch in Settings → Engine. |

For scenes where OCR struggles (stylized fonts, decorative text, handwritten styles), the vision-capable LLM families below, Qwen3-VL and Gemma 4, also support an image-recognition mode in which the model reads the game screen directly instead of going through OCR.

Translation: LLM or NMT

For translation, Playto offers two distinct approaches you can switch between:

| Approach | Strengths | Trade-offs |
|---|---|---|
| LLM | Rich, context-aware translation. Vision-capable families support image recognition for stylized fonts. Works across language pairs. | Larger download (~3 GB). Heavier on VRAM. |
| NMT | Lightweight (hundreds of MB), faster on low-end hardware, minimal VRAM footprint. | Only specific language pairs are available. Text-only, with no image recognition. |

Neither is strictly better; the right choice depends on your hardware and the language pair you're translating between.

How you choose

On first launch, the Setup wizard presents two paths:

  • Recommended — Playto picks the best fit for your selected language pair (NMT if it's supported, otherwise the matching LLM family).
  • Custom — choose from cards showing each model with size, VRAM requirement, and what it's good for.
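
The Recommended path can be pictured as a small lookup. The sketch below is illustrative only and is not Playto's code: the function name, the language codes, and the rule that every non-CJK source falls back to Gemma 4 are assumptions. The Opus-MT pairs and the recommended sizes are taken from the Model families table.

```python
# Hypothetical sketch of the "Recommended" path; not Playto's actual implementation.

# Opus-MT pairs as (source, target), per the Model families table:
# JA<->EN in both directions, and EN -> ES/FR/IT/PT/DE/PL one way.
NMT_PAIRS = {("ja", "en"), ("en", "ja")} | {
    ("en", target) for target in ("es", "fr", "it", "pt", "de", "pl")
}

# Source languages the table assigns to Qwen3-VL.
CJK_SOURCES = {"ja", "zh", "ko"}


def recommend(source: str, target: str) -> str:
    """Return a suggested model for a language pair (illustrative only)."""
    pair = (source.lower(), target.lower())
    if pair in NMT_PAIRS:
        return "Opus-MT"       # NMT when the pair is supported
    if pair[0] in CJK_SOURCES:
        return "Qwen3-VL 4B"   # recommended size for CJK sources
    # Assumption: all remaining sources are treated as Latin-script.
    return "Gemma 4 E2B"


print(recommend("ja", "en"))
print(recommend("zh", "en"))
print(recommend("es", "en"))
```

Because EN→ES is listed in one direction only, the reverse pair in this sketch falls through to an LLM family rather than to Opus-MT.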

After setup, you can switch models or download additional ones at any time from Settings → Engine. Multiple models can coexist on disk; switching between downloaded models is instant.

Model families

| Family | Mode | Best for | Sizes |
|---|---|---|---|
| Qwen3-VL | LLM (vision-capable) | CJK source: Japanese, Chinese, Korean | 2B / 4B (recommended) / 8B |
| Gemma 4 | LLM (vision-capable) | Latin-script source: English, European languages | E2B (recommended) / E4B |
| Hy-MT2 | LLM (text-only) | Multilingual translation, no image support | 1.8B / 7B |
| Opus-MT | NMT | Specific pairs: JA↔EN, EN→ES/FR/IT/PT/DE/PL | Hundreds of MB per pair |

Vision-capable families can read the game screen directly when OCR struggles — for example, when a game uses stylized fonts or decorative text.
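
For readers who prefer data to prose, the families can be written down as records and filtered by capability. This is a sketch for illustration: the `Family` class and its field names are hypothetical, and only the values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """One row of the model families table (field names are hypothetical)."""
    name: str
    kind: str      # "LLM" or "NMT"
    vision: bool   # True if the family can read the game screen directly
    sizes: tuple   # size labels; empty for per-pair NMT packages


FAMILIES = (
    Family("Qwen3-VL", "LLM", True, ("2B", "4B", "8B")),
    Family("Gemma 4", "LLM", True, ("E2B", "E4B")),
    Family("Hy-MT2", "LLM", False, ("1.8B", "7B")),
    Family("Opus-MT", "NMT", False, ()),
)


def vision_capable() -> list:
    """Names of families that support image-recognition mode."""
    return [family.name for family in FAMILIES if family.vision]


print(vision_capable())
```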

Where models come from

All curated models are downloaded directly from Hugging Face, a widely used public hub for hosting open-source AI models. Specific upstream sources:

  • Qwen3-VL — official Qwen repository (huggingface.co/Qwen/)
  • Gemma 4 — Google's original models, in GGUF quantizations published by the community contributor bartowski (huggingface.co/bartowski/)
  • Hy-MT2 — Tencent's official repository (huggingface.co/tencent/)
  • Opus-MT — upstream from Helsinki-NLP, repackaged in CTranslate2 format under huggingface.co/playto-mt/

Exact download URLs are part of Playto's open manifest, visible in the application source code — there are no hidden downloads or proxies.

All curated models ship under permissive open licenses:

  • Hy-MT2 — Apache 2.0
  • Qwen3-VL — Apache 2.0
  • Gemma 4 — Apache 2.0
  • Opus-MT — CC-BY 4.0

Because models run on your device for personal use, you stay within the scope these licenses cover. The bundled license texts are viewable from Settings → Support → Third-party licenses in the app.

After download, models live entirely on your device. By default, nothing about your gameplay leaves your computer:

  • No internet connection is needed during translation.
  • No screenshot, OCR text, or game-screen content is sent to any external translation backend.
  • Translation continues to work offline.

The one exception is the optional AI Assistant (MCP) feature — if you connect Claude or another MCP client to Playto, your learning data is shared with that assistant by your own action. Translation itself remains local.

VRAM guide

| Available VRAM | Suggested models |
|---|---|
| ~2 GB | NMT (Opus-MT) or Hy-MT2 1.8B. Minimal footprint. |
| ~4 GB | Qwen3-VL 4B or Gemma 4 E2B. Full LLM translation, works alongside most games. |
| 6–8 GB | Qwen3-VL 8B or Gemma 4 E4B for demanding scenes; Hy-MT2 7B for multilingual MT. |
| 12 GB+ | Any model. Game and translation run comfortably side by side. |

The figures above are the VRAM available on top of what your game needs. Playto auto-fits GPU Layers based on available VRAM, but you can override this in Settings if needed.
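
The guide can be read as a tier lookup on the VRAM left over after the game has taken its share. The sketch below rests on stated assumptions: the approximate tiers are treated as hard cutoffs, values between 8 and 12 GB fall into the 6–8 GB tier, and `fit_gpu_layers` with its per-layer cost and reserve is a generic illustration of layer fitting, not Playto's actual algorithm.

```python
# Tiers transcribed from the VRAM guide as (minimum free GB, suggestions).
# Assumption: the guide's approximate figures are used as hard thresholds.
VRAM_TIERS = [
    (12.0, ["any model"]),
    (6.0, ["Qwen3-VL 8B", "Gemma 4 E4B", "Hy-MT2 7B"]),
    (4.0, ["Qwen3-VL 4B", "Gemma 4 E2B"]),
    (2.0, ["Opus-MT", "Hy-MT2 1.8B"]),
]


def free_vram_gb(total_gb: float, game_gb: float) -> float:
    """VRAM left for translation once the game's usage is subtracted."""
    return max(total_gb - game_gb, 0.0)


def suggest_models(free_gb: float) -> list:
    """Suggestions for the highest tier that fits into free_gb."""
    for minimum_gb, models in VRAM_TIERS:
        if free_gb >= minimum_gb:
            return models
    return []  # the guide makes no suggestion below about 2 GB


def fit_gpu_layers(free_mb: float, total_layers: int,
                   mb_per_layer: float, reserve_mb: float = 512) -> int:
    """Hypothetical layer fitting: offload as many layers as fit after a reserve."""
    usable_mb = max(free_mb - reserve_mb, 0)
    return min(total_layers, int(usable_mb // mb_per_layer))


# An 8 GB card with a game using 3.5 GB leaves 4.5 GB for translation.
print(suggest_models(free_vram_gb(8.0, 3.5)))
```

A manual override in Settings would correspond to bypassing a function like `fit_gpu_layers` and supplying the layer count directly.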