Models

Playto reads on-screen game text with OCR (or image recognition for stylized fonts), then translates it with the model you choose. You pick the model during setup and can switch or download additional models from Settings at any time.

Reading: OCR

Engine	Default	Availability	Notes
WinRT OCR	✓	Windows 10 / 11, built into the OS	No extra download. Handles standard game text reliably.
OneOCR (Snipping Tool)	—	Windows 11 — uses the OCR engine bundled with the Snipping Tool app	Alternative engine for cases where WinRT OCR struggles. Switch in Settings → Engine.

For scenes where OCR struggles — stylized fonts, decorative text, handwritten styles — the LLM families below also support an image-recognition mode where the model reads the game screen directly instead of going through OCR.

Translation: LLM or NMT

For translation, Playto offers two distinct approaches you can switch between:

Approach	Strengths	Trade-offs
LLM	Rich, context-aware translation. Supports image recognition for stylized fonts. Works across language pairs.	Larger download (~3 GB). Heavier on VRAM.
NMT	Lightweight (hundreds of MB), faster on low-end hardware, minimal VRAM footprint.	Only specific language pairs are available. Text-only — no image recognition.

Neither is strictly better; the right choice depends on your hardware and the language pair you're translating between.

How you choose

On first launch, the Setup wizard presents two paths:

Recommended — Playto picks the best fit for your selected language pair (NMT if it's supported, otherwise the matching LLM family).
Custom — choose from cards showing each model with size, VRAM requirement, and what it's good for.
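
As a rough illustration, the Recommended path behaves like a small lookup: use NMT when the pair is covered, otherwise fall back to an LLM family that matches the source language. The sketch below is not Playto's code. The function name, the language codes, the `LATIN_SOURCES` set, and the Hy-MT2 fallback for other sources are assumptions made for this example; only the Opus-MT pairs and the family descriptions come from this page.

```python
# Illustrative sketch only; not Playto's implementation.
# Opus-MT pairs as listed on this page: JA<->EN and EN->ES/FR/IT/PT/DE/PL.
NMT_PAIRS = {("ja", "en"), ("en", "ja")} | {
    ("en", target) for target in ("es", "fr", "it", "pt", "de", "pl")
}

CJK_SOURCES = {"ja", "zh", "ko"}
# Assumption: a sample of Latin-script source languages for the example.
LATIN_SOURCES = {"en", "es", "fr", "it", "pt", "de", "pl"}


def recommend(source: str, target: str) -> str:
    """Return the model family a Recommended-style picker might choose."""
    if (source, target) in NMT_PAIRS:
        return "Opus-MT"      # NMT when the pair is supported
    if source in CJK_SOURCES:
        return "Qwen3-VL"     # LLM family described as best for CJK sources
    if source in LATIN_SOURCES:
        return "Gemma 4"      # LLM family described as best for Latin sources
    # Assumption: anything else goes to the multilingual text-only family.
    return "Hy-MT2"
```

For example, `recommend("ja", "en")` takes the NMT branch, while `recommend("ja", "fr")` falls through to the CJK-source LLM family because that pair is not in the Opus-MT list.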

After setup, you can switch models or download additional ones at any time from Settings → Engine. Multiple models can coexist on disk; switching between downloaded models is instant.

Model families

Family	Mode	Best for	Sizes
Qwen3-VL	LLM (vision-capable)	CJK source — Japanese, Chinese, Korean	2B / 4B (recommended) / 8B
Gemma 4	LLM (vision-capable)	Latin source — English, European	E2B (recommended) / E4B
Hy-MT2	LLM (text-only)	Multilingual translation, no image support	1.8B / 7B
Opus-MT	NMT	Specific pairs — JA↔EN, EN→ES/FR/IT/PT/DE/PL	~hundreds of MB per pair

Vision-capable families can read the game screen directly when OCR struggles — for example, when a game uses stylized fonts or decorative text.

Where models come from

All curated models are downloaded directly from Hugging Face, a widely used public host for open-source AI models. Specific upstream sources:

Qwen3-VL — official Qwen repository (huggingface.co/Qwen/)
Gemma 4 — Google's original models, GGUF-quantized by the bartowski community (huggingface.co/bartowski/)
Hy-MT2 — Tencent's official repository (huggingface.co/tencent/)
Opus-MT — upstream from Helsinki-NLP, repackaged in CTranslate2 format under huggingface.co/playto-mt/

Exact download URLs are part of Playto's open manifest, visible in the application source code — there are no hidden downloads or proxies.

All curated models ship under permissive open licenses:

Hy-MT2 — Apache 2.0
Qwen3-VL — Apache 2.0
Gemma 4 — Apache 2.0
Opus-MT — CC-BY 4.0

Because models run on your device for personal use, you stay within the scope these licenses cover. The bundled license texts are viewable from Settings → Support → Third-party licenses in the app.

After download, models live entirely on your device. By default, nothing about your gameplay leaves your computer:

No internet connection is needed during translation.
No screenshot, OCR text, or game-screen content is sent to any external translation backend.
Translation continues to work offline.

The one exception is the optional AI Assistant (MCP) feature — if you connect Claude or another MCP client to Playto, your learning data is shared with that assistant by your own action. Translation itself remains local.

VRAM guide

Available VRAM	Suggested models
~2 GB	NMT (Opus-MT) or Hy-MT2 1.8B — minimal footprint.
~4 GB	Qwen3-VL 4B or Gemma 4 E2B — full LLM translation, works alongside most games.
6–8 GB	Qwen3-VL 8B or Gemma 4 E4B for demanding scenes; Hy-MT2 7B for multilingual MT.
12 GB+	Any model. Game and translation run comfortably side by side.

The figures above are the VRAM that should be free on top of what your game uses. Playto auto-fits the GPU Layers setting to the available VRAM, but you can override it in Settings if needed.
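
The guide can also be read as a simple tier lookup, and layer fitting as a capacity calculation. The following is a minimal sketch under stated assumptions, not a description of how Playto actually does it: the tier boundaries treat the guide's approximate figures as minimums, and `fit_gpu_layers`, `gb_per_layer`, and `reserve_gb` are hypothetical names and values chosen for illustration.

```python
import math

# Tiers from the VRAM guide, read as minimum free VRAM in GB.
# Assumption: the guide gives approximate figures, not hard cut-offs.
VRAM_TIERS = [
    (12.0, ["any model"]),
    (6.0, ["Qwen3-VL 8B", "Gemma 4 E4B", "Hy-MT2 7B"]),
    (4.0, ["Qwen3-VL 4B", "Gemma 4 E2B"]),
    (2.0, ["Opus-MT", "Hy-MT2 1.8B"]),
]


def suggest_models(free_vram_gb: float) -> list[str]:
    """Return the guide's suggestions for VRAM left over after the game."""
    for minimum, models in VRAM_TIERS:
        if free_vram_gb >= minimum:
            return models
    return []  # the guide lists nothing below roughly 2 GB


def fit_gpu_layers(free_vram_gb: float, total_layers: int,
                   gb_per_layer: float, reserve_gb: float = 0.5) -> int:
    """Hypothetical heuristic: offload as many layers as fit in free VRAM."""
    usable = max(0.0, free_vram_gb - reserve_gb)
    return min(total_layers, math.floor(usable / gb_per_layer))
```

With 4.5 GB free, `suggest_models(4.5)` returns the ~4 GB tier. The layer heuristic keeps a small reserve and never exceeds the model's layer count, which is one plausible shape for an auto-fit that a manual override would replace.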