# Dictato

> Context-aware voice dictation for macOS

This document contains the full content of all documentation pages for AI consumption.

---

## Audio

**URL:** https://dictato-docs.vercel.app/docs/audio
**Description:** Microphone, sounds, and media behavior

Audio settings live in **Settings → Audio**.

![Audio settings](/screenshots/audio.png)

## Microphone mode

- **Automatic** — follows the system default input. Changes when you plug in headphones or a USB mic.
- **On-device** — always uses the built-in Mac microphone, even if another input is default.
- **Custom** — always uses a specific mic by UID. If that mic isn't plugged in, Dictato falls back to on-device.

A live level meter shows whatever mic is currently active, so you can verify Dictato is hearing you before you start.

## Pause media while recording

On by default. Dictato pauses system media playback (Music, Spotify, Safari video, etc.) when you start dictating and resumes after.

## Middle-click to record

When enabled, a middle-click starts and stops a hands-free recording. Handy if you use a three-button mouse. See [Dictation](/docs/dictation) for other ways to start a recording.

## Sound profile

Pick the audible tones Dictato plays when recording starts, stops, and locks. Choose **None** to go silent.

---

## Context capture

**URL:** https://dictato-docs.vercel.app/docs/context
**Description:** Give Dictato extra context about what you're doing

Dictato can pull extra context from your screen and clipboard so the transcription understands *what* you're talking about — not just the words. All capture toggles live in **Settings → Transcription**.

![Context capture settings](/screenshots/transcription.png)

## What can be captured

- **Use screen text as context** — runs OCR on what's visible and feeds it to the transcription or post-processing prompt.
- **Capture text selection** — grabs whatever is currently highlighted in another app.
- **Capture copy (⌘C)** — remembers your most recent clipboard copy.
- **Capture paste (⌘V)** — remembers what you last pasted.
- **Capture app switches** — notes when you change active app and window during the session.
- **Circle gesture screenshots** — enables the [circle gesture](/docs/tools#circle-gesture-screenshot) detector, which captures a screenshot of whatever you circle with the mouse.

## Vocabulary

Add brand names, jargon, or oddly-spelled words in **Vocabulary** so the transcription engine knows they exist. Comma- or line-separated — both work.

Example: `Dictato, Claude, WhisperKit, Parakeet, Superset`
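Since both separators are accepted, a vocabulary string can be normalized in a few lines. The sketch below only illustrates the idea; the function name `parse_vocabulary` and the case-insensitive de-duplication are assumptions, not Dictato's actual behavior.

```python
import re

def parse_vocabulary(raw: str) -> list:
    """Split on commas or newlines, trim whitespace, drop empties and repeats."""
    seen = set()
    terms = []
    for piece in re.split(r"[,\n]", raw):
        term = piece.strip()
        key = term.lower()  # assumption: duplicates are compared case-insensitively
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms

# Mixed separators, as the Vocabulary field allows.
print(parse_vocabulary("Dictato, Claude, WhisperKit\nParakeet\nSuperset"))
```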

## Custom dictation prompt

A short free-form prompt that biases the transcription. Use it to describe your speaking style, your domain, or conventions you want Dictato to respect.

---

## Dictation

**URL:** https://dictato-docs.vercel.app/docs/dictation
**Description:** Push-to-talk, hands-free lock, and stopping a recording

Dictato has two recording modes: a quick push-to-talk for short notes, and a locked hands-free mode for longer sessions.

## Push-to-talk

Hold **Fn** and speak. Release to transcribe and paste wherever your cursor is.

## Hands-free lock

For longer dictations, press **Space** while holding **Fn** (**Fn + Space**). The recording locks and you can let go of the keyboard.

While locked, you can use your hands for anything: open another app, grab a screenshot, draw on screen. The recording keeps going.

## Stopping a locked recording

Any of these will stop and paste:

- **⌘ Escape**
- **Middle-click** (if [middle-click recording](/docs/audio) is enabled)
- Click the **stop** button in the overlay

## Middle-click toggle

Enable **Settings → Audio → Middle mouse button to record** to start and stop hands-free recording with a middle-click. Useful if you have a three-button mouse and don't love reaching for Fn.

## Pause media while recording

By default, Dictato pauses whatever's playing (music, video) while you dictate and resumes after. Toggle this in **Settings → Audio**.

## Shortcuts reference

| Action | Shortcut |
|---|---|
| Push-to-talk | Hold **Fn** |
| Lock recording (hands-free) | **Fn + Space** |
| Stop locked recording | **⌘ Escape** |
| Toggle hands-free (optional) | **Middle-click** |
| Region screenshot | **⇧⌘4** |
| Draw on screen | Hold **⌥⌘** |
| Add text quote | **⌥T** |
| Exit active tool | **Escape** |

---

## Folders

**URL:** https://dictato-docs.vercel.app/docs/folders
**Description:** Where Dictato stores your transcriptions and config

Dictato writes everything to regular folders on your Mac — nothing is locked in a proprietary database. Settings are in **Settings → Folders**.

![Folders settings](/screenshots/folders.png)

## Markdown folder

Each finished dictation is saved as a Markdown file here. Attachments (screenshots, drawings, audio) live in an `attachments/` sibling folder. Change the location with **Choose…** or click **Reveal** to open it in Finder.

### What a dictation file looks like

A single dictation with a screenshot and saved audio ends up as something like this:

```markdown
---
date: 2026-04-11T14:26:10Z
source_app: Superset
duration: 18.3
language: en
title: Core product framing for investor deck
audio: attachments/2026-04-11-14-26-10-a1b2c3.m4a
---

# Core product framing for investor deck

Let's try and think about this. So our core product is converting
models, right? You put a model in, you're gonna model out. We transfer
between formats, that's the engine. Everything else — the web tools,
the plugins, the API — is just a shell around that engine.

So for the deck, lead with "universal 3D format conversion", then
show the engine diagram on slide two.

[^A]: ![Screenshot A](attachments/2026-04-11-14-26-10-a1b2c3-A.png)
```
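Because the metadata sits in a plain front-matter block, other tools can read a dictation file without any Dictato-specific code. The following is a minimal sketch, assuming the header only contains flat `key: value` lines like the example above; `parse_front_matter` is a hypothetical helper, and a real YAML parser would be the safer choice for anything more complex.

```python
def parse_front_matter(markdown: str):
    """Return (metadata dict, body text) for a file that starts with a --- block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter: the rest is the body
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return meta, body
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    return {}, markdown  # no closing delimiter found, treat as plain text

sample = """---
date: 2026-04-11T14:26:10Z
source_app: Superset
duration: 18.3
---

# Core product framing for investor deck
"""
meta, body = parse_front_matter(sample)
print(meta["source_app"], float(meta["duration"]))
```

All values come back as strings, so callers convert fields such as `duration` themselves.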

The filename follows `YYYY-MM-DD HH-MM-SS

---

## History

**URL:** https://dictato-docs.vercel.app/docs/history
**Description:** Browse, search, and re-use past dictations

Every dictation is saved locally. Open the main Dictato window to browse them.

![History window](/screenshots/history.png)

## What you get

- **Title** — auto-generated if [post-processing](/docs/post-processing) title extraction is on.
- **Text** — the final transcription (raw or post-processed).
- **Source app** — which app you were dictating into.
- **Duration** — how long you recorded.
- **Attachments** — screenshots, drawings, or quoted text attached to the session.

## Search

Use the search field at the top to find a dictation by its text. The sidebar also groups entries by the app you were using — click an app to filter.

## Re-using a dictation

Each row has a **Copy** button to put the text back on your clipboard.

## Opening a dictation

Double-click any row to open the full detail view. Screenshots and drawings from the session appear inline, exactly where you captured them in the flow of the transcription. The toolbar has **Copy** and **Share** buttons for the whole thing.

![Opened dictation with inline screenshots](/screenshots/history-detail.png)

## Where is it stored?

Transcriptions live in your **Markdown folder**, one file per dictation. See [Folders](/docs/folders) to change the location or turn audio saving on/off.

---

## Hooks

**URL:** https://dictato-docs.vercel.app/docs/hooks
**Description:** Run shell commands, webhooks, or Claude prompts on Dictato events

Hooks let you react to what Dictato is doing — transform the transcription before it pastes, log events to another tool, block a paste into a specific app, or anything else you can script.

![Hooks settings](/screenshots/hooks.png)

## Configure

Create `hooks.json` in your [config folder](/docs/folders). Each key is an event name mapping to an array of matcher groups:

```json
{
  "TranscriptionComplete": [
    {
      "matcher": "en",
      "hooks": [
        { "type": "command", "command": "hooks/spellcheck.sh", "timeout": 10 }
      ]
    }
  ]
}
```

Relative command paths resolve from the config folder.
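A quick structural check can catch typos in `hooks.json` before Dictato loads it. The sketch below assumes only what this page states: three hook types, each with one characteristic field (`command`, `prompt`, or `url`). The function `validate_hooks` is hypothetical and is not part of Dictato.

```python
import json

# Field each hook type needs, per the "Hook types" section.
REQUIRED_FIELD = {"command": "command", "prompt": "prompt", "webhook": "url"}

def validate_hooks(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for event, groups in config.items():
        if not isinstance(groups, list):
            problems.append(f"{event}: expected an array of matcher groups")
            continue
        for gi, group in enumerate(groups):
            for hi, hook in enumerate(group.get("hooks", [])):
                where = f"{event}[{gi}].hooks[{hi}]"
                kind = hook.get("type")
                if kind not in REQUIRED_FIELD:
                    problems.append(f"{where}: unknown type {kind!r}")
                elif REQUIRED_FIELD[kind] not in hook:
                    problems.append(f"{where}: missing {REQUIRED_FIELD[kind]!r}")
    return problems

example = json.loads("""
{
  "TranscriptionComplete": [
    {"matcher": "en",
     "hooks": [{"type": "command", "command": "hooks/spellcheck.sh", "timeout": 10}]}
  ]
}
""")
print(validate_hooks(example))
```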

## Hook types

### `command`

Runs a shell command. Receives the event JSON on stdin. Optionally writes replacement JSON to stdout.

```json
{ "type": "command", "command": "hooks/my-script.sh", "timeout": 10 }
```
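A command hook does not have to be a shell script; any executable that reads stdin and writes stdout should work. Here is a minimal sketch in Python that collapses repeated whitespace in the transcription. The `run` function and the sample event are illustrative assumptions; the streams are passed in as arguments so the logic can be tried without Dictato.

```python
import io
import json

def run(stdin, stdout) -> int:
    """Read the event JSON, optionally write replacement JSON, return the exit code."""
    payload = json.load(stdin)
    text = payload.get("text")
    if isinstance(text, str):
        cleaned = " ".join(text.split())  # collapse runs of whitespace
        if cleaned != text:
            json.dump({"text": cleaned}, stdout)  # replacement text
    # Writing nothing leaves the transcription unchanged.
    return 0

# Simulate the JSON a TranscriptionComplete event might deliver (sample values).
event = {"event": "TranscriptionComplete", "text": "hello   world ", "language": "en", "duration": 1.2}
out = io.StringIO()
code = run(io.StringIO(json.dumps(event)), out)
print(code, out.getvalue())
```

To use it as a real hook, save it as an executable script, add `import sys`, and end the file with `sys.exit(run(sys.stdin, sys.stdout))`.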

### `prompt`

Sends the event payload plus your prompt to the Claude API (reuses the key from [Post-processing](/docs/post-processing)). The model's response becomes the new text.

```json
{ "type": "prompt", "prompt": "Fix spelling and grammar. Return only the corrected text.", "timeout": 30 }
```

### `webhook`

HTTP POST with the JSON payload as the body.

- **200** — success; if the response body is `{"text": "..."}` it replaces the current text
- **204** — success, no changes
- **422** — block the operation (for blockable events)
- **other** — logged as error

```json
{ "type": "webhook", "url": "https://example.com/hooks/dictato", "timeout": 15 }
```
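On the receiving side, a webhook endpoint only needs to map each payload to one of the status codes listed above. The following sketch uses Python's standard `http.server`; the blocked-app policy, the port, and the names `decide`, `DictatoHook`, and `serve` are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

BLOCKED_APPS = {"Terminal"}  # hypothetical policy: never paste into these apps

def decide(payload: dict):
    """Map an event payload to (status code, optional JSON body)."""
    if payload.get("event") == "BeforePaste" and payload.get("targetApp") in BLOCKED_APPS:
        return 422, None  # block the paste
    text = payload.get("text")
    if isinstance(text, str) and text != text.strip():
        return 200, {"text": text.strip()}  # replace the current text
    return 204, None  # success, no changes

class DictatoHook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        status, body = decide(payload)
        data = json.dumps(body).encode() if body is not None else b""
        self.send_response(status)
        if status != 204:  # a 204 response carries no body
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port: int = 8787) -> None:
    """Start the server; call this yourself, it blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), DictatoHook).serve_forever()

print(decide({"event": "TranscriptionComplete", "text": " hi "}))
```

Keeping the decision in a plain function makes the policy easy to test without opening a socket.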

## Events

| Event | Matcher field | Can block? | Payload |
|---|---|---|---|
| `SessionStart` | — | — | — |
| `RecordingStart` | `sourceApp` | yes | `sourceApp`, `sourceAppBundleId` |
| `RecordingLocked` | — | — | `duration` |
| `RecordingStop` | — | — | `duration`, `sampleCount` |
| `ScreenshotCaptured` | `sourceApp` | — | `label`, `timestamp`, `sourceApp` |
| `QuoteCaptured` | `sourceApp` | — | `label`, `text`, `sourceApp`, `sourceWindow`, `timestamp` |
| `TranscriptionStart` | `model` | — | `engine`, `model`, `languages` |
| `TranscriptionComplete` | `language` | yes | `text`, `language`, `duration` |
| `BeforePaste` | `targetApp` | yes | `text`, `targetApp`, `targetAppBundleId`, `hasAttachments` |
| `AfterPaste` | `targetApp` | — | `text`, `targetApp` |

## Protocol

- **Input (`command`)** — JSON on stdin with `event`, payload keys, and `configFolder`
- **Input (`webhook`)** — HTTP POST with the same JSON as body
- **Input (`prompt`)** — payload is sent as context alongside your prompt to Claude
- **Output** — JSON with `{"text": "..."}` replaces the current text. No output = no change.
- **Exit codes (`command`)** — `0` success, `2` block (blockable events only), anything else = error
- Hooks with matching events run in parallel.
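The exit-code rule is what lets a command hook veto an operation. As a sketch, a `BeforePaste` hook could block pastes into particular apps by returning `2`. The bundle identifier below is a made-up placeholder and `run` is a hypothetical helper; only the event name, the `targetAppBundleId` key, and the exit codes come from this page.

```python
import io
import json

EXIT_OK = 0
EXIT_BLOCK = 2  # only honored for blockable events

BLOCKED_BUNDLE_IDS = {"com.example.banking"}  # placeholder, not a real app

def run(stdin) -> int:
    """Return the exit code the hook script should finish with."""
    payload = json.load(stdin)
    is_paste = payload.get("event") == "BeforePaste"
    if is_paste and payload.get("targetAppBundleId") in BLOCKED_BUNDLE_IDS:
        return EXIT_BLOCK
    return EXIT_OK

# Sample payload shaped after the BeforePaste row of the Events table.
blocked = {"event": "BeforePaste", "text": "hi", "targetApp": "Bank",
           "targetAppBundleId": "com.example.banking", "hasAttachments": False}
print(run(io.StringIO(json.dumps(blocked))))
```

In a real script the last line would be `sys.exit(run(sys.stdin))`, with `sys` imported.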

## Browse examples

The **Examples** picker in the Hooks settings pane shows ready-to-copy scripts living in your config folder. Use them as a starting point and edit in place.

---

## Welcome to Dictato

**URL:** https://dictato-docs.vercel.app/docs
**Description:** Context-aware voice dictation for macOS

**Dictato** is a local, privacy-friendly dictation app for macOS. Hold **Fn** to dictate, release to paste — but Dictato also lets you attach screenshots, drawings, and quoted text to the same dictation so you can talk *about* what's on screen.

## Why Dictato?

- **Local first.** Transcription runs on-device with Parakeet or WhisperKit. Nothing leaves your Mac unless you enable cloud post-processing.
- **More than speech.** Combine voice with screenshots, drawings, and quoted text in a single dictation.
- **Context-aware.** Optionally attach on-screen text, the active app, and your recent selection to every transcription.
- **Scriptable.** Hooks let you run shell commands, webhooks, or Claude prompts on events like `BeforePaste` or `TranscriptionComplete`.

---

## Languages

**URL:** https://dictato-docs.vercel.app/docs/languages
**Description:** Pick your dictation languages

Dictato auto-detects language on every dictation, but you can narrow it down in **Settings → Languages** for faster, more accurate results.

![Languages settings](/screenshots/languages.png)

## How it works

- **No languages selected** — Dictato auto-detects from every language the model supports.
- **One language** — all dictation is forced to that language. Best speed and accuracy if you only speak one.
- **Two or more** — Dictato auto-detects between the languages you've picked. If it can't tell, it falls back to the **Primary** (the first one in the list).

Add a language from the **Add Language** picker. Remove with the red minus button.
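The three cases above amount to a small decision rule. The sketch below restates it in code under stated assumptions: `resolve_language` is a hypothetical function, `detected` stands for whatever the model's language detection returned (or `None` if it could not tell), and `supported` is the model's language list.

```python
from typing import Optional

def resolve_language(selected: list, detected: Optional[str], supported: list) -> Optional[str]:
    """Pick the language to transcribe in, following the rules described above."""
    if len(selected) == 1:
        return selected[0]  # one language: always forced
    candidates = selected if selected else supported
    if detected in candidates:
        return detected  # auto-detection succeeded within the allowed set
    # Two or more selected: fall back to the Primary (first in the list).
    # Nothing selected and nothing detected: no answer in this sketch.
    return selected[0] if selected else None

print(resolve_language(["en", "de"], "fr", ["en", "de", "fr"]))
```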

## Model support

Language availability depends on your [transcription model](/docs/models). Parakeet V3 covers 25 languages, Parakeet V2 is English-only, and WhisperKit Base and Medium cover 100+ languages.

---

## Transcription models

**URL:** https://dictato-docs.vercel.app/docs/models
**Description:** Pick the on-device speech model that fits your Mac

Dictato supports two model families, both running entirely on-device. Pick one in **Settings → Model**; models you haven't selected yet download on demand.

![Model settings](/screenshots/models.png)

## Parakeet (recommended)

NVIDIA's Parakeet family, running on the Apple Neural Engine. Fastest on modern Apple Silicon.

| Model | Size | Best for |
|---|---|---|
| **Parakeet V3** | ~600 MB | Multilingual (25 languages), ~210× real-time on M4 |
| **Parakeet V2** | ~400 MB | English only, highest recall |

## WhisperKit

OpenAI's Whisper models compiled for CoreML. Slower than Parakeet but more widely supported.

| Model | Size | Best for |
|---|---|---|
| **Base** | ~150 MB | Lightweight, good for most languages |
| **Distil Large V3** | ~600 MB | 6× faster than Large V3, within 1% of its accuracy |
| **Medium** | ~1.5 GB | Higher accuracy across languages |
| **Medium (English)** | ~1.5 GB | Higher accuracy for English only |

## Which should I pick?

- **English only, fast machine:** Parakeet V2
- **Multilingual:** Parakeet V3
- **Older Mac or wide language support:** WhisperKit Base or Medium
- **Best accuracy, don't mind size:** WhisperKit Medium or Distil Large V3

---

## Post-processing

**URL:** https://dictato-docs.vercel.app/docs/post-processing
**Description:** Clean up transcriptions with on-device or Claude AI

Post-processing runs the raw transcription through an LLM to fix punctuation, correct obvious typos, and apply your own rules. Enable it in **Settings → Post-processing**.

![Post-processing settings](/screenshots/post-processing.png)

## Method

- **On Device** — runs locally on macOS 26 or later. No API key, no cloud.
- **Anthropic** — sends the transcription to the Claude API. Paste your API key in the settings pane and choose a model (Haiku, Sonnet, Opus).

## Generate title

When enabled, Dictato asks the model to write a short title for each transcription. Titles show up in [History](/docs/history) so you can find a dictation later without re-reading it.

## Prompt template

The prompt you send to the model is fully editable. It supports a few placeholders that Dictato fills in at runtime:

| Placeholder | Filled with |
|---|---|
| `{{transcription}}` | The raw transcription text |
| `{{vocabulary}}` | Your custom vocabulary from Settings → Transcription |
| `{{language}}` | The detected language |
| `{{app}}` | The app you were focused on when dictating |

![Prompt template editor](/screenshots/post-processing-prompt.png)

Click **Reset to Default** to go back to the built-in prompt.
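To preview how a template expands, the placeholder substitution can be approximated outside the app. This is only a sketch: `fill_template` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption about the behavior, not something this page confirms.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def fill_template(template: str, values: dict) -> str:
    """Replace each {{name}} in one pass; unknown names are left as-is."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)

template = "Fix punctuation in this {{language}} text from {{app}}: {{transcription}}"
print(fill_template(template, {
    "transcription": "lets ship it",
    "language": "en",
    "app": "Superset",
}))
```

A single regex pass avoids re-expanding placeholder-like text that happens to appear inside the transcription itself.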

---

## Quick start

**URL:** https://dictato-docs.vercel.app/docs/quickstart
**Description:** Install Dictato and record your first dictation

## Next steps

---

## Tools during recording

**URL:** https://dictato-docs.vercel.app/docs/tools
**Description:** Screenshots, drawings, quoted text, and drag-and-drop attachments

While you're dictating (especially in hands-free lock), you can attach things to the same recording. Everything you attach becomes part of the transcription's context and history.

## Region screenshot

**⇧⌘4** — grab a region of the screen. The screenshot animates into the overlay and attaches to the current recording.

## Draw on screen

Hold **⌥⌘** to turn the screen into a canvas. Sketch arrows or annotations anywhere, release to capture. The drawing is saved as an image attachment.

## Circle gesture screenshot

Roughly circle something on screen three times with your mouse and Dictato grabs the region automatically. No shortcut, no modifier keys — just spin the cursor.

- First two loops arm the gesture
- Third loop captures the circled area as a screenshot
- Move away from the circled area (or press **Escape**) to cancel

Toggle in **Settings → Transcription → Circle gesture screenshots**.

## Text quotes

Select text anywhere on your screen, then press **⌥T** to attach it as a quoted snippet. Handy when you want to say "rewrite *this* paragraph" without re-typing or switching windows.

## Drag and drop

Drag anything onto the Dictato overlay to attach it:

- **Files** — added as file attachments
- **Images** — added as image attachments
- **URLs** — added as links
- **Text** — added as a text snippet
- **Colors** — attached as a color swatch

## Exiting a tool

Press **Escape** to exit the active tool without stopping the recording.

## Shortcut reference

| Tool | Shortcut |
|---|---|
| Region screenshot | **⇧⌘4** |
| Draw on screen | Hold **⌥⌘** |
| Text quote | **⌥T** |
| Circle gesture | Draw 3 circles |
| Exit tool | **Escape** |

![Shortcuts settings in Dictato](/screenshots/shortcuts.png)

---

## Links

- [Feedback](https://tally.so/r/PdX8o5)