feat(models): Add per-model CPU execution for encoder models #49
Conversation
@copilot I'm getting frontend lint errors:

... Fixed all TypeScript linting errors in commit 1a22a7c:

All four errors should now be resolved.
@copilot This error is occurring during frontend linting:
@copilot Another frontend lint error:
@copilot A few problems still:

I care about (2) more than I do about (1). The latter part of the stack trace is:
Fixed issue #2 in commit 03e5906 by moving encoder inputs to the same device as the model, preventing the device mismatch error.

Regarding issue #1: the current behavior for SD-1.5 and SDXL models is intentional. The CPU-only setting for Main models applies only to their text encoder submodels, not the entire model. This design keeps the denoiser/UNet on GPU for performance while moving text encoders to CPU to save VRAM. The conditioning tensors are then moved to GPU before being passed to the denoiser. This is the same pattern used for the existing Main model CPU-only feature. If you want the entire SD-1.5/SDXL pipeline on CPU, that would require a different setting or approach.
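The device-alignment pattern described above can be sketched as follows. This is a simplified illustration, not the project's actual `HFEncoder` code; the function and argument names are hypothetical:

```python
import torch


def run_encoder(encoder: torch.nn.Module, input_ids: torch.Tensor,
                denoiser_device: torch.device) -> torch.Tensor:
    # Inputs follow the encoder's device (which may be CPU when the model
    # is pinned there), avoiding "expected all tensors to be on the same
    # device" errors at the encoder call site.
    encoder_device = next(encoder.parameters()).device
    hidden = encoder(input_ids.to(encoder_device))
    # Conditioning tensors are then moved to the denoiser's device (GPU)
    # before being passed onward.
    return hidden.to(denoiser_device)
```

The key point is that neither side assumes a global device: inputs chase the encoder, outputs chase the denoiser.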
@copilot Thank you for the SD-1.5/SDXL explanation. In that case, could you remove the user interface elements for these models? I do not want to show the user that they can change the CPU-only settings if they cannot. |
Done in commit 7bb9fa6. The CPU-only toggle has been removed from Main model (SD-1.5, SDXL) settings UI. The toggle now only appears for standalone encoder models (T5Encoder, CLIPEmbed, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision) where it controls the entire model execution device. |
@copilot Shouldn't the toggle also appear for the diffusers versions of Flux and Z-Image Turbo models? |
No, the current behavior is correct and consistent. Flux and Z-Image Turbo are Main models (like SD-1.5/SDXL), so the same reasoning applies: since we removed the toggle from SD-1.5/SDXL to avoid confusion about what it controls, the toggle doesn't appear for any Main models. The CPU-only toggle only appears for standalone encoder models (T5Encoder, CLIPEmbed, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision), where it controls the entire model's execution device, which is the clear and useful behavior users expect.
@copilot Lint errors: |
lstein left a comment:
Works satisfactorily.
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Add frontend UI for CPU-only model execution toggle
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Force-pushed from 5af4dad to 88228da.

Merged 4ee2216 into lstein/feature/text-encoder-cpu-toggle.
Summary
This PR adds per-model CPU execution control for standalone encoder models (CLIPEmbed, T5Encoder, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision) through the Model Manager UI.
Backend Changes:
- Added a `cpu_only: bool | None` field to all encoder config classes (CLIPEmbed G/L variants, T5Encoder standard and BnB quantized, Qwen3Encoder all formats, CLIPVision, SigLIP, LlavaOnevision)
- Added the `cpu_only` field to `ModelRecordChanges` to enable API updates
- Updated `HFEncoder` to dynamically move input tensors to the same device as the model

Frontend Changes:
- New `EncoderModelSettings` component with CPU-only toggle for standalone encoder models only
- New `useEncoderModelSettings` hook for managing encoder settings
- New `isEncoderModel` type guard
- Removed the `DefaultCpuOnly` component file

Behavior Notes:
- For Main models, `cpu_only` in `default_settings` applies only to text encoder submodels, not the entire model pipeline. This keeps the denoiser/UNet/transformer on GPU for performance while moving text encoders to CPU to save VRAM. However, this setting is not exposed in the UI to avoid confusion about what it controls.
- Users can now configure standalone encoder models to run on CPU instead of GPU through the Model Manager UI, helping to save VRAM when using large encoder models.
Related Issues / Discussions
QA Instructions
For standalone encoder models:
For Main models (SD-1.5, SDXL, Flux, Z-Image Turbo):
Merge Plan
None required. Changes are additive and follow existing patterns.
Checklist
What's New copy (if doing a release after this PR)