Copilot AI commented Jan 21, 2026

Summary

This PR adds per-model CPU execution control for standalone encoder models (CLIPEmbed, T5Encoder, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision) through the Model Manager UI.

Backend Changes:

  • Added a cpu_only: bool | None field to all encoder config classes (CLIPEmbed G/L variants, T5Encoder standard and BnB-quantized, Qwen3Encoder all formats, CLIPVision, SigLIP, LlavaOnevision)
  • Added the cpu_only field to ModelRecordChanges to enable API updates
  • The field follows the same pattern as Main models: optional, defaulting to None (see the sketch after this list)
  • Regenerated OpenAPI schema with updated types
  • Fixed device mismatch error in HFEncoder by dynamically moving input tensors to the same device as the model
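
A minimal sketch of the config-field pattern, assuming Pydantic-style config classes (the class names below are illustrative stand-ins, not the real InvokeAI classes):

```python
from pydantic import BaseModel

# Illustrative stand-in for an encoder config class; the real configs
# (T5Encoder, CLIPEmbed, etc.) live in the model manager taxonomy and
# carry many more fields.
class T5EncoderConfigSketch(BaseModel):
    key: str
    name: str
    # New field: None means "no preference" (normal device placement);
    # True pins the whole standalone encoder to the CPU.
    cpu_only: bool | None = None

# ModelRecordChanges gains the same optional field so the API can
# update it; again, a sketch of the pattern, not the real class.
class ModelRecordChangesSketch(BaseModel):
    name: str | None = None
    cpu_only: bool | None = None
```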

Frontend Changes:

  • New EncoderModelSettings component with CPU-only toggle for standalone encoder models only
  • Created useEncoderModelSettings hook for managing encoder settings
  • Integrated settings into ModelView for all encoder types
  • Added type-safe isEncoderModel type guard
  • Regenerated TypeScript types from OpenAPI schema
  • Fixed various TypeScript and ESLint linting issues
  • CPU-only UI is NOT shown for Main models to avoid user confusion, as the setting only affects text encoder submodels at the backend level
  • Deleted unused DefaultCpuOnly component file

Behavior Notes:

  • For Main models (SD-1.5, SDXL, Flux, Z-Image Turbo): The backend supports cpu_only in default_settings, which applies only to text encoder submodels, not the entire model pipeline. This keeps the denoiser/UNet/transformer on GPU for performance while moving text encoders to CPU to save VRAM. However, this setting is not exposed in the UI to avoid confusion about what it controls.
  • For standalone encoder models: The CPU-only setting applies to the entire model and is fully configurable through the UI via a toggle in the model details pane (see the device-selection sketch below).
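
For illustration, the load-time device choice could look like the sketch below; the helper name and call sites are assumptions, not InvokeAI's actual loader API:

```python
import torch

def choose_execution_device(cpu_only: bool | None) -> torch.device:
    # Hypothetical helper: a config that opts into CPU-only execution
    # pins the model to the CPU; otherwise fall back to the usual
    # device selection (CUDA if available, else CPU).
    if cpu_only:
        return torch.device("cpu")
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

# For a standalone encoder this device applies to the whole model;
# for a Main model only the text encoder submodels would receive it.
```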

Users can now configure standalone encoder models to run on CPU instead of GPU through the Model Manager UI, helping to save VRAM when using large encoder models.

Related Issues / Discussions

QA Instructions

  1. For standalone encoder models:

    • Open Model Manager and select any standalone encoder model (e.g., T5Encoder, CLIPEmbed, Qwen3Encoder, CLIPVision, SigLIP, or LlavaOnevision)
    • Navigate to the Settings section in the model details pane
    • Verify the "Run On CPU" toggle is visible
    • Toggle the setting and save
    • Verify the setting persists across app restarts
    • Test generation with a CPU-only encoder to verify it works without device mismatch errors
  2. For Main models (SD-1.5, SDXL, Flux, Z-Image Turbo):

    • Open Model Manager and select any Main model
    • Navigate to the Default Settings tab
    • Verify the "Run On CPU" toggle is NOT visible (this is intentional to avoid user confusion - the backend setting only affects text encoder submodels, not the entire pipeline)

Merge Plan

None required. Changes are additive and follow existing patterns.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)


Copilot AI changed the title from "[WIP] Add per-model CPU execution feature" to "feat(models): Add per-model CPU execution for encoder models" on January 21, 2026
Copilot AI requested a review from lstein January 21, 2026 16:53
@lstein lstein marked this pull request as ready for review January 21, 2026 17:39
lstein (Owner) commented Jan 21, 2026

@copilot I'm getting frontend lint errors:

Run pnpm lint:tsc

> @invoke-ai/invoke-ai-ui@0.0.1 lint:tsc /home/runner/work/InvokeAI/InvokeAI/invokeai/frontend/web
> tsc --noEmit

Error: src/features/modelManagerV2/hooks/useEncoderModelSettings.ts(27,33): error TS2339: Property 'cpu_only' does not exist on type 'EncoderModelConfig'.
  Property 'cpu_only' does not exist on type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; cover_image: string | null; base: "any"; type: "t5_encoder"; format: "t5_encoder"; }'.
Error: src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx(99,9): error TS2559: Type '{ cpu_only: boolean | null; }' has no properties in common with type '{ source?: string | null | undefined; source_type?: "path" | "url" | "hf_repo_id" | null | undefined; source_api_response?: string | null | undefined; name?: string | null | undefined; ... 13 more ...; config_path?: string | ... 1 more ... | undefined; }'.
Error: src/features/modelManagerV2/subpanels/ModelPanel/ModelView.tsx(121,69): error TS2322: Type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 8 more ...; base: "sd-1"; } | ... 70 more ... | { ...; }' is not assignable to type 'EncoderModelConfig'.
  Type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 8 more ...; base: "sd-1"; }' is not assignable to type 'EncoderModelConfig'.
    Type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 8 more ...; base: "sd-1"; }' is not assignable to type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 5 more ...; variant: "gigantic"; } | { ...; } | { ...; } | { ...; } | { ...; }'.
      Type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 8 more ...; base: "sd-1"; }' is not assignable to type '{ key: string; hash: string; path: string; file_size: number; name: string; description: string | null; source: string; source_type: "path" | "url" | "hf_repo_id"; source_api_response: string | null; ... 5 more ...; variant: "large"; }'.
        Types of property 'base' are incompatible.
          Type '"sd-1"' is not assignable to type '"any"'.
Error: src/services/api/types.ts(268,72): error TS2552: Cannot find name 'SigLipModelConfig'. Did you mean 'SigLIPModelConfig'?
 ELIFECYCLE  Command failed with exit code 2.
Error: Process completed with exit code 2.

Copilot AI (Author) commented Jan 21, 2026

@copilot I'm getting frontend lint errors:

Run pnpm lint:tsc

...

Fixed all TypeScript linting errors in commit 1a22a7c:

  1. Fixed typo: renamed SigLipModelConfig to SigLIPModelConfig in types.ts
  2. Added cpu_only to ModelRecordChanges: Backend now supports updating cpu_only via API
  3. Regenerated OpenAPI schema and frontend types: Types now include cpu_only field
  4. Fixed type guard: Changed isEncoderModel to a proper TypeScript type guard that narrows the type to EncoderModelConfig

All four errors should now be resolved.

lstein (Owner) commented Jan 21, 2026

@copilot This error is occurring during frontend linting:

pnpm lint:eslint --fix

> @invoke-ai/invoke-ai-ui@0.0.1 lint:eslint /home/lstein/Projects/InvokeAI-lstein/invokeai/frontend/web
> eslint --max-warnings=0 . --fix


/home/lstein/Projects/InvokeAI-lstein/invokeai/frontend/web/src/features/modelManagerV2/hooks/useEncoderModelSettings.ts
  34:6  error  React Hook useMemo has an unnecessary dependency: 'selectedModelKey'. Either exclude it or remove the dependency array  react-hooks/exhaustive-deps

✖ 1 problem (1 error, 0 warnings)

 ELIFECYCLE  Command failed with exit code 1.

Copilot AI (Author) commented Jan 21, 2026

@copilot This error is occurring during frontend linting:

pnpm lint:eslint --fix

...

Fixed ESLint error in commit 82cc714. Removed unused selectedModelKey dependency from the useMemo hook and cleaned up unused imports.

lstein (Owner) commented Jan 21, 2026

@copilot Another frontend lint error:

pnpm lint:tsc

> @invoke-ai/invoke-ai-ui@0.0.1 lint:tsc /home/lstein/Projects/InvokeAI-lstein/invokeai/frontend/web
> tsc --noEmit

src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:9:15 - error TS2206: The 'type' modifier cannot be used on a named import when 'import type' is used on its import statement.

9 import type { type Control, SubmitHandler, useController, useForm } from 'react-hook-form';
                ~~~~

src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:44:21 - error TS1361: 'useController' cannot be used as a value because it was imported using 'import type'.

44   const { field } = useController(props);
                       ~~~~~~~~~~~~~

  src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:9:44
    9 import type { type Control, SubmitHandler, useController, useForm } from 'react-hook-form';
                                                 ~~~~~~~~~~~~~
    'useController' was imported here.

src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:82:55 - error TS1361: 'useForm' cannot be used as a value because it was imported using 'import type'.

82   const { handleSubmit, control, formState, reset } = useForm<EncoderModelSettingsFormData>({
                                                         ~~~~~~~

  src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:9:59
    9 import type { type Control, SubmitHandler, useController, useForm } from 'react-hook-form';
                                                                ~~~~~~~
    'useForm' was imported here.


Found 3 errors in the same file, starting at: src/features/modelManagerV2/subpanels/ModelPanel/EncoderModelSettings/EncoderModelSettings.tsx:9

 ELIFECYCLE  Command failed with exit code 2.

Copilot AI (Author) commented Jan 21, 2026

@copilot Another frontend lint error:

pnpm lint:tsc

...

Fixed TypeScript import errors in commit 7e3966a. Separated type imports (Control, SubmitHandler) from value imports (useController, useForm) since hooks need to be imported as values, not types.

lstein (Owner) commented Jan 21, 2026

@copilot A few problems still:

  1. Changing the cpu-only setting of SD-1.5 and SDXL models seems to have no effect.
  2. When I set the T5 standalone text encoder to cpu-only and attempt a render, I am getting RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/invocations/flux_text_encoder.py", line 59, in invoke
    t5_embeddings = self._t5_encode(context)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/invocations/flux_text_encoder.py", line 124, in _t5_encode
    prompt_embeds = t5_encoder(prompt)
                    ^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/backend/flux/modules/conditioner.py", line 36, in forward
    outputs = self.hf_module(
              ^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/transformers/models/t5/modeling_t5.py", line 1932, in forward
    encoder_outputs = self.encoder(
                      ^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/transformers/models/t5/modeling_t5.py", line 1003, in forward
    inputs_embeds = self.embed_tokens(input_ids)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/backend/model_manager/load/model_cache/torch_module_autocast/custom_modules/custom_embedding.py", line 29, in forward
    return super().forward(input)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/modules/sparse.py", line 190, in forward
    return F.embedding(
           ^^^^^^^^^^^^
  File "/home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/torch/nn/functional.py", line 2551, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)

I care about (2) more than I do about (1). The latter part of the stack trace is shown above.

Copilot AI (Author) commented Jan 21, 2026

@copilot A few problems still:

  1. Changing the cpu-only setting of SD-1.5 and SDXL models seems to have no effect.
  2. When I set the T5 standalone text encoder to cpu-only and attempt a render, I am ...

Fixed issue #2 in commit 03e5906 by moving encoder inputs to the same device as the model, preventing the device mismatch error.
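
The fix follows the pattern sketched below, assuming a forward path like the one in invokeai/backend/flux/modules/conditioner.py from the trace above (the function signature and variable names here are illustrative):

```python
import torch
from torch import nn

def encode(hf_module: nn.Module, input_ids: torch.Tensor):
    # Find where the encoder's weights actually live; with a cpu_only
    # encoder this is the CPU even when the rest of generation runs on CUDA.
    model_device = next(hf_module.parameters()).device
    # Move the token IDs to that device before the forward pass, which
    # avoids the "cpu and cuda:0" mismatch raised inside embed_tokens.
    return hf_module(input_ids.to(model_device))
```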

Regarding issue #1: The current behavior for SD-1.5 and SDXL models is intentional. The CPU-only setting for Main models applies only to their text encoder submodels, not the entire model. This design keeps the denoiser/UNet on GPU for performance while moving text encoders to CPU to save VRAM. The conditioning tensors are then moved to GPU before being passed to the denoiser. This is the same pattern used for the existing Main model CPU-only feature.

If you want the entire SD-1.5/SDXL pipeline on CPU, that would require a different setting or approach.

lstein (Owner) commented Jan 21, 2026

@copilot Thank you for the SD-1.5/SDXL explanation. In that case, could you remove the user interface elements for these models? I do not want to show the user that they can change the CPU-only settings if they cannot.

Copilot AI (Author) commented Jan 21, 2026

@copilot Thank you for the SD-1.5/SDXL explanation. In that case, could you remove the user interface elements for these models? I do not want to show the user that they can change the CPU-only settin...

Done in commit 7bb9fa6. The CPU-only toggle has been removed from Main model (SD-1.5, SDXL) settings UI. The toggle now only appears for standalone encoder models (T5Encoder, CLIPEmbed, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision) where it controls the entire model execution device.

lstein (Owner) commented Jan 21, 2026

@copilot Shouldn't the toggle also appear for the diffusers versions of Flux and Z-Image Turbo models?

Copilot AI (Author) commented Jan 21, 2026

@copilot Shouldn't the toggle also appear for the diffusers versions of Flux and Z-Image Turbo models?

No, the current behavior is correct and consistent. Flux and Z-Image Turbo are Main models (like SD-1.5/SDXL), so the cpu_only setting in default_settings only affects their text encoder submodels, not the entire model pipeline.

Since we removed the toggle from SD-1.5/SDXL to avoid confusion about what it controls, the same logic applies to Flux and Z-Image Turbo - the toggle doesn't appear for any Main models.

The CPU-only toggle only appears for standalone encoder models (T5Encoder, CLIPEmbed, Qwen3Encoder, CLIPVision, SigLIP, LlavaOnevision) where it controls the entire model's execution device, which is the clear and useful behavior users expect.

lstein (Owner) commented Jan 21, 2026

@copilot Lint errors:

Run pnpm lint:tsc

> @invoke-ai/invoke-ai-ui@0.0.1 lint:tsc /home/runner/work/InvokeAI/InvokeAI/invokeai/frontend/web
> tsc --noEmit

Error: src/features/modelManagerV2/subpanels/ModelPanel/MainModelDefaultSettings/DefaultCpuOnly.tsx(11,56): error TS2339: Property 'cpuOnly' does not exist on type 'MainModelDefaultSettingsFormData'.
 ELIFECYCLE  Command failed with exit code 2.
Error: Process completed with exit code 2.

Copilot AI (Author) commented Jan 21, 2026

@copilot Lint errors:

Run pnpm lint:tsc

...

Fixed in commit 6ba4605. Deleted the unused DefaultCpuOnly.tsx file that was left behind when we removed CPU-only UI from Main model settings.

lstein (Owner) left a review

Works satisfactorily.

Commit: Add frontend UI for CPU-only model execution toggle
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
@lstein lstein force-pushed the copilot/sub-pr-46 branch from 5af4dad to 88228da on January 21, 2026 20:55
@lstein lstein changed the base branch from lstein/per-model-cpu-execution to main January 21, 2026 21:27
@lstein lstein changed the base branch from main to lstein/feature/text-encoder-cpu-toggle January 21, 2026 21:33
@lstein lstein merged commit 4ee2216 into lstein/feature/text-encoder-cpu-toggle Jan 21, 2026
1 check passed
@lstein lstein deleted the copilot/sub-pr-46 branch January 21, 2026 21:33