
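Configuration	VRAM / RAM required	Recommended model version
High-end GPU (RTX 4090, etc.)	24GB+	Full version (~32GB)
Mid-range GPU	8-16GB	Q4_K_S quantized (~4.7GB)
Mac (M1/M2/M3)	16GB+ unified memory	Q4_K_S quantized
CPU only	32GB+ RAM	Q2_K quantized (~2.6GB)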

My machine is a Mac mini (M2 Pro, 16GB of memory), so I went with the Q4_K_S quantized version.
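The table above can be read as a simple lookup. The sketch below encodes it as a function; `recommend_variant` and its `kind` labels are hypothetical names made up for illustration, and treating every GPU between 8GB and 24GB as "mid-range" is an assumption, since the table leaves the 16-24GB range unspecified.

```python
def recommend_variant(kind: str, memory_gb: float) -> str:
    """Return the model variant the table suggests (hypothetical helper).

    kind is "gpu", "mac", or "cpu"; memory_gb is VRAM for a GPU,
    unified memory for a Mac, or system RAM for a CPU-only machine.
    """
    if kind == "gpu":
        if memory_gb >= 24:
            return "full"
        if memory_gb >= 8:
            # Assumption: the 16-24GB gap in the table is treated as mid-range.
            return "Q4_K_S"
    elif kind == "mac" and memory_gb >= 16:
        return "Q4_K_S"
    elif kind == "cpu" and memory_gb >= 32:
        return "Q2_K"
    raise ValueError(f"the table has no recommendation for {kind!r} with {memory_gb} GB")


print(recommend_variant("mac", 16))
```

For the Mac mini described here this selects `Q4_K_S`; configurations below the table's minimums raise a `ValueError` rather than guessing.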

Deployment steps

1. Environment setup

pip install torch diffusers transformers accelerate "gguf>=0.10.0"
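Before moving on, it can help to confirm that the packages actually landed in the active environment. A minimal sketch using only the standard library follows; `installed_versions` is a hypothetical helper name, and it only reports versions, it does not enforce the `gguf` minimum.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


required = ["torch", "diffusers", "transformers", "accelerate", "gguf"]
for name, ver in installed_versions(required).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any line reading `NOT INSTALLED` means the `pip install` step should be rerun inside the same interpreter's environment.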

2. Load the model

import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel, GGUFQuantizationConfig
from huggingface_hub import hf_hub_download

device = "mps" if torch.backends.mps.is_available() else "cpu"
dtype = torch.float16 if device == "mps" else torch.float32

# Download the quantized model
gguf_path = hf_hub_download(
    repo_id="jayn7/Z-Image-Turbo-GGUF",
    filename="z_image_turbo-Q4_K_S.gguf"
)

# Set up the quantization config
quant_config = GGUFQuantizationConfig(compute_dtype=dtype)

transformer = ZImageTransformer2DModel.from_single_file(
    gguf_path,
    quantization_config=quant_config,
    torch_dtype=dtype,
)

# Build the pipeline
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    transformer=transformer,
    torch_dtype=dtype,
)
pipe.to(device)
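The loading code picks between `mps` and `cpu` only. If the same script should also run on an NVIDIA machine, the selection could be factored out roughly as below. This is a sketch, not part of the original setup: `pick_device` is a hypothetical helper, and the `cuda` / `bfloat16` branch is an assumption that the post does not cover.

```python
def pick_device(cuda_available: bool, mps_available: bool):
    """Return (device, dtype_name) for the first available backend."""
    if cuda_available:
        return "cuda", "bfloat16"  # assumption: not covered by the original post
    if mps_available:
        return "mps", "float16"    # same choice as the loading code
    return "cpu", "float32"        # same choice as the loading code


print(pick_device(cuda_available=False, mps_available=True))
```

With PyTorch imported, the inputs would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the dtype object from `getattr(torch, dtype_name)`. Keeping the function free of `torch` makes the branching easy to check on any machine.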

3. Generate an image

image = pipe(
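    prompt="一只可爱的柴犬，富士山背景，樱花飘落",  # "A cute Shiba Inu, Mount Fuji in the background, cherry blossoms falling"
    height=512,  # 512 recommended on Mac; 1024 may run out of memory
    width=512,
    num_inference_steps=4,  # fewer steps is faster; 4 steps takes about 60 seconds
    guidance_scale=0.0,     # must be 0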
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]

image.save("output.png")
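The "about 60 seconds" figure will vary by machine, so it is worth measuring rather than assuming. One way to do that is a small timing wrapper; `timed` is a hypothetical helper, and the `sum` call is only a stand-in workload so the sketch runs without the model loaded.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; replace with the pipeline call to time real generation.
result, seconds = timed(sum, range(1_000_000))
print(f"took {seconds:.3f}s")
```

To time generation itself, the same wrapper would be used as `out, seconds = timed(pipe, prompt=..., height=512, width=512, num_inference_steps=4, guidance_scale=0.0)`, with the image then available at `out.images[0]`.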

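Key parameter tuning

Parameter	Description	Recommended value
`num_inference_steps`	Number of inference steps	4-8 steps (4 for speed, 8 for quality)
`height/width`	Image size	512x512 (Mac), 1024x1024 (GPU)
`guidance_scale`	Guidance scale	Must be 0 (a property of the Turbo model)
`seed`	Random seed (passed via `generator`)	Fix the seed to reproduce results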
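The recommendations in the table can be bundled into a preset so the speed/quality and Mac/GPU choices are made in one place. This is only a sketch: `turbo_kwargs` and its two flags are hypothetical names, and the values are the ones from the table.

```python
def turbo_kwargs(prefer_quality: bool = False, on_mac: bool = True) -> dict:
    """Build pipeline keyword arguments from the table's recommended values."""
    side = 512 if on_mac else 1024
    return {
        "num_inference_steps": 8 if prefer_quality else 4,
        "height": side,
        "width": side,
        "guidance_scale": 0.0,  # the table says this must stay at 0 for the Turbo model
    }


print(turbo_kwargs())
```

The result can be unpacked into the call, for example `pipe(prompt=..., generator=..., **turbo_kwargs(prefer_quality=True))`. The seed is left out on purpose, since it is supplied through the `generator` argument rather than as a plain number.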

Performance results

On my Mac mini:

[]