AIGC · Architecture · Human-in-the-loop

Semi-agentic Architectural Image Iteration Workflow

Nanobanana Semi-automatic Architectural Workflow

2026

01｜Project Summary

A semi-automatic workflow prototype built around multi-round generation of architectural AI renderings. It connects white-model input, prompt assembly, image generation, structured feedback handling, and round/state management, with the aim of reducing repetitive manual operations across rounds.

It is not a mature production tool but a workflow prototype for the multi-round revision process in architectural AI image generation: it tests which steps can be chained automatically and which judgments still need to stay with a human.

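02｜Why I Built This

In architectural AI image generation, every round requires reassembling the white-model image, reference images, project background, revision notes, and prompt. Across multiple rounds, the previous image, feedback, outputs, and file paths also tend to scatter across chat logs, folders, and the command line.

I wanted to test whether structuring the "input → prompt assembly → image output → feedback → next round" sequence would reduce repetitive work and make each round of revision easier to trace.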

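03｜Workflow Design

White-model main image input
↓
Stage 1｜Lock geometry with the white model
Lock massing / building count / main camera position
↓
Stage 2｜Feedback-driven refinement
Continue generating from the previous round's image + feedback case
↓
Round / State management
Record project_id / round / prompt / feedback / output / history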
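
The Round / State step above can be sketched as a small record structure. This is a minimal illustration, not the prototype's actual code: only the field names project_id, round, prompt, feedback, output, and history come from the diagram, while the class names `RoundRecord` and `ProjectState`, the `stage` field, and the JSON serialization are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RoundRecord:
    """One generation round (hypothetical layout)."""
    round: int
    stage: str      # "stage1" or "stage2"
    prompt: str
    feedback: dict  # e.g. {"keep": [...], "fix": [...], "avoid": [...]}
    output: str     # path to the generated image


@dataclass
class ProjectState:
    """Per-project state: current round number plus the full history."""
    project_id: str
    round: int = 0
    history: list = field(default_factory=list)

    def record_round(self, stage, prompt, feedback, output):
        # Advance the round counter and append an immutable-by-convention record.
        self.round += 1
        record = RoundRecord(self.round, stage, prompt, feedback, output)
        self.history.append(record)
        return record

    def to_json(self):
        # asdict() recurses into the RoundRecord items stored in history.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


state = ProjectState(project_id="demo_project")
state.record_round("stage1", "lock massing and camera", {}, "outputs/round_01.png")
state.record_round("stage2", "refine environment", {"fix": ["add foreground"]}, "outputs/round_02.png")
print(state.to_json())
```

Keeping every round in one serializable object is what makes a revision traceable: the prompt, feedback, and output path for any round can be read back from a single file instead of being reconstructed from chat logs and folders.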

04｜My Contributions

Requirement Decomposition & Scope Definition

I broke down the repeated operations in multi-round architectural AI image generation and clarified which parts could be automated and which should remain human-led, especially visual judgment and design decisions.

Two-stage Generation Strategy

I designed a two-stage generation strategy: Stage 1 uses the white model as the main anchor to lock massing and the main camera position, while Stage 2 uses the previous round's output as the anchor for feedback-driven refinement.
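
The anchor rule of the two stages can be expressed in a few lines. The sketch below is an assumption about how such a rule might look, not the prototype's implementation; the function name `select_anchor` and its parameters are hypothetical, and only the stage names stage1 / stage2 come from this document.

```python
def select_anchor(stage, white_model_path, previous_output_path=None):
    """Return the image path that anchors generation for the given stage.

    stage1 anchors on the white model to lock massing and camera position;
    stage2 anchors on the previous round's output for feedback-driven refinement.
    """
    if stage == "stage1":
        return white_model_path
    if stage == "stage2":
        if previous_output_path is None:
            # Stage 2 is only meaningful once at least one round has produced an image.
            raise ValueError("stage2 requires the previous round's output")
        return previous_output_path
    raise ValueError(f"unknown stage: {stage!r}")


print(select_anchor("stage1", "inputs/white_model.png"))
print(select_anchor("stage2", "inputs/white_model.png", "outputs/round_01.png"))
```

Failing loudly when Stage 2 has no previous output keeps a round from silently falling back to the white model, which would discard the refinement made so far.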

Structured Feedback Design

I designed a structured feedback format based on keep / fix / avoid, which makes each revision request clearer by separating what should be preserved, what should be changed, and what should be avoided.
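
As an illustration of the keep / fix / avoid format, the sketch below stores one feedback entry and renders it as a text block that could be appended to a prompt. The `Feedback` class, the `render_feedback` function, and the section wording are hypothetical; the document only specifies the three categories.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """Structured revision request split into three categories."""
    keep: list = field(default_factory=list)   # what must be preserved
    fix: list = field(default_factory=list)    # what should be changed
    avoid: list = field(default_factory=list)  # what must not appear


def render_feedback(fb):
    """Render non-empty categories as a plain-text block for prompt assembly."""
    sections = [
        ("Keep unchanged", fb.keep),
        ("Fix or improve", fb.fix),
        ("Avoid", fb.avoid),
    ]
    lines = []
    for title, items in sections:
        if items:  # skip empty categories so the prompt stays short
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


fb = Feedback(
    keep=["building massing", "main camera position"],
    fix=["add foreground people and vehicles"],
    avoid=["changing the number of buildings"],
)
print(render_feedback(fb))
```

Separating the categories in the data structure, rather than in free text, is what lets a later round reuse or compare feedback entries mechanically.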

Local Workflow UI MVP

Using Claude Code, I built a lightweight local Streamlit workspace that supports image upload, stage selection, generation runs, next-round continuation, before/after comparison, and historical round review.

Validation & Boundary Reflection

Through the case_04a / case_04b comparison test, I checked how different feedback intensities affect the refinement of environmental completion, and documented the prototype's boundary: it is useful for structured execution and stateful workflow tracking, but it is not intended to replace open-ended visual judgment or creative dialogue with strong image models.

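05｜MVP Demo

Local Workspace MVP

I turned a process that previously had to be run from the command line into a local web workspace. The user can upload a white-model image or reference image, choose the stage1 or stage2 route, run generation, and in the next round either pick a standard feedback case or type revision notes manually.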

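06｜Validation: case_04a vs case_04b

This test set is not meant to prove that a particular prompt always produces a better image. It tests whether, starting from the same image, feedback of different intensities produces an observable difference in how far the image advances.

Test design

To test whether "environmental completion" feedback needs different intensity levels, I split the original case_04 into:

case_04a｜Conservative: lightly supplement the environment while repainting as little of the image as possible
case_04b｜Progressive: more explicitly add foreground, people, vehicles, ground detail, and puddle reflections

Both tests start from the same Round 2 image; the only variable is feedback intensity.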
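
The single-variable setup can be checked mechanically before any images are generated. The sketch below is only an illustration: the configuration layout, the image path, and the helper `differing_keys` are assumptions, and the feedback wording paraphrases the two case descriptions rather than quoting the prototype's actual case files.

```python
# Hypothetical path to the shared Round 2 starting image.
BASE_IMAGE = "outputs/round_02.png"

# Illustrative case configs; only the "feedback" entry is meant to differ.
CASES = {
    "case_04a": {
        "base_image": BASE_IMAGE,
        "stage": "stage2",
        "feedback": {
            "fix": ["lightly supplement the surrounding environment"],
            "avoid": ["repainting the existing image"],
        },
    },
    "case_04b": {
        "base_image": BASE_IMAGE,
        "stage": "stage2",
        "feedback": {
            "fix": ["add foreground, people, vehicles, ground detail, and puddle reflections"],
            "avoid": [],
        },
    },
}


def differing_keys(a, b):
    """Return the sorted keys whose values differ between two case configs."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


diff = differing_keys(CASES["case_04a"], CASES["case_04b"])
print(diff)  # the comparison is controlled only if this is ["feedback"]
```

If the list contains anything besides "feedback", such as a different base image or stage, any difference between the two outputs can no longer be attributed to feedback intensity alone.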

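Test conclusions

case_04b clearly improves environmental completion while keeping the building massing, main camera position, and primary material relationships stable.
case_04a is indeed more conservative, but it does not push the image far enough, and some local environmental information may even regress.
This suggests that feedback intensity is itself an important variable: feedback that is too weak may leave the image almost unchanged, while more explicit feedback is better suited to moving an intermediate image toward final presentation quality.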

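07｜Reflection & Boundary

Reflection

This prototype is not necessarily more efficient than talking directly to a strong image model. For image exploration that is highly aesthetic, context-dependent, and ambiguous, open-ended human-AI dialogue remains more flexible.

Its value is not in replacing human aesthetic judgment. It lies in testing whether, when the revision intent can be expressed in a structured way, the system can carry context forward automatically, manage rounds, record feedback, and provide a traceable mechanism for multi-round execution.

I therefore prefer to define it as an AI workflow prototype rather than a mature production tool: its value lies in workflow abstraction, state tracking, and controlled feedback validation.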