{"key":"project_arena_multi_agent_refactor_2026_03_29","title":"Arena CLI project status and plan for generic Codex/Gemini/Claude support","content":"Project: Arena CLI\nPrimary path: /home/svc-admin/ai-projects/projects/arena\nPurpose: local CLI that makes AI CLIs talk to each other live in the terminal with clean speaker-only output.\n\nCurrent implementation status as of 2026-03-29:\n- Arena is built and installed.\n- Command is available via ~/.local/bin/arena\n- Project-local venv exists at /home/svc-admin/ai-projects/projects/arena/.venv\n- Current implementation supports exactly 2 hard-coded agents: Codex and Gemini.\n- End-to-end smoke test succeeded with live output.\n\nImportant files:\n- /home/svc-admin/ai-projects/projects/arena/README.md\n- /home/svc-admin/ai-projects/projects/arena/pyproject.toml\n- /home/svc-admin/ai-projects/projects/arena/scripts/install.sh\n- /home/svc-admin/ai-projects/projects/arena/src/arena/__init__.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/clean.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/agents.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/cli.py\n\nCurrent behavior:\n- User runs: arena \"<topic>\"\n- Arena alternates between codex and gemini for N turns.\n- It prints only clean lines like:\n  Codex: ...\n  Gemini: ...\n- Optional JSONL and text transcript output already exist.\n\nCurrent CLI flags:\n- positional topic\n- --turns\n- --first {codex,gemini}\n- --delay\n- --timeout\n- --codex-role\n- --gemini-role\n- --codex-model\n- --gemini-model\n- --codex-arg (repeatable)\n- --gemini-arg (repeatable)\n- --log\n- --transcript\n- --prefix-topic\n\nImplementation notes:\n1. clean.py\n- Contains output cleaning rules for Codex and Gemini.\n- Codex cleanup strips banner, model/provider/session lines, token footer, bubblewrap warning, and prompt echo.\n- Gemini cleanup strips keychain warnings and MCP refresh chatter.\n\n2. agents.py\n- Contains:\n  - ArenaError\n  - AgentReply dataclass\n  - ask_codex(...)\n  - ask_gemini(...)\n- Codex call uses: codex exec --skip-git-repo-check --color never -o <tempfile> ...\n- Gemini call uses: gemini -p <prompt>\n- Both support model override and raw extra args.\n\n3. cli.py\n- Current turn loop is pair-specific and not generic.\n- Prompts are assembled from:\n  - per-agent role instruction\n  - shared topic\n  - conversation so far\n  - direct prompt to reply as a specific speaker\n- The loop rotates between exactly codex and gemini.\n\nWhat user wants next:\n- Future version should support any combination of Codex, Gemini, and Claude.\n- User explicitly said they want support for any combination of Codex, Gemini, Claude, not just bolt-on Claude support.\n\nRecommended design for next session:\nRefactor Arena into a generic agent registry instead of hard-coded codex/gemini flow.\n\nTarget architecture:\n1. Introduce a generic agent definition layer.\n- Add an agent registry, either in agents.py or a new file such as src/arena/registry.py\n- Each agent entry should define:\n  - display name (Codex, Gemini, Claude)\n  - command runner function\n  - output cleaner\n  - optional model flag format\n  - extra arg passthrough handling\n  - default role text\n\n2. Replace fixed 2-agent CLI shape with selected-agent list.\nSuggested new flag:\n- --agents codex,gemini\n- --agents codex,claude\n- --agents gemini,claude\n- --agents codex,gemini,claude\nDefault can stay codex,gemini for backward compatibility.\n\n3. Replace pair-specific flags with per-agent generic options.\nPossible approach:\n- keep backward compatibility for current flags first\n- add new generic options for future scale:\n  - --agents codex,gemini,claude\n  - --role codex=\"...\"\n  - --role gemini=\"...\"\n  - --role claude=\"...\"\n  - --model codex=gpt-5.4\n  - --model gemini=gemini-2.5-pro\n  - --model claude=...\n  - --agent-arg codex=--foo\n  - --agent-arg gemini=--bar\n  - --agent-arg claude=--baz\nAlternative simpler v1 for refactor:\n- add explicit flags:\n  - --claude-role\n  - --claude-model\n  - --claude-arg\n- plus --agents list\nThat is less elegant but easier to implement quickly.\n\n4. Turn loop refactor.\n- Instead of toggling between two speakers, rotate through an ordered list of selected agents.\n- Example for --agents codex,gemini,claude with --turns 6:\n  - turn 1 codex\n  - turn 2 gemini\n  - turn 3 claude\n  - turn 4 codex\n  - turn 5 gemini\n  - turn 6 claude\n- Keep transcript format the same:\n  Speaker: text\n\n5. Prompt assembly.\n- Keep current prompt pattern.\n- For each agent call, include:\n  - that agent's role instruction\n  - shared topic\n  - full conversation so far\n  - explicit instruction like: Reply now as Claude.\n- Continue to keep replies short by default.\n\n6. Claude support prerequisites.\nBefore implementing Claude runner, verify there is an actual working Claude CLI on this machine and determine:\n- executable name\n- non-interactive prompt invocation format\n- model flag format\n- common noise that needs to be stripped\nIf Claude CLI is not installed, do the generic refactor first and leave Claude runner behind a missing-command error.\n\nSuggested code changes next session:\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/agents.py\n  Add generic registry and eventually ask_claude(...)\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/clean.py\n  Add clean_claude_output(...) when CLI details are known\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/cli.py\n  Replace fixed codex/gemini loop with selected-agent rotation\n- Update /home/svc-admin/ai-projects/projects/arena/README.md\n  Document --agents and Claude support\n- Optionally add tests if a test directory is introduced later; currently project has no tests.\n\nBackward compatibility recommendation:\n- Preserve current usage: arena \"topic\"\n- Preserve current codex/gemini flags for now.\n- Default --agents should be codex,gemini.\n- If user specifies claude in --agents and Claude CLI is unavailable, exit with a clear error.\n\nOperational notes:\n- Existing install flow uses scripts/install.sh\n- Installer creates .venv, installs editable package, and symlinks ~/.local/bin/arena\n- ~/.bashrc already contains: export PATH=\"$HOME/.local/bin:$PATH\"\n- arena --help works in a fresh sourced shell\n\nVerified behavior from this session:\n- arena live run worked with:\n  PATH=/home/svc-admin/.local/bin:$PATH arena --turns 2 --delay 0 --prefix-topic \"Debate the best local-first homelab mobile app\"\n- Output was clean and speaker-only.\n\nRecommended next-session execution plan:\n1. Inspect whether Claude CLI exists on machine.\n2. If present, determine non-interactive invocation and output shape.\n3. Refactor current two-agent logic into generic selected-agent rotation.\n4. Add Claude runner and cleaner.\n5. Update README examples to show:\n   - --agents codex,gemini\n   - --agents codex,claude\n   - --agents codex,gemini,claude\n6. Smoke-test at least:\n   - codex+gemini\n   - codex+gemini+claude if Claude is available\n\nKnown limitation to preserve in docs:\n- Arena is live per completed turn, not token-streaming.\n- Output cleaners may need maintenance if upstream CLIs change banner/noise format.\n\nThis memory is intended to let a future Codex session pick up the Arena project quickly and implement generic multi-agent support without rediscovering the current layout.","summary":"Project: Arena CLI\nPrimary path: /home/svc-admin/ai-projects/projects/arena\nPurpose: local CLI that makes AI CLIs talk to each other live in the terminal with clean speaker-only output.\n\nCurrent implementation status as of 2026-03-29:\n- Arena is built and installed.\n- Command is available via ~/.local/bin/arena\n- Project-local venv exists at /home/svc-admin/ai-projects/projects/arena/.venv\n- Current implementation supports exactly 2 hard-coded agents: Codex and Gemini.\n- End-to-end smoke test succeeded with live output.\n\nImportant files:\n- /home/svc-admin/ai-projects/projects/arena/README.md\n- /home/svc-admin/ai-projects/projects/arena/pyproject.toml\n- /home/svc-admin/ai-projects/projects/arena/scripts/install.sh\n- /home/svc-admin/ai-projects/projects/arena/src/arena/__init__.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/clean.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/agents.py\n- /home/svc-admin/ai-projects/projects/arena/src/arena/cli.py\n\nCurrent behavior:\n- User runs: arena \"<topic>\"\n- Arena alternates between codex and gemini for N turns.\n- It prints only clean lines like:\n  Codex: ...\n  Gemini: ...\n- Optional JSONL and text transcript output already exist.\n\nCurrent CLI flags:\n- positional topic\n- --turns\n- --first {codex,gemini}\n- --delay\n- --timeout\n- --codex-role\n- --gemini-role\n- --codex-model\n- --gemini-model\n- --codex-arg (repeatable)\n- --gemini-arg (repeatable)\n- --log\n- --transcript\n- --prefix-topic\n\nImplementation notes:\n1. clean.py\n- Contains output cleaning rules for Codex and Gemini.\n- Codex cleanup strips banner, model/provider/session lines, token footer, bubblewrap warning, and prompt echo.\n- Gemini cleanup strips keychain warnings and MCP refresh chatter.\n\n2. agents.py\n- Contains:\n  - ArenaError\n  - AgentReply dataclass\n  - ask_codex(...)\n  - ask_gemini(...)\n- Codex call uses: codex exec --skip-git-repo-check --color never -o <tempfile> ...\n- Gemini call uses: gemini -p <prompt>\n- Both support model override and raw extra args.\n\n3. cli.py\n- Current turn loop is pair-specific and not generic.\n- Prompts are assembled from:\n  - per-agent role instruction\n  - shared topic\n  - conversation so far\n  - direct prompt to reply as a specific speaker\n- The loop rotates between exactly codex and gemini.\n\nWhat user wants next:\n- Future version should support any combination of Codex, Gemini, and Claude.\n- User explicitly said they want support for any combination of Codex, Gemini, Claude, not just bolt-on Claude support.\n\nRecommended design for next session:\nRefactor Arena into a generic agent registry instead of hard-coded codex/gemini flow.\n\nTarget architecture:\n1. Introduce a generic agent definition layer.\n- Add an agent registry, either in agents.py or a new file such as src/arena/registry.py\n- Each agent entry should define:\n  - display name (Codex, Gemini, Claude)\n  - command runner function\n  - output cleaner\n  - optional model flag format\n  - extra arg passthrough handling\n  - default role text\n\n2. Replace fixed 2-agent CLI shape with selected-agent list.\nSuggested new flag:\n- --agents codex,gemini\n- --agents codex,claude\n- --agents gemini,claude\n- --agents codex,gemini,claude\nDefault can stay codex,gemini for backward compatibility.\n\n3. Replace pair-specific flags with per-agent generic options.\nPossible approach:\n- keep backward compatibility for current flags first\n- add new generic options for future scale:\n  - --agents codex,gemini,claude\n  - --role codex=\"...\"\n  - --role gemini=\"...\"\n  - --role claude=\"...\"\n  - --model codex=gpt-5.4\n  - --model gemini=gemini-2.5-pro\n  - --model claude=...\n  - --agent-arg codex=--foo\n  - --agent-arg gemini=--bar\n  - --agent-arg claude=--baz\nAlternative simpler v1 for refactor:\n- add explicit flags:\n  - --claude-role\n  - --claude-model\n  - --claude-arg\n- plus --agents list\nThat is less elegant but easier to implement quickly.\n\n4. Turn loop refactor.\n- Instead of toggling between two speakers, rotate through an ordered list of selected agents.\n- Example for --agents codex,gemini,claude with --turns 6:\n  - turn 1 codex\n  - turn 2 gemini\n  - turn 3 claude\n  - turn 4 codex\n  - turn 5 gemini\n  - turn 6 claude\n- Keep transcript format the same:\n  Speaker: text\n\n5. Prompt assembly.\n- Keep current prompt pattern.\n- For each agent call, include:\n  - that agent's role instruction\n  - shared topic\n  - full conversation so far\n  - explicit instruction like: Reply now as Claude.\n- Continue to keep replies short by default.\n\n6. Claude support prerequisites.\nBefore implementing Claude runner, verify there is an actual working Claude CLI on this machine and determine:\n- executable name\n- non-interactive prompt invocation format\n- model flag format\n- common noise that needs to be stripped\nIf Claude CLI is not installed, do the generic refactor first and leave Claude runner behind a missing-command error.\n\nSuggested code changes next session:\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/agents.py\n  Add generic registry and eventually ask_claude(...)\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/clean.py\n  Add clean_claude_output(...) when CLI details are known\n- Update /home/svc-admin/ai-projects/projects/arena/src/arena/cli.py\n  Replace fixed codex/gemini loop with selected-agent rotation\n- Update /home/svc-admin/ai-projects/projects/arena/README.md\n  Document --agents and Claude support\n- Optionally add tests if a test directory is introduced later; currently project has no tests.\n\nBackward compatibility recommendation:\n- Preserve current usage: arena \"topic\"\n- Preserve current codex/gemini flags for now.\n- Default --agents should be codex,gemini.\n- If user specifies claude in --agents and Claude CLI is unavailable, exit with a clear error.\n\nOperational notes:\n- Existing install flow uses scripts/install.sh\n- Installer creates .venv, installs editable package, and symlinks ~/.local/bin/arena\n- ~/.bashrc already contains: export PATH=\"$HOME/.local/bin:$PATH\"\n- arena --help works in a fresh sourced shell\n\nVerified behavior from this session:\n- arena live run worked with:\n  PATH=/home/svc-admin/.local/bin:$PATH arena --turns 2 --delay 0 --prefix-topic \"Debate the best local-first homelab mobile app\"\n- Output was clean and speaker-only.\n\nRecommended next-session execution plan:\n1. Inspect whether Claude CLI exists on machine.\n2. If present, determine non-interactive invocation and output shape.\n3. Refactor current two-agent logic into generic selected-agent rotation.\n4. Add Claude runner and cleaner.\n5. Update README examples to show:\n   - --agents codex,gemini\n   - --agents codex,claude\n   - --agents codex,gemini,claude\n6. Smoke-test at least:\n   - codex+gemini\n   - codex+gemini+claude if Claude is available\n\nKnown limitation to preserve in docs:\n- Arena is live per completed turn, not token-streaming.\n- Output cleaners may need maintenance if upstream CLIs change banner/noise format.\n\nThis memory is intended to let a future Codex session pick up the Arena project quickly and implement generic multi-agent support without rediscovering the current layout.","status":"active","namespace":"projects","namespace_name":"projects","namespace_tier":"shared","tags":[]}