Hacker News

Good to see competition for Codex. I think cloud-based async agents like Codex and Jules are superior to the Claude Code/Aider/Cursor style of local integration. It's much safer to have them completely isolated from your own machine, and the loop of sending them commands, doing your own thing on your PC, and then checking back whenever is way better than having to set up git worktrees or any other kind of sandbox yourself.


Codex/Jules are taking a very different approach than CC/Cursor.

There used to be this thesis in software of [Cathedral vs Bazaar](https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar). The modern version of it is that you either 1) build your own cathedral and bring the user to your house. It is a more controlled environment and deployment is easier, but the upside is more limited, and it also suggests the model can't perform out-of-distribution. OpenAI has taken this approach for all of its agentic offerings, whether ChatGPT Agent or Codex.

2) The alternative is the Bazaar, where you bring the agent to the user and let it interact with 1000 different apps/things/variables in their environment. It is 100x more difficult to pull off, and you need better models that are more adaptable. But the payoff is higher. The issues that you raised (env setup/config/etc.) are temporary and fixable.


This is the actual essence of CATB; it has very little to do with your analogy:

-----

> The software essay contrasts two different free software development models:

> The cathedral model, in which source code is available with each software release, but code developed between releases is restricted to an exclusive group of software developers. GNU Emacs and GCC were presented as examples.

> The bazaar model, in which the code is developed over the Internet in view of the public. Raymond credits Linus Torvalds, leader of the Linux kernel project, as the inventor of this process. Raymond also provides anecdotal accounts of his own implementation of this model for the Fetchmail project

-----

Source: Wikipedia


While the GP is completely off-base with their analogy, the Wikipedia summary is simplified to the point of missing all the arguments made in the original essay.

If you're a software developer and especially if you're doing open source, CATB is still worth a read today. It's free on the author's website: http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral...

From the introduction:

> No quiet, reverent cathedral-building here—rather, the Linux community seemed to resemble a great babbling bazaar of differing agendas and approaches (aptly symbolized by the Linux archive sites, who'd take submissions from anyone) out of which a coherent and stable system could seemingly emerge only by a succession of miracles.

> The fact that this bazaar style seemed to work, and work well, came as a distinct shock. As I learned my way around, I worked hard not just at individual projects, but also at trying to understand why the Linux world not only didn't fly apart in confusion but seemed to go from strength to strength at a speed barely imaginable to cathedral-builders.

It then goes on to analyze why this worked at all, and if the successful bazaar-style model can be replicated (it can).


Cursor now has “Background Agents” which do the same thing as Codex/Jules.


CATB was about how to organize people to tackle major community/collaborative efforts in a social system that is basically anarchy.

Both situations you've described are Cathedrals in the CATB sense: all dev costs are centralized and communities are impoverished by repeating the same dev work over and over and over and over.


Can you elaborate on how Codex vs. CC maps onto this cathedral vs. bazaar dichotomy? They seem fairly similar to me.


of course,

cathedral = a sandbox env in the provider's cloud, so [codex](https://chatgpt.com/codex) uses this model. Their codex-cli product is the Bazaar model, where you run it on your computer, in your own environment.

Claude Code, on the other hand, doesn't have a cloud-based sandboxing product; you have to run it on your computer, so it's the bazaar model. You can also run it in ways Anthropic never envisioned (e.g. give it control of your house). Cursor also follows the same model, though they have been trying to get into the cathedral model with their Background Agents (as someone also pointed out below), presumably so as not to lose market share to Codex/Jules/etc.


Claude Code does have remote sandboxing, and it’s better & more enterprise ready than any of these alternatives.

Can deploy as a github action right now.

Tag it in any new issue, pr, etc.

Future history will highlight Claude Code as the first true agent form. These other approaches are not intuitive enough for the evolution of an OS-native agent into eventual AI robotics.


It's safer to have them completely isolated, but it's slower and more expensive.

Sometimes I realize that CC is going nuts and stop it before it goes too far (and consumes too much). With this async setup, you may come back after a couple of hours and see utter madness (and millions of tokens burned).


Completely agree. I also want to tightly control the output; the more it just burns and burns, the more I become overwhelmed by a giant pile of work to review.

A tight feedback loop is best for me. The opposite of these async models. At least for now.


You need a supervisor agent to periodically check the progress and `if (madness) halt(1)`
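A minimal sketch of what that watchdog could look like. Everything here is illustrative: the token-usage log format, the budget, and the polling interval are assumptions, not any real agent's API — the point is just the "poll spend, kill on runaway" loop.

```python
import subprocess
import time

# Hypothetical limits -- tune to taste; none of these map to a real product's API.
TOKEN_BUDGET = 200_000
CHECK_INTERVAL_S = 30


def tokens_used(log_path: str) -> int:
    """Sum token counts from a hypothetical usage log (one integer per line)."""
    try:
        with open(log_path) as f:
            return sum(int(line) for line in f if line.strip())
    except FileNotFoundError:
        return 0  # agent hasn't spent anything yet


def supervise(agent: subprocess.Popen, log_path: str) -> None:
    """Periodically check progress; if (madness) halt(1)."""
    while agent.poll() is None:
        if tokens_used(log_path) > TOKEN_BUDGET:
            agent.terminate()  # the halt(1)
            raise SystemExit(1)
        time.sleep(CHECK_INTERVAL_S)
```

In practice you'd probably want the supervisor to look at diffs or test results too, not just spend, but a hard token ceiling alone already caps the "millions of tokens burned" failure mode.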


I think the Github-PR model for agent code suggestions is the path of least resistance for getting adoption from today's developers working in an existing codebase. It makes sense: these developers are already used to the idea and the ergonomics of doing code reviews this way.

But pushing this existing process - which was designed for limited participation by scarce humans - onto the use case of managing a potentially huge reservoir of agent suggestions is going to get brittle quickly. Basically, more suggestions require a more streamlined and scriptable review workflow.

Which is why I think working in the command line with your agents - similar to Claude and Aider - is going to be where human maintainers can most leverage the deep scalability of async and parallel agents.

> is way better than having to set up git worktrees or any other type of sandbox yourself

I've built a helper library that does this for you, for either Aider or Claude: https://github.com/sutt/agro. And for FOSS purposes, I want to prevent MS, OpenAI, etc. from controlling the means of production for software, where you'd need to use their infra just to sandbox your dev environment.
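The core trick these helpers automate is one worktree per agent run, each on its own branch, so parallel agents never step on each other. A rough sketch of that idea (branch/path naming here is made up for illustration; agro and similar tools have their own schemes):

```python
import subprocess
from pathlib import Path


def make_agent_worktree(repo: Path, task: str, base: str = "HEAD") -> Path:
    """Create an isolated git worktree on a fresh branch for one agent run.

    The agent gets its own checkout next to the main repo; merging back
    is a normal branch review. Naming convention is hypothetical.
    """
    branch = f"agent/{task}"
    path = repo.parent / f"{repo.name}-{task}"
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base],
        check=True,
    )
    return path
```

Cleanup is the mirror image: `git worktree remove <path>` plus deleting the branch once the agent's output is merged or rejected.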

And I've been writing about how to use CLI tricks to review the outputs on some case studies as well: https://github.com/sutt/agro/blob/master/docs/case-studies/i...


FWIW you can run Claude Code async via GitHub Actions and have it work on issues that you @-mention it from - there's even a slash command in Claude Code that will automatically set up your repository with the GitHub Action config to do this.


I agree, but I just love the codex-1 model that powers Codex, and I see 2.5 Pro as inferior.

It's interesting that most people seem to prefer coding locally. I love that it allows me to code from my mobile phone while on the road.


What kind of things are you coding while “on the road”? Phone addiction aside, the UX of tapping prompts into my phone and either collaborating with an agent, or waiting for a background agent to do its thing, is not very appealing.


Mainly thinking about what the minimum testable changes are that I can give to Codex to work on in the background.

Tapping the prompts in is the easy part, but the async model is a different way of working; I feel more like a manager than a co-developer.


hey I am exactly the same, is there a way to reach out to you? I would love to chat more about mobile coding


I also just got an email tonight for early access to try CC in the browser. "Submit coding tasks from the web." "Pick up where Claude left off by teleporting tasks to your terminal" I'm most interested to see how the mobile web UI/UX is. I frequently will kick something off, have to handle something with my toddler, and wish I could check up on or nudge it quickly from my phone.


Getting the environment set up in the cloud is a pain vs. just running in your own environment, imo. I think we'll probably see both for the foreseeable future, but I am betting on the worse-is-better of CLI tools and IDE integrations winning over the next 2 years.


It took me like half an afternoon to get set up for my workplace's monorepo, but our stack is pretty much just Python and MongoDB so I guess that's easier. I agree, it's a significant trade-off, it just enables a very convenient workflow once it's done, and stuff like having it make 4 different versions with no speed loss is mind-blowing.

One nice perk on the ChatGPT Team and Enterprise plans is that Codex environments can be shared, so my work setting this up saved my coworkers a bunch of time. I pretty much just showed how it worked to my buddy and he got going instantly


It's surprisingly good. If you try Copilot on GitHub, it has had no issues setting up temporary environments, every single time in my case.

No special environment instructions required.


It has depended heavily on the project. New SPA for the web? No problem. Nontrivial application with three services each with their own container in a monorepo? ML inference code that requires cuda hardware? No chance.


With something like GitHub Copilot coding agent it's really not - the environment setup is just like GitHub Actions.


> I think cloud-based async agents like Codex and Jules are superior to the Claude Code/Aider/Cursor style of local integration

Ideally, I feel like a combination of both would be a productive setup. I prefer the UI of Codex, where I can hand off boring stuff while I work on other things, but the machines they run Codex on are just too damn slow: compiling Rust takes forever, and it needs to continuously refetch/recompile dependencies instead of leveraging caching, compared to my local machine.

If I could have a UI + tools + state locally while the LLM inference is the only remote point, the whole workflow would end up so much faster.



