A model "where expert switches are less necessary" is hard to tell apart from a model that just has fewer total experts. I'm not sure whether that will be a good approach. "How often to switch" also depends on how much excess RAM is available in the system to keep layers opportunistically cached from the previous token(s). There's no one-size-fits-all decision.
Very interesting! On what platforms can this run? If it can run on iOS, how would you handle attempts to access the file system or networking? Is this already wired in somehow? If not, is it easy to add custom handlers for these actions?
It is definitely not foolproof, but IMHO it is, to some extent, easier to describe what you expect to see than to implement it, so I don't find it unreasonable to think it might provide some advantages in terms of correctness.
In my experience, this tends to be more related to instrumentation / architecture than a lack of ability to describe correct results. TDD is often suggested as a solution.
It feels a bit like this to me. That's not to say LLMs should not have detected this, but I still feel like this fits the "vibes" the question gives, and some LLMs fall into that trap. Is it actually what's happening in the neural nets? Maybe not! But I always find it interesting, or at least entertaining, to approach those questions that way nonetheless, especially given the pattern-matching nature of LLMs.
The thing is that there is some overlap between trick questions and questions where the human is genuinely making a mistake themselves and where it would make sense for the model to step back and at least ask for clarification.
This is so elegant, especially with the art lights! To me, the desirable future for connected homes is one where technology is everywhere but mostly hidden and this is such a good example! This feels like an upgraded version of a chalkboard or sticky notes, but quite an optimal upgrade: one that makes it more useful and dynamic while keeping it mostly unintrusive.
On the subject of dedicated home control dashboards, I'm not sure I see their value at all given we all have screens in our pockets, so when it comes to enabling interactive controls I feel like using your existing devices or voice controls is the right approach.
I'm glad you asked because I must admit that in the last few weeks I totally thought this was just another agentic harness that happened to have a lot of extensions + ways to talk to it through messaging apps. So does this mean OpenClaw can connect to any agent? In that case I don't understand this part of the docs:
> Legacy Claude, Codex, Gemini, and Opencode paths have been removed. Pi is the only coding agent path.