Such an incredible essay (and book) and author. I highly recommend all of her essay writing. She captures the dark death of the California hippy era in a captivating and beautiful way.
Whenever I drive up the 5 leaving Los Angeles I get this strange sense of freedom, and I think about Joan Didion and Philip K. Dick.
Among many memorable passages, this is the one that has stayed with me most:
> “Right there you’ve got the ways that romanticism historically ends up in trouble, lends itself to authoritarianism. When the direction appears. How long do you think it’ll take for that to happen?” is a question a San Francisco psychiatrist asked me.
I gave up on that years ago. You can find guns that work but they're rare and crazy expensive. You can make an oil gun from a grease gun that works but that bitch will oil the machine and you and the floor and the wall.
I just swapped the fittings to get rid of the zerk nipples for something I could use with a standard oil fitting.
For machine tools I just use an oil can with a finer-than-normal tapered tip; it will depress the ball bearing in zerk/grease nipple fittings no problem, and it also works with the ball oilers typically found on lathes etc. You can also cut a tiny slit in the end if that helps get oil in: https://www.wentztech.com/metalworking/projects/convert-a-ch...
I'm biased by my preferred style of programming languages, but I think that pure statically typed functional languages are incredibly well suited for LLMs. The purity gives you referential transparency and static analysis powers that the LLM can leverage to stay on task and stay correct.
The high level declarative nature and type driven development style of languages like Haskell also make it really easy for an experienced developer to review and validate the output of the LLM.
Early on in the GPT era I had really bad experiences generating Haskell code with LLMs but I think that the combination of improved models, increased context size, and agentic tooling has allowed LLMs to really take advantage of functional programming.
From what I've heard—and in my own very limited experiments—LLMs are much better at less popular languages than I would have expected. I've had good results with OCaml, and I've talked to people who've had good results with Haskell and even Unison.
I've also seen multiple startups that have had some pretty impressive performance with Lean and Rocq.
My current theory is that as long as the LLM has sufficiently good baseline performance in a language, the scaffolding and tooling you can build around the raw code generation has an outsize effect. Languages with expressive type systems have a pretty direct advantage there: types can constrain the output and give your system immediate feedback, letting you iterate the LLM generation faster and at a higher level than you could otherwise.
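Concretely, the scaffolding can be as small as a check-and-retry loop where the compiler supplies the feedback signal. A minimal Haskell sketch, assuming a hypothetical `askLLM` stand-in for whatever model API you actually call:

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- Hypothetical placeholder for a real model call.
askLLM :: String -> IO String
askLLM prompt = pure ("-- model output for: " ++ prompt)

-- Generate a candidate module, type-check it, and feed any errors back
-- into the prompt until it compiles or we run out of retries.
generateChecked :: Int -> String -> IO (Either String String)
generateChecked 0 _ = pure (Left "out of retries")
generateChecked n prompt = do
  candidate <- askLLM prompt
  writeFile "Candidate.hs" candidate
  -- -fno-code type-checks without generating code, so the feedback is fast.
  (code, _out, errs) <- readProcessWithExitCode "ghc" ["-fno-code", "Candidate.hs"] ""
  case code of
    ExitSuccess -> pure (Right candidate)
    ExitFailure _ ->
      generateChecked (n - 1) (prompt ++ "\n\nFix these type errors:\n" ++ errs)
```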
I recently saw a paper[1] about using types to directly constrain LLM output. The paper used TypeScript, but it seems like the same approach would carry over to other typed languages. Approaches like that make generating typed code with LLMs even more promising.
Abstract:
> Language models (LMs) can generate code but cannot guarantee its correctness, often producing outputs that violate type safety, program invariants, or other semantic properties. Constrained decoding offers a solution by restricting generation to only produce programs that satisfy user-defined properties. However, existing methods are either limited to syntactic constraints or rely on brittle, ad hoc encodings of semantic properties over token sequences rather than program structure.
> We present ChopChop, the first programmable framework for constraining the output of LMs with respect to semantic properties. ChopChop introduces a principled way to construct constrained decoders based on analyzing the space of programs a prefix represents. It formulates this analysis as a realizability problem which is solved via coinduction, connecting token-level generation with structural reasoning over programs. We demonstrate ChopChop's generality by using it to enforce (1) equivalence to a reference program and (2) type safety. Across a range of models and tasks, ChopChop improves success rates while maintaining practical decoding latency.
As a huge proponent of constrained decoding for LLM reliability, I don't quite think it's the right approach for code. This is because in many programming languages, it is legal to use a function before its declaration. Since this is common in existing code, LLMs will also try to write code that way.
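Haskell is a simple example: top-level bindings can be used before they're defined, so a decoder that insists every prefix type-checks on its own would choke on perfectly ordinary code like this toy snippet (mine, not from the paper):

```haskell
-- `double` is referenced here before it is defined below; any prefix that
-- stops after `main` mentions a name that doesn't exist yet.
main :: IO ()
main = print (double 21)

double :: Int -> Int
double x = x * 2
```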
You might be right, but I think you have to take into account that (I think) you're not super familiar with these languages either, so you might not notice all the warts that a programmer with a lot of experience in them would, and you might overrate the skill of the LLM.
Nowadays I write C# and TS at work, and it's absolutely crazy how much better the LLM is at TS: almost all the code is decent on the first try, while with C# I need to do a lot of massaging.
I write Haskell professionally and I can tell you that Opus 4.5 can do a great job writing industrial Haskell code consistent with an existing code base.
I don't think it is capable of writing galaxy-brain Haskell libraries; it absolutely misses the forest for the trees there. But if you have an existing code base with consistent patterns it can emulate, it can do a surprisingly good job.
I built a library (without Claude) that wraps Servant and a handful of other common libraries used to build Haskell web apps in an opinionated way, and then let Claude use that to build this site. There is absolutely some hairy code and I have done a ton of manual refactors on what Claude produces, but Claude has been highly effective for me here.
You are right that there is significantly more JavaScript in the training data, but I can say from experience that I'm a little shocked at how well Opus 4.5 has worked for me writing Haskell. I'm fairly particular and I end up rewriting a lot of code for style reasons, but it can often one-shot an acceptable solution that is mostly in line with the rest of the code base.
True for now, but probably not a durable fact. Synthetic data pipelines should be mostly invariant to the programming language, as long as the output is correct. If anything, the additional static analysis makes these languages more amenable to synthetic data generation.
> Synthetic data pipelines should be mostly invariant to the programming language, as long as the output is correct.
Well, you can adapt your PHP producing pipeline to produce Haskell code that is correct in the sense of solving the problem at hand, but getting it to produce idiomatic code is probably a lot harder.
I think the trick with Haskell is that you can write types in such a way that the APIs that get generated are idiomatic and designed well. The implementations of individual functions might be messy or awkward, but as long as those functions are relatively small—which is how I tend to write my non-AI-based Haskell code anyhow!—it's not nearly as important.
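As a made-up illustration of what I mean by the types doing the design work (none of this is from a real code base): once the API nails down the shapes, the function bodies the model has to fill in are small and easy to eyeball.

```haskell
-- Newtypes keep the model from mixing up plain strings, and the
-- signatures say exactly what each function is allowed to do.
newtype UserId = UserId Int    deriving (Eq, Ord, Show)
newtype Email  = Email String  deriving (Eq, Show)

newtype Validated a = Validated a deriving (Show)

-- The only way to get a Validated Email is to go through this check.
validateEmail :: String -> Either String (Validated Email)
validateEmail s
  | '@' `elem` s = Right (Validated (Email s))
  | otherwise    = Left ("not an email address: " ++ s)

-- Registration demands a Validated Email, so a generated caller
-- cannot skip validation and still type-check.
register :: UserId -> Validated Email -> IO ()
register (UserId uid) (Validated (Email addr)) =
  putStrLn ("registering user " ++ show uid ++ " with " ++ addr)
```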
I've also had decent experiences with Rust recently. I haven't done enough Haskell programming in the AI era to really say.
But it could be that different programming languages are a bit like different human languages for these models: when they have more than some threshold of training data, they can express their general problem solving skills in any of them? And then it's down to how much the compiler and linters can yell at them.
For Rust, I regularly tell them to make `clippy::pedantic` happy (and tell me explicitly when they think that the best way to do that is via an explicit ignore annotation in the code to disable a certain warning for a specific line).
Pedantic clippy is usually too... pedantic for humans, but it seems to work reasonably well with the agents. You can also add `clippy::cargo`, which ain't included in `clippy::pedantic`.
> But it could be that different programming languages are a bit like different human languages for these models: when they have more than some threshold of training data, they can express their general problem solving skills in any of them? And then it's down to how much the compiler and linters can yell at them.
Exactly my opinion. I think the more you lock down the "search space" by providing strong and opinionated tooling, the better LLMs perform. I think of it as the difference between running something like simulated annealing blind to find a correct solution, versus running the same search with heuristics and bounds that narrow the solution space.
It's not just your bias, I too have found great success with a functional programming style, even from the earliest days of ChatGPT. (Not Haskell, but JS, which the models were always good at.)
I think the underlying reason is that functional programming is very conducive to keeping the context tight and focused. For instance, most logic relevant to a task tends to be concentrated in a few functions and data structures across a smallish set of files. That's all you need to feed into the context.
Contrast that with, say, Java, where the logic is often spread across a deep inheritance hierarchy located in a bunch of separate files. Add to that large frameworks that encapsulate a whole lot of boilerplate and bespoke logic, with magic being injected from arbitrary places via e.g. annotations. You'd need to load all of those files (or more likely, simply the whole codebase) and the relevant documentation to get accurate results. And even then the additional context is not just extraneous and expensive, but also polluted with irrelevant data that actually reduces accuracy.
A common refrain of mine is that for the best results, you have to invest a lot of time experimenting AND adapt yourself to figure out what works best with AI. In my case, it was gradually shifting to a functional style after spending my whole career writing OO code.
We are a 501(c)(3) and are actively fundraising to build a tower here in Shadow Hills, and we are launching our live stream and regular schedule February 1st. So far we have about 60 shows in the schedule.
If you're in Los Angeles and have an interest in radio, please hit me up.
I hear you but I think you are simply asking for an entirely different blog post. I don't think Verity's aim here is to give an introduction to `Selective`, but rather to introduce a formalization for it; something which has been notably missing for those who think about these sorts of things.
I understand the original Selective Functor, so an introduction to that is not what I'm after. I want to understand this new formalization, because it's the kind of thing I use, but I'm not a theoretician. If the goal of this post is simply to explain the formalization to the small number of people who are already deep into (category) theory, I guess it does a fine job. However, I think a better post would be more accessible.
I think the blog post does a good job describing the idea of Selective ("finite-case" etc.), but for me it falls apart shortly afterwards. If I were writing it, based on what I understood, I would start with the overview, then describe `CaseTree`, and then go into what abstractions it is an instance of.
As a small example of how I think the writing could be improved, take this sentence:
"This is in contrast to applicative functors, which have no “arrow of time”: their structure can be dualized to run effects in reverse because it has no control flow required by the interface."
This uses jargon where it's not necessary. There is no need to mention duality, and the "arrow of time" isn't very helpful unless you've had some fairly specific education. I feel it's sufficient to say that applicatives don't represent any particular control-flow and therefore can be run in any order.
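To make that concrete, here's a small self-contained sketch (the `select` signature matches the `selective` package, but everything is defined inline so nothing is hidden):

```haskell
-- Applicative: both effects always run, and nothing about the interface
-- forces one to happen before the other.
both :: Applicative f => f a -> f b -> f (a, b)
both fa fb = (,) <$> fa <*> fb

-- Selective adds exactly one thing: whether the second effect runs can
-- depend on the value produced by the first.
class Applicative f => Selective f where
  select :: f (Either a b) -> f (a -> b) -> f b

-- Run an effect only when an earlier computation said so; that data
-- dependency is the control flow Applicative alone can't express.
whenS :: Selective f => f Bool -> f () -> f ()
whenS cond act = select choose (const <$> act)
  where
    choose = (\b -> if b then Left () else Right ()) <$> cond
```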
There are a bunch of these scifi world building short video channels now. IMO they all seem really creative initially but rapidly lose their luster and become repetitive.
Sora makes the hard parts easy and the easy parts hard. I don't think any of these content producers will be remembered in the future. :/
Years and years ago I became friends with someone who has started a series of companies and created at least one game with each of them (web, mobile, mobile, crypto, web again). While watching him I learned the lesson that being an "idea guy" is worthless. It is all about execution. His ideas are great, in my opinion, though perhaps not unique. However, each success or failure has come down to the execution. A couple of projects ran out of funding (didn't execute fast enough). One was a Flash game around the same time Apple stuck a knife in Flash: bad timing. Another was backed by a major publisher, was largely a success, and the company was sold after two proven products shipped.
AI "democratizing" creativity is the biggest crock of lies. Everyone has ideas. Even people who aren't typically thought of as "creative". Ask anyone who watched the last season of Game of Thrones if they thought it could be better and I bet most of them will have "ideas" for how to make it better. Hell, the show runners had IDEAS. But the execution of season 8 was awful, and execution is where an "idea guy" becomes someone who created a product/story worth remembering.
LLMs remove the execution process, which is why they are so attractive to everyone who has ever had an idea and why they are abhorrent to nearly everyone who has ever executed on an idea. Lots of people think execution is just busy work, but execution is also a major component of being creative.
Creativity is a series of small decisions over the course of the entire execution. To write a poem is not to have it fully-formed in your head. You go down and edit and see what turns up and what new interesting ideas come out of that.
I'm very sympathetic to this view, and it would be a nice counter to claims of AI creativity, but I'm not convinced this is the only way creativity can express itself. There are examples of strokes of genius, hence the term.
I suppose you could say such strokes of genius are the outcomes of a lifetime of creative work but that seems different from your example of editing a poem.
We are operating deep in the grey area here so I suppose that case could be made. Personally, I see creativity as more of a life long process which can express itself in a multitude of ways including strokes of genius and the daily iterative grind. I don't think any creative act occurs in a vacuum but I also think that there are moments where big things occur.
AI, as it stands, does not have a lifespan over which for creativity to occur.
The problem runs deeper: AI doesn't just remove execution, it replaces it with probabilistic averaging. Mastery of execution in film or code consists of thousands of micro-decisions, like moving a light a bit to the left or holding a pause a bit longer. Current diffusion models make these thousands of decisions for you based on what usually looks good. Democratization won't come from a "Make it Beautiful" button that works; it will come when we have tools to control those thousands of micro-decisions without needing to learn to draw pixels by hand. Right now we have randomization, not democratization.
>While watching him I learned the lesson that being an "idea guy" is worthless. It is all about execution.
It's even trickier: execution is irrelevant too, if that means great execution (a polished, well-executed product). What matters is a product that works well enough for adoption, plus luck, connections, funding (to keep existing and undercutting), marketing, and things like that.
More than half of my YouTube subscriptions are channels that once posted neat stuff and stopped, most of them years ago. And before that I was into webcomics back when they were a popular online trend; I followed the updates of several of them, and most have disappeared these days (I don't even remember their names).
People stop doing things all the time; I'm not sure that means much.
Did you really interpret my statement as "AI content producers are being so creative and then quitting posting"?
I'm not talking about people posting good content and then stopping posting.
I guess it's fair to say that "become repetitive" was unclear; what I should have said was "reveals itself to be repetitive."
I'm saying that these AI generated world building channels produce lots of content that looks creative and exciting at first but over time reveals itself to be repetitive and lacking in creativity.
But your argument could be made against anything novel. You love the first 3 seasons of Chopped or Hell’s Kitchen, but eventually you figure out the repetitive story arc, you know how each show will unfold halfway through, the same kinds of conflict, etc. The show either becomes background while folding clothes or you stop watching entirely. The novelty wears off, for better or worse.
I mean, look at House Hunters International. Every segment is the same. “I need an extra room for overseas family, I want to live by the beach, I need a roof deck, but I teach ESL to blind monks and children. My budget is $400.” And then they’ll have some silly hang-up about the reliability of elevators in general, or maybe they absolutely can’t get over the east-facing window. It’s 100% formulaic yet perfect as background.
I dunno where I’m going with this but those AI videos you say you liked… the novelty wore off.
Wow, I wish they had announced this sooner. I just ordered a keyphone, but this looks way more suited to my use case. I just want a basic feature phone + QWERTY keyboard + Signal + WhatsApp.
I've been using a Light Phone for 3 years, but I can't stand the touch screen, and only having SMS is annoying.
What do you use for maps? Or paying for parking, which maybe isn't an issue for you, but in my city it requires a smartphone app. What about music and podcasts? Asking because I would like to use a dumb phone if possible, but it seems like it would actually introduce a lot of friction into daily life.
Those tend to hold "on average" for a population but often don't hold for an individual within that population. This is the ecological fallacy [0], just one of the fallacies underlying psychiatry.
My argument isn't that psychiatric symptoms don't exist or aren't real and there is no real underlying phenomenon. My argument is simply that we've drawn the lines between the units of study too high up and we should be more granular. This level of nosology was chosen in 1952. Do you really think they got it 100% right almost 75 years ago? And what is the mechanism for defining and maintaining these categories? A bunch of committees get together every few years and decide on them, then they tell us all what's "true". Bullshit. What are the odds that a committee will define itself out of existence? Pretty slim. [1]
I have traits that could be considered autism, ADHD, obsessive-compulsive personality disorder, PTSD, bipolar II, social anxiety disorder, and probably a dozen more disorders. But by quantizing the disorder at the current level, by necessity, the other traits are cropped out of view. Relevant information is lost and irrelevant information is blurred together. And the level of overlap between disorders is absurd. They cannot possibly be "real" because the lines between them aren't even distinct.
The useful unit to study is the individual trait, not the cluster of traits that is different in each individual. The traits are more granular and map more closely to the underlying biology anyway. The current model is akin to what the geocentric model was in astronomy. It's outdated, wrong, and holding us back from a more accurate, detailed view.
> My argument is simply that we've drawn the lines between the units of study too high up and we should be more granular.
I agree with this, and your overall post. I’ll just add that if the purpose is treatment, it helps to find root causes, and maybe there’s a common thread in the underlying root causes, likely related to gene expression.