sundarurfriend's comments

It's hard to decouple them as primary vs secondary, because Julia is pretty central to what they're doing here. To my understanding, all the actual calculations this is based on are in Julia; Dyad is basically a layer above it that provides a declarative interface, AI integration that understands that interface, and a GUI that makes it even easier (than the declarative language) to input the model. So funding for Dyad has pretty heavy incentives to go towards improving the Julia ecosystem, because that's where its foundations are.

The paywalled (or subscription-walled) portion isn't too long; it's a pretty small article. Here it is: https://removepaywalls.com/https://www.axios.com/2026/04/30/...

Does anyone have a general idea of whether $65m is typical, or larger or smaller than the usual funding amounts, for this kind of industry-targeted "boring" software?

Despite the framing, I think Dyad's role is more to fill in the areas where Simulink is a pain to use and has been wrangled into shape for lack of better options, than to replace it. The agentic part can be a big pull though, if they can get it to reliably produce what the user (the engineer, say) asked for without spending more time correcting it than they'd have spent writing or laying it out themselves. That seems plausible because this is a specialized, niche-purpose AI, but I'm still not 100% certain it can get there.


I don't know the actual Christian theology, but at least in modern popular interpretations, Lucifer is the Angel of Independence, so that would suggest no!

Scarcity: Why Having Too Little Means So Much is a pretty good book about the psychology of this. The stronger the necessity for saving (whether from poverty or external influence, like here), the deeper it gets embedded in your psyche, and it can start to feel like "this is just who I am": the habits around it become something you see as intrinsic to your personality.

As an English-as-a-second-language speaker and writer, one thing Grok really shines at is capturing the tone and level of "formality" of a piece of text and then replicating it correctly. It seems to understand the little human subtleties of language in a way the other major providers don't. ChatGPT goes overly stiff and formal-sounding, or ends up in a weird "aye guvnor" type of informal language (Claude is sometimes better, but not always).

Grok seems in general better at being "human" in ways that are hard to define: e.g. if I ask it "does this message roughly convey things correctly, to the level it can given this length", it will likely answer like a human would (either a yes or a change suggestion that sticks to the tone and length), while ChatGPT would write a dissertation on the message that still doesn't clear anything up.

Recently I've noticed that Grok seems to have gotten really good at dictation too (that feature where you click the mic to ask it something). ChatGPT has maybe 90-95% accuracy with my accent, and the speech input on Android's Gboard something like 75%; Grok surprisingly gets something like 98% of my words correct.


I did a quick eval comparing Grok 4.3, Opus 4.7 and GPT 4.1 and they actually seem pretty similar:

https://ofw640g9re.evvl.io/

They all did pretty well at a more "formal" tone, but GPT 4.1 was the only one that didn't make me cringe with a "casual" tone.

[edit] fwiw, grok was also the fastest+cheapest model, claude was slowest and priciest.


This is the most basic level of eval: whether they can produce output that someone somewhere (usually a young urban US American) would consider informal in tone. Real human communication is far more nuanced than this; different groups have different linguistic registers they're used to, and things outside them sound odd even if people can't articulate why. You could also want to be informal but not over-familiar with the other person (e.g. in a Discord chat with a new acquaintance); actually, looking at the outputs here, the Claude output seems to fit that better (in my subjective view anyway) than the prompt you gave it. Or you could want many other little variations.

What makes one cringe and another recognize as familiar and comfortable is also pretty subtle and hard to define. These things need nuanced descriptions and examples to actually get right, and it's in understanding those nuances and figuring out the register of the examples that Grok outshines the others.


you said that English is not your first language, so heads up - you don't need "for" when you use "e.g.", it already means "for example".

You presumably do have English as a first language so you should know that sentences begin with capital letters.

Was that a helpful and interesting conversation?


That's Grok 4.2 not 4.3 right?

And why are you comparing to GPT 4.1? (As opposed to one of the six or so model releases since then; I would have expected GPT 5.5.)


Good catch, there was an issue with the second hardest thing in programming (caching).

Here's an updated eval with the proper models https://a3bmfqfom3.evvl.io/


Claude 4.7 is the clear winner to me for manager and formal report updates.

As an ex-senior exec (hundreds of staff), the bolded timeline impact is a particular nuance that I would expect a Lead/Director to format for a VP+ audience. Interesting none of the other models did that. My eyes immediately went to impact statement, then worked back to context to grasp the whole situation.


Thanks. From where I'm looking, Grok 4.3 and Claude 4.7 do a better job on the informal close friend/coworker vibe.

ChatGPT sounds fake, with formal phrasing (for the specific close-friend context), and it has em-dashes and uses capitalization. Hence, ChatGPT does not, imo, grok the assignment ;)


Is it me or did GPT get noticeably more natural in word choice recently? You can see it between 4.1 and 5.5 here, but I'm not sure when that happened. (My guess would be one of the recent 5.x releases.)

Edit: I meant specifically the absence of bizarre phrasing. That seems to have improved.


Wow, I'm surprised. Grok 4.3 actually is noticeably better than the other two for the close-friend variant. Surprisingly I found Claude the cringiest of the three!

I know it's just an evaluation, but seeing an informal message, and a prompt asking to rewrite that informal message into the tone of an "informal message" when the original already sounds just fine, makes me sad... Not because of this evaluation, but because it reminds me that this is how some people use LLMs: basically asking them to remove your own voice from texts that are generally fine already.

My sister-in-law is a pharmacist and the heaviest non-dev ChatGPT user I know, and her main use case is writing professionally polite messages to doctors explaining how the drugs they prescribed to a patient would have killed them had she not caught a particular interaction or common side effect.

There's a lot of "tone" in it: she's not trying to anger these folks, but it's also quite serious, and then there's just everything else happening in medicine.

Feels like a great use.


Pretty neat. This kind of tone self-moderation comes naturally to good communicators, but I know people (on and off the spectrum) who really, really need help with this, and it's cool to see LLMs are able to do this. There are a surprising number of people in the business world who are just totally unable to tone-police themselves. In the medical field I'd be worried about hallucinations, of course, but presumably your SIL fact-checks the output.

She does herself a disservice by outsourcing that skill. One day she might have to actually talk to one of these people.

She's 50 years old, has a doctorate in pharmacy, and has worked as a hospital pharmacist for two decades.

I don't say this as a "gotcha", but more that even with all that experience she still finds it beneficial and helpful.


That makes it more sad, to me. Someone with those credentials should be able to communicate with their colleagues effectively. I wonder if she used to be able to.

It appears Hacker News disagrees that social skills are valuable skills. Mea culpa, I should have guessed.


There's something ironic about complaining about other people's social skills while you couldn't be bothered to make a point without sounding dismissive and condescending.

Navigating tough conversations takes time, attention, and mental energy. I’d rather a pharmacist spend that time on catching another dangerous contraindicated combo of drugs for a different patient. Actually, AI should soon be checking for that, too.

All three did well, and while I'm a Claude user, I found the Opus reply here added some unnecessary detail, like "Impact: Minimal; no downstream dependencies are currently at risk". Downstream dependencies weren't mentioned in the original message; for all we know downstream could be relying on a poorly performing API and is impacted by waiting another week for replacement.

Seeing this makes me wonder if Grok uses Claude conversations for training.

It's otherwise kind of surprising that they both converge on very similar phrases (e.g. "API integration is kicking my ass") that aren't anywhere in the prompt.


Elon testified this week that SpaceTwitter is indeed distilling from OpenAI and others.

All of these were frankly terrible. I guess Grok’s “informal” version sounded the most like a real human, but only because it reads exactly like an Elon tweet (including his favorite emoji!). It’s obvious what they’ve been training on.

GPT 4.1? Why not a 5-class model?

I've also noticed that when I communicate with Grok in my native language, its tone is more natural than other models. I think this is due to the advantage of being trained on a large amount of Twitter data. However, as Twitter contains more and more AI-generated content now, I'm afraid continued training will make it less natural.

The causation could also be the other way round.

Twitter language has started seeming like normal casual language to us, rather than us using normal casual language on Twitter.


Sadly, it's more likely that people will just start talking like bots

I've seen this expressed as a concern even from one of my colleagues. My retort was:

"English is not my native language and LLMs taught me quite a few very useful formalisms that do land well for people and they change their attitude towards you to be more respectful afterwards. It also showed me how to frame and reframe certain arguments. I agree sounding like an LLM is kind of sad but I am getting a lot of educational value -- and with time I'll sneak my own voice back in these newly learned idioms and ways to talk."


Since you seem interested in the ins and outs of English, I want to say that "retort" has a connotation of anger or sharpness. Your response reads more like a "rebuttal" to me.

This is not a correction; maybe retort is what you meant and I'm not trying to be the English police. I just like discussing the intricacies of language :)


Actually super helpful, thank you!

Like most widely spoken languages, English has a lot of regional variation. There are even a bunch of quizzes online where you answer 20 questions about phrasings, and they can tell you where you're from with a disconcertingly high degree of accuracy.

In my experience a "retort" is sharp or witty, but certainly not angry, whereas the word "rebuttal" is itself essentially antagonistic. You might use it when referring to something or someone that you look down upon, whereas a more neutral term would simply be "response."


Just personally I tend to regard retort as short and reactive while rebuttal as a longer and more considered disagreement. A retort could be defensive and wrong or it could be sharp and insightful - it doesn't imply one or the other. A rebuttal is mostly an attempt to correct something while a retort doesn't need to be a correction (although it could).

Even something like "piss off!" could be a retort, but usually never a rebuttal :)


Just as I was reading your comment, I remembered that Samuel L. Jackson used "retort" in his speech in the "Pulp Fiction" movie, and I was wondering whether he was being openly antagonistic there (I mean, he killed a bunch of guys with a pistol shortly afterwards, but still) or whether it was a witticism.

I admit I am lost on these nuances and I usually kind of use whatever idiom comes to mind, which yes, likely would net me some weird looks depending on where I am geographically.


It's impressive that you've even managed to use an em-dash in spoken language. /s

I did spot the /s but it's not relevant: I use two normal dashes actually. :)

You're absolutely right!

So human language will improve and become more precise? I'm all for it, especially if we get more emojis in speech! Why the "sadly"? Humans will learn to imitate their more intelligent betters.

There was already evidence last year[1] that pointed to ChatGPT-specific words like "meticulous," "delve," etc becoming more frequently used than they were previously. The linked study used audio of academic talks and podcasts to determine this.

[1] https://arxiv.org/abs/2409.01754


Part of me wanted to object to those two examples, which I've used frequently since reaching adulthood in the 80s. Another part of me has been triggered by an apparent uptick in the word "crisp", which my gut takes as a coding-LLM tell.

Opus 4.7 loves to use the word "substrate" whenever it gets the chance; it's a really weird tic. How do these models end up with these sorts of behaviors?

Did you try meta? I was into grok but now meta works well for me

I'm sure Twitter knows which are the bot accounts and is surely excluding them from their model training. Twitter bots aren't a new phenomenon after all.

I don't think Twitter/X know for sure who the bots are, since Elon has been pretty vocal about trying to stop them for ages, yet I still get lots of spam DMs (as do others with far fewer followers/reach).

Even if 95% of the spam gets actively reported and dealt with, that still leaves a ton of nonsense on the platform, getting fed into the LLM. And spam has only gotten worse over the years, as the barrier to entry has lowered and lowered.


"Elon has been pretty vocal about trying to stop them for ages"

Elon lies a lot. Like ALL THE TIME.


Are the spam DMs advertisements or more generally something linked to a product or service? I wouldn't be surprised if X is more lenient towards bots that pay them for adverts.

Most of what I get seem to be advertisements or automated messages if you follow large(r) accounts.

One of the most interesting things that I've noticed is these advertisements will be triggered if you follow accounts that are positioned as influencers. I followed one out of curiosity and received a DM from that account advertising some cryptocurrency service.

It's a good way to filter out and block accounts that have almost certainly not grown organically.


I'd have guessed that at least some of the bots are Twitter itself, trying to draw you in with some sense of engagement. Given that Musk is the owner, and everything we know about him and have seen him do, I'd not be surprised if some of the MAGA bots are his too.

>Elon has been pretty vocal about trying to stop them for ages

You know people lie, right? Especially when the lie casts them in a better light and/or makes them more money.


Elon lied on record many times, admitting to the lies only when forced, under oath.

Highly doubtful, seeing as my 14-year-old Twitter account got caught in a recent bot-ban wave with no means of contacting a human for recovery.

There are bots everywhere; it has nothing to do with the platform. It has to do with attackers having an incentive to do mass account farming, and no platform is secure against it.

Super easy, just make a web-of-trust type of thing: messages are only visible to those who have already vouched for you. Otherwise, you pay $0.01 per message per user reached.
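The rule above can be sketched in a few lines. This is just an illustration of the proposed scheme, not a real API; the function name, the data shapes, and the flat $0.01 fee are all assumptions for the sake of the example:

```python
# Illustrative sketch of the web-of-trust pricing rule described above.
# All names here are hypothetical; the fee is the $0.01 from the comment.
COST_PER_MESSAGE_PER_USER = 0.01

def delivery_cost(sender, recipients, vouches):
    """Return the fee the sender pays to reach `recipients`.

    `vouches` maps each user to the set of users they have vouched for.
    Recipients who already vouched for the sender see the message for
    free; every other recipient costs the sender $0.01.
    """
    unvouched = [r for r in recipients if sender not in vouches.get(r, set())]
    return len(unvouched) * COST_PER_MESSAGE_PER_USER
```

So a spammer blasting a million strangers pays $10,000 per message, while messages inside an established web of trust stay free.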

How would that solve it? If I pay, I can still push the content I want (factual or not) which is the same equivalent as paying for accounts directly.

By buying accounts, you are buying reputation. By paying for the posts, you are maybe paying for reach at first, but (a) it will be costly and (b) it does not guarantee that the reached ones will spread anything further.

With banning and deboosting, they need to be very accurate, but with filtering they can be more liberal about excluding.

not really. there are easy heuristics to filter out bots with good confidence. FWIW i don't see any bots posting anything in my feed

Yes, your individual feed isn't really relevant if we're talking about the masses. Reddit accounts are for sale quite cheap, HN as well, X too, and so on; it's literally just a matter of means/methodology. If I wanted to do 1000 random posts today talking about a certain thing, I could.

my individual feed does matter, because it shows that it is possible to curate something without bots, which is obviously what xAI would do

congratulations, you have solved anti-scam. go make your billion since it's easy.

it's easy to solve at the offline level, where you have time to filter things out. in fact this is already done in pre-training by OpenAI and other companies.

you think it's hard?


Yes I think it's hard.

OpenAI has already been proven to be easily gamed through very unsophisticated poisoning (fake information in a web page plus an edit to a wiki page pointing at it, or fake information in a reddit post), so I'm not sure we should hold up their efforts at data cleaning as a gold standard.

https://www.sei.cmu.edu/blog/data-poisoning-in-ai-models-the...


> As an English-as-second-language speaker and writer

How do you know it's actually better? I'm not trying to be condescending, but this reads to me like vibes :)


A friend of mine uses it for D&D prep and has told me that it's good for that in particular because of its ability to match the flavor/style that he's going for. He prefers ChatGPT for everything else.

I only use Grok through the "Gork" personality in the Tesla, but find its responses to be very realistic, often genuinely funny, and occasionally useful.

Do you use its unhinged mode? It can be hilarious but tiresome after a little while.

We tried it, it was fun. Conspiracy mode just sounds like talking to my kids.

This is more of a user preference. When I want to be informed my default is that chat bots should imitate the tone of Wikipedia. Not informal, but somewhat academic and in-depth. I don’t like it when chat bots explain things like an average human without pedagogical training: meandering, in the wrong order, and often having to repeat themselves.

anecdata: The responses of Grok on X in my language are really good. The tone, sarcasm, and level of "vulgarity" in its responses are so accurate that they seem like they were written by a human.

This whole thread sounds like a grok astroturf campaign

So you're saying it groks you better?

[flagged]


Isn't it exhausting to view everything through an ideological lens instead of reviewing technical achievements on their merits?

There are limits to being willing to overlook ideology.

So tired of this "reacting to a dude who built a CSAM generator is the real cringe" horseshit from people who know exactly what they're covering for.

It's very exhausting! But Elon Musk chose to leverage his fortune from Tesla and SpaceX into an ideological project to destroy a lot of things I care about, so he's left me no choice. If he'd like people to review his work on its technical merits, shouldn't he at the bare minimum apologize and promise not to do it again?

The hitler Grok? What? I genuinely don't understand what you're trying to say in this comment.


He's equating Grok to Hitler which is absurd. If you want to speak with the führer you need to visit https://hitler.ai

Close enough—Grok called itself "MechaHitler" (a link was posted).

I've been feeling more optimistic about Mozilla recently than I had in years, since their language in communication seems to have shifted from a Stepford-ish tone of corporate speak to something that feels more authentic and closer to their roots. I don't know if it's the new CEO, or a general cultural shift. (Or just me projecting from little intangible bits of evidence to something I hope for!)

Hearing about a positive personnel shift like this now gives me a bunch more optimism. I really hope I can go back to the days of unambiguously supporting Mozilla and their many awesome efforts, without always having to be a bit dubious about their next (mis)step.


That is a useful guide in terms of the personal psychology of how to go about doing it, which is an important side of it, thank you.

I'm also interested in the mechanics of how you actually do it: e.g. your mention of paper maps for travel makes me wonder if a lot of that is workable because you're in planned cities with reliable maps. I'm in a mid-sized town in India where maps are vague guides to the general layout, but are missing the many, many alleys and connecting roads that people actually live on (or have shops at). Roads, road names, traffic restrictions - pretty much every part of it is chaotic and incredibly hard to put together without a GPS on a digital map.

On the family aspect too, do you have a Matrix server or similar for the larger family to connect through and share news on (their own travel, e.g., or difficulties they might be having, or news like a childbirth), or do you only use phone calls and texts to connect?

In any case, I can definitely relate to:

> even worse, you are mentally always ready to be contacted, for a new dopamine hit of information or a new decision to make.

and feel the negative effects of that, so I'll be moving actively towards what you're suggesting. Maybe to a different point on the line and with different workarounds, but it sounds at least 90% workable and with significant benefits too.


> Roads, road names, traffic restrictions - pretty much every part of it is chaotic and incredibly hard to put together without a GPS on a digital map.

If digital maps on GPS know about directions, then so does the internet, and the directions can be printed or jotted down in advance, which is my go-to solution in new cities. A little trip planning makes trips safer and less stressful. You also end up memorizing the area faster. Regularly using a GPS provably atrophies parts of our brains, per MRI scan studies. We evolved to regularly reason about our position in the physical world and to make decisions about where to turn from our own memories.

> On the family aspect too, do you have a Matrix or similar for the larger family to connect through and share news on (their own travel for eg., or difficulties they might be having, or news like child birth), or do you only use phone calls or texts to connect?

We helped move all our family to Matrix. Most also use Facebook, but everyone worth talking to understands it is not reasonable to ask us to agree to Zuck's terms of service to talk to them. They probably created Facebook accounts in the first place for the same reason, so we do not feel bad about this ask.

That said we also ported our cell phone numbers to a voip provider so we can still access calls/texts from any wifi device, or DECT phones around our home.


I assume the word is in there for the sake of people who don't know what a honeypot is. It gets them curious that law enforcement set up something fake, even if they don't immediately know what it is for.

Do you have a blog post or similar that describes how you do this?

see my reply to sibling comment

tree-sitter's design has potential, but my impression is that even after all these years, it is yet to be realized. The speed claims turned out to be largely overstated in practice for the general variety of usage (rather than single-task benchmarks or special cases). And the claim for the grammar system was that, given such a coherent system rather than the much-hated regex parsing, people would be able to write better grammars that are less prone to edge-case problems and less buggy. Maybe that's true in cases like this, where someone gets paid to write the grammar and maintain it, but in most common cases the actual quality of the grammars turns out to be much the same, with more possibility of regression or breakage. It's possible that in ten years' time, tree-sitter will clearly be the way to go, with more polish all around, but at this point it doesn't feel like an easy strong recommend over the traditional parsing systems.


I like the very idea of tree-sitter, and even the first talk video by the creator was interesting to listen to. However, it has been a big barrier for me to write a grammar for it for a custom lisp-based DSL used in industry (called SKILL; think lisp but with support for both C-style and lisp-style syntax). The regex-based syntax shines here, since iterating on it doesn't require a recompile and the rules are incremental and independent, compared to the hierarchical syntax-tree approach.

