Hacker News | paxys's comments

AI is an area where having decades of private data hosted and indexed by a third party is actually paying off with a direct return (vs just using it to surface ads). All moral qualms about FOSS and whatever else aside, asking a question in plain English and having an "AI assistant" dig through years' worth of photos, emails, events, chats, restaurant reservations and more and return an incredibly detailed answer that no person ever could feels like the magic of tech being realized in front of our eyes.

Would I prefer this was all open technology instead? Yeah, of course. But it is abundantly clear that economic incentives don't allow open source to compete with the big players, and that's just how it is.


This is the daily "Google is bad" post. Best to ignore it and move on.

What is "clear room"? If he means clean room, no, this doesn't qualify.

I wish people would stop using this phrase altogether for LLM-assisted coding. It has a specific legal and cultural meaning, and the giant amount of proprietary IP that has been (illegally?) fed to the model during training completely disqualifies any LLM output from claiming this status.


And if you owe the bank a hundred billion dollars, the entire economy has a big problem.

Very convenient to put "AGI" in all these agreements because the term is fundamentally undefinable. So throw out whatever numbers you want and fight about it and backtrack later.

> fundamentally undefinable

Incredible, how an entire religion has sprung up around AGI.


The definition used to be "passes the Turing test"... until LLMs passed it.

Extremely debatable. Especially because there is no "The Turing Test" [0], only a game and a few instances described by Turing. I recommend reading the original paper before making bold claims about it. The bar for the interrogator has certainly been raised, but considering:

- the prevalence of "How many 'r's are in the word 'strawberry'?"-esque questions that cause(d) LLMs to stumble

- context window issues

It would be naive to claim that there does not exist, or even that it would be difficult to construct/train, an interrogator that could reliably distinguish between an LLM and human chat instance.

[0]: https://archive.computerhistory.org/projects/chess/related_m...


Sure, when the expected monetary value was 0. Then they started claiming that investing $1,000,000,000,000.00 (that's $1T) into a 4 year old startup was a good idea. Change the valuation, change the goal. Then the goal was to be better than a human employee (or at least more efficient, or even just to improve efficiency) because without that the value of the LLM is far lower than what it is being sold as. All the research so far says that LLMs fall far short of that goal. And if this was someone else's money, fine. But this is basically everyone's retirement savings. Again, higher valuation, higher goal. Finally, when you start losing people's retirement savings, criminal penalties start getting attached to things.

I mean… just ask about something "naughty" and they'll fail? At the very least you'd need to use setups without safeguards to pass any Turing test…

The Turing test could also be considered equivalent to "can humans come up with questions that break the AI?" and the answer to that is still yes I'd say.


The problem with AGI is not that it's undefinable, but that everyone has a different one. Kinda like consciousness in that regard.

Fortunately, OpenAI already wrote theirs down. Well, Microsoft[0] says they did, anyway. It was reportedly a secret only a few years ago, and LLMs have since made it much harder to tell the difference between leaks and hallucinated news, but there is at least a claim of a leak[1].

[0] https://blogs.microsoft.com/blog/2026/02/27/microsoft-and-op...

[1] It talks about it, but links to a paywalled site, so I still don't know what it is: https://techcrunch.com/2024/12/26/microsoft-and-openai-have-...


Two economists were walking down the street when they spotted a giant dog turd on the ground.

One of them wanted to have some fun, so said to the other - "I'll give you $100 if you take a big bite of that turd".

His colleague figured $100 was a good chunk of cash, so did the deed. Feeling thoroughly humiliated, he pocketed the $100 and they carried on.

Further down the street they came upon another turd.

The angry economist now wanted revenge so made the same proposal back to his colleague, who also agreed and took a bite of the turd, earning back his $100.

Later one of them said to the other "you know, I can't help but feel we both ate shit for no reason."

His colleague replied "what do you mean? We raised the national GDP by $200."


The number is irrelevant. The fact is that work was done and was repaid with work.

Money was just the means of the transaction.


work good even if work literally eating shit

surely that behavior leads to a good society and doesn't encourage nefarious behaviors


> We raised the national GDP by $200.

Seeing this phenomenon, a Silicon Valley entrepreneur gets an idea with the following sales pitch:

"Turd-bars that will make you the fittest version of yourself , answer all your deepest questions, and take you to the promised land (mars)."

Surprisingly, the turd-bars sell well, and GDP rockets up. Meanwhile VCs with fomo are funding its competitor: the shit-sandwich.


I did upvote, it's witty, but it's a bit of a misrepresentation of how the economy works.

In practice, people don't tend to pay people to eat shit without gain. You are paying people to help you. Money gaslights everyone into helping each other, the most selfish people become the most selfless.

Of course, real capitalism is much more complex and much uglier than this fantasy. When certain people end up with long-term control of large piles of money, the whole thing gets distorted. They get to make lots of money on interest without doing anything, and making other people eat more shit for scraps. That's the "capital" part of capitalism.

But the toy world-model that this joke is making fun of, is actually the one core positive aspect of capitalism and brings all the prosperity we have: tricking people into helping each other.


> the most selfish people become the most selfless

You reminded me of this Stewart Brand quote:

> Computers suppress our animal presence. When you communicate through a computer, you communicate like an angel.


I scratch your back for a $10M IOU.

You scratch my back for a $10M IOU.

The debts cancel out.

How is the economic gain calculated?


Big number gets bigger

> Maintainers: You’re a primary maintainer or core team member of a public repo with 5,000+ GitHub stars or 1M+ monthly NPM downloads. You've made commits, releases, or PR reviews within the last 3 months.

How many total developers does that cover? 100? How many of them aren't already corporate employees?

And also

> 6 months of free Claude Max 20x

So basically a free trial.

When Github Copilot first launched they gave Pro subscriptions to everyone that regularly committed to a public repo, regardless of the number of stars or downloads, and kept renewing it indefinitely. I don't know if that program is still around but it was amazing to get to try out some early LLM coding tools for open source development.


Github search gives me 11,300 results for 5000+ stars[0]. Dunno if they all qualify as open source, but that's also repos, not contributors. Presumably there's an average of > 1 maintainer per repo.

NPM probably adds a lot. I can't find any recent sources, but NPM packages get downloaded a lot (e.g., every Github Action run.) And to get such a download, an NPM package just has to be somewhere in the dependency tree, which are famously enormous. (Though many might not have been updated in the past 3 months.)

[0] https://github.com/search?q=stars%3A%3E5000+sort%3Astars&typ...


A lot more than 100; for one, I'm one of them: https://github.com/mickael-kerjean/filestash

> How many total developers does that cover? 100?

I love these questions bc they both can be answered with some slight heuristics, and they are quite surprising!

As of January 2026, there were > 13k npm packages w/ more than 1 Million monthly downloads [1]

Answering "how many total developers does that cover" is a lot harder (more expensive, rather, as I am not going to pay for the query on Google BigQuery to answer it, not after I spent $3k by accident last time doing similar exploration in the past)

I won't try to make a SWAG about how many devs have write access across those repos, but in the npm ecosystem alone I'm comfortable saying it is an order of magnitude more than 100.

[1] - https://gist.github.com/jonchurch/1dd845f4d26823fce5590af1aa...


Github is Microsoft. MS has a war chest big enough not to care if they throw away money for customer acquisition

Yeah, their thing is more making products worse over time and wasting billions. You will see this in action shortly with Xbox. I think they will do both this time.

It's bizarre how they mention NPM for package downloads, and forget that other ecosystems exist too that aren't exactly small... PyPI, NuGet, Cargo, Maven, RubyGems, etc.

GitHub is cagey about the criteria, but yes this is ongoing. It doesn't appear to be tied to active contributions though. I'm a maintainer on paper of a moderately large open source project that I haven't been involved with in years, and they still renew my free copilot monthly.

I think there's plenty of them. I know at least 3 guys eligible under such requirements (but these guys aren't public persons giving tech talks and so on, they just maintain some niche libs for others to use). If Claude asked for 100k-star repos, then yeah, I guess there would be even fewer than 100.

Shucks, I'm only at 1,000 stars singlehandedly. Curse my woeful irrelevance :D

I guess I will just have to NOT sign on to this nonsense and allow it to atrophy my ability to think of things independently, thus ending up completely dependent on an outside tool of ever-increasing price.

Gosh darn it, of all the luck.


> a public repo with 5,000+ GitHub stars

This is going to get abused so fast, it will make your head spin.

EDIT: I just looked up the highest-ranking "buy GitHub stars" page (which I will obviously not link here), and it looks like you would have to pay a little over $1000 to get the required number of stars. So I suppose it might not get abused as easily as I thought.

On the other hand, someone with the gumption and elbow grease to abuse this process themselves could still easily do so, I'd wager.

All that being said, I still think that GitHub stars are effectively worthless, and attempting to assign value to them like this is, at best, a fool's errand.

I can imagine this will invoke Goodhart's law, increasing the number of people shilling their AI-generated shovelware onto a Web already suffering greatly from the plummeting cost of intelligent-sounding text generation.


There is no such thing as Uint8Array<T>. Uint8Array is a primitive for a bunch of bytes, because that is what data is in a stream.

Adding types on top of that isn't a protocol concern but an application-level one.


A Uint8Array can be backed by buffers other than ArrayBuffer, which is where the types [0] come from.

[0] https://github.com/microsoft/TypeScript/blob/924810c077dd410...


> Adding types on top of that isn't a protocol concern but an application-level one.

I agree with this.

I have had to handle raw byte streams at lower levels for a lot of use-cases (usually optimization, or when developing libs for special purposes).

It is quite helpful to have the choice of how I handle the raw chunks of data that get queued up and out of the network layer to my application.

Maybe this is because I do everything from C++ to Javascript, but I feel like the abstractions of cleanly getting a stream of byte arrays is already so many steps away from actual network packet retrieval, serializing, and parsing that I am a bit baffled folks want to abstract this concern away even more than we already do.

I get it, we all have our focuses (and they're ever growing in Software these days), but maybe it's okay to still see some of the bits and bytes in our systems?


My concern isn't with how you write your network layer. Use buffers in there, of course.

But what if you just want to do a simple decoding transform to get a stream of Unicode code points from a stream of bytes? If your definition of a stream is that it has Uint8 values, that simply isn't possible. And there's still gonna be waaay too many code points to fall back to an async iterator of code points.


I think we're having a completely different conversation now. The parent comment I originally replied has been edited so much that I think the context of what I was referring to is now gone.

Also, I wasn't talking about building network layers, I was explicitly referring to things that use a network layer... That is, an application receiving streams of enumerable network data.

I also agree with what you're saying, we don't want UInt8, we want bits and bytes.

I'm really confused as to why the parent comment was edited so heavily. Oh well, that's social media for you.


Not the person originally replying, but as someone who avoids JS I have to ask whether the abstraction you provide may have additional baggage as far as framing, etc.

Ironically, naively, I'd expect something more like a callback where you would specify how your input gets written to a buffer, but again I'm definitely losing a lot of nuance from not having done JS in a long while.


Study math/statistics/ML at a graduate level, to start.
