Freezing temperatures are a concern for everyone in Northern Europe, particularly Scandinavia, which has very high per-capita heat pump adoption. The units might be harder to find in the US, but they definitely exist.
If you can afford it, and have the land access, you could install a ground-source heat pump, which should benefit from more stable ground temperatures. As with all heating/cooling, these systems work best if your house is well insulated. That's a much bigger problem in the UK, and I imagine in the US too, especially in places where solar gain requires a huge amount of A/C usage.
Northern Europe tends to have a mild climate in the places where people actually live. The northern US is significantly farther south, yet gets significantly colder winters. There are places in Europe that get worse weather than the northern US, but few people live there, so they're not normally what you mean when talking about Europe.
Though good heat pumps are hard to find in the northern US. Most installers only know of gas furnace + A/C, and don't even try to install anything else. As you get farther south in the US heat pumps become common, but there it rarely gets much below freezing and so they don't need backup heating systems at all.
That's odd, because Ontario and Quebec are colder than most of the US, and just those two provinces have accounted for about one million heat pump installs per year since 2020.
Heat pumps can work great in the northern US. However, the experience of Europe mostly doesn't apply: you need backup heat of some sort to use a heat pump there.
Cold-climate heat pumps work just fine even at -30 °C. In fact, many malls and offices in Northern Ontario are switching to heat pumps only, because electricity is so cheap. What heat pumps really need in cold climates is reasonable insulation.
> Also even if it is open source, who really verifies the binary is built from the source published?
Apple notarization is usually the way for non-App Store downloads. Non-notarized apps present a warning and require overriding security settings (with admin privileges) to run. There's nothing inherently stopping someone from notarizing code A and putting code B on GitHub; notarization only means some sanity checks have been performed and the binary isn't a known threat (or hasn't been modified since).
> There's nothing inherently stopping someone from notarizing code A and putting code B on GitHub
Sorry what if the open source project made their CI/CD pipeline public? So users could exercise it, produce their own build, and then compare that to the notarized one? Would I then be able to verify that what I downloaded from the developer’s website is identical to what is built with the open source code? Just curious.
Yeah, there is API support for notarization, so in principle you could have an audit trail showing that some automated build process got a specific notary result that's "stapled" to the app. I'm not familiar enough to say how trustworthy that approach is, or what exactly you'd need to prove it. And yes, aim for a reproducible build that produces artifacts with checksums that can be matched against the distributed one.
The mitigation is that if someone finds out a (notarized) download is compromised, they can tell Apple, who can quickly and retroactively revoke the signature; the revocation is distributed via Gatekeeper. Other users would then get the warning even if they had previously run the app without issue.
In theory, yes, you could compare them. In practice, the build would need to be reproducible, which is non-trivial depending on the size of the project and the external dependencies it has.
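If you do get a reproducible build, the comparison itself is trivial. A minimal sketch, assuming you've run the public CI/CD steps locally and downloaded the developer's binary (the paths and app name here are hypothetical):

    import hashlib
    from pathlib import Path

    def sha256(path: Path) -> str:
        # Stream the file through SHA-256 so large binaries don't need to fit in memory.
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    local = sha256(Path("build/MyApp"))       # output of your own run of the public pipeline
    downloaded = sha256(Path("dist/MyApp"))   # binary fetched from the developer's website
    print("match" if local == downloaded else "mismatch")

Note that for a signed .app bundle the code signature and stapled notarization ticket will differ between builds, so in practice you'd compare the unsigned build products rather than the final signed bundle.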
One really important factor is the grading curve, if used. At my university, I think the goal was to give the average student 60% (a mid 2:1), with some formula for test score adjustment to compensate for particularly tough papers. The idea is that your score ends up representing your ability relative to the cohort and the specific tests you were given.
There were several courses that were considered easy and, as a consequence, were well attended. You had to do significantly better in those classes to get a high grade, versus a low-attendance hard course where 50% on the test was curved up to 75%.
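As a toy illustration of the kind of formula involved (not any university's actual one), a linear curve just re-centres and re-scales raw marks around the target mean:

    def curve(raw_scores, target_mean=60.0, target_sd=12.0):
        # Hypothetical linear adjustment: shift/scale so the cohort mean lands on
        # target_mean, then clamp to 0-100. Real schemes are usually more careful.
        n = len(raw_scores)
        mean = sum(raw_scores) / n
        sd = (sum((x - mean) ** 2 for x in raw_scores) / n) ** 0.5 or 1.0
        return [round(min(100.0, max(0.0, target_mean + (x - mean) / sd * target_sd)), 1)
                for x in raw_scores]

    # A tough paper where the raw average was ~51 gets pulled up to an average of 60.
    print(curve([35, 45, 50, 55, 70]))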
> One really important factor is the grading curve, if used.
I never use it to grade, because it is empirically unfair.
The further you move through the educational system, the less people's aptitude matches a Gaussian or "normal" distribution.
(I also often fought with management and HR when I was a manager in industry, as my team was hardly statistically normal (100% Ph.D.s from top places); imposing a Gaussian for bonus payments on a strongly left-skewed distribution is unfair. Microsoft introduced this and got into legal trouble, and many companies followed late and didn't realize the legal-trouble part.)
It makes sense when applied across multiple instances of a test: if one cohort does terribly, curve it up; if one does really well, curve it down, relative to the overall distribution of scores.
But yeah within a single assignment it makes no sense to force a specific distribution. (People do this maybe because they don’t understand?)
Depends on the rigor. The typical grade-school curriculum expects you to keep up and absorb 80-90% of the content on the first go. Colleges can experiment with a variety of other methods. It's college, so there's no sense of "standardized" content at this point.
For some, there's the idea of pushing a student to their limit and stretching their boundaries. A student getting 50% in a hard course may learn more and overall perform better in their career than if they were an A student in an easy course. Should they be punished because they didn't game the system and take the easy one?
And of course, someone getting 80% in such a course is probably truly the cream of the crop which would go unnoticed in an easy course.
I think the prior, in the Bayesian sense, is that the two entering cohorts are equally skilled (assuming students were randomly split into two sections, as opposed to different sections being composed of different student bodies). If that's the case, the implication is that performance differences on standardized tests between cohorts are due to the professor (maybe one of the profs didn't cover the right material), so normalization could be justified.
However, if that prior is untrue for any reason whatsoever, the normalization would penalize higher-performing cohorts (if it were a math course, maybe an engineering-student-dominated section vs an arts-dominated one).
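A quick hypothetical example of that penalty, with made-up scores: if both sections are re-centred to the same mean, the genuinely stronger one simply loses points:

    # Made-up scores for two sections sitting the same standardized test.
    section_a = [80, 75, 70, 85, 90]   # stronger cohort, mean 80
    section_b = [60, 55, 65, 70, 50]   # weaker cohort, mean 60

    def recenter(scores, target=70):
        # Normalize by shifting every score so the section mean equals the target.
        shift = target - sum(scores) / len(scores)
        return [s + shift for s in scores]

    print(recenter(section_a))  # every student in A loses 10 points
    print(recenter(section_b))  # every student in B gains 10 points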
Right, and if it depends, maybe we just don't do it then?
Intuitively and in my experience, course content and exams are generally stable over many years, with only minor modifications as they evolve. Even different professors can sometimes have nearly identical exams for a given course, precisely to allow for better comparison.
Did the cohort do poorly, or were the tests given to that cohort harder than in previous years? Or was the teacher a harsher grader than others? You're jumping to the conclusion that the cohort was underperforming just because the grades were lower, when other things outside their control could have been involved.
The university I went to had student-run test banks of previous exams that the administration sanctioned. If the following year you get the same question as the previous year, then you're going to do better than the year that got the first version of that question.
You’re also ignoring the human element of grading particularly in subjective parts of an exam.
The idea is to identify whether there was a particularly easy/hard exam, and whether the average score of the cohort is significantly different from how they perform in other classes. "Doing poorly" is quite hard to define when none of the tests, perhaps outside of the core 1st- and 2nd-year modules, are standardized.
Not really, since then all students can learn the exam as a template after 2-3 exams leak.
The curving I knew at uni was targeted at exmatriculating 45% of students by the 3rd semester and another 40% of those remaining by the end, so grades were adjusted such that X% would fail each exam. Then your target wasn't understanding the material but being better than half of the students taking it. The problems were complicated and time was severely limited, so it wasn't like you could really get a perfect score. Literally 1-2 people would get a perfect score on an exam taken by 1000 people, with many exams having no perfect score at all.
I was one of the exmatriculated, and moving to more standard tests made things much easier, since you can learn templates with no real understanding. For example, an exam with 5 tasks would have a pool of 10 possible tasks, each with 3-4 variations, and after a while the possibilities for variation would become clear, so you could make a good guess at what this semester's slight difference would likely be.
I know that's the argument, but it just leads to grade inflation and a diluted signal of the student's ability.
Any specific uncurved grade is already ultimately adjusted by being put in a basket of other grades that the student obtained across many courses, which are generally uncorrelated (or at least just as uncorrelated before curving as after).
The act of grading itself is what's wrong with colleges. Different people learn at different paces. Forcing everyone to work at the fastest rate and then judging them for not keeping up is what kills interest in subjects. People should be allowed to write tests when they want to, learn at the pace they want, and decide for themselves when it's time to move on, because let's face it, not everyone cares about some prof's pet subject.
The problem is that higher education became something marketable and universities decided to sell diplomas instead of giving people a chance to learn skills they think might help them reach their goals.
Depends, is your goal in college to get a high GPA and look good for a job, or to truly learn and master content but not look as attractive on a resume without other projects?
Grading curves aim to mitigate punishment for the latter. It's part of why I could get a 2.5 GPA but still overall succeed in my career.
The foundational purpose of universities is truth-seeking, not job training. Universities like Bologna, al-Qarawiyyin, Oxford and Cambridge are the better part of a millennium old, or more.
The ultimate goal is knowledge cultivation.
You're better suited to intellectual work only if you actually cultivate knowledge.
If all this college circus amounts to is farming grades, then universities are ultimately failing at their job.
I don't disagree with you at all. But we both probably know that that hasn't been the reality for the last half a century or so. I'd love to properly separate academia and build strong apprenticeship/trades programs for several sectors to properly train a workforce, but there's basically zero momentum for that in white-collar work.
Also note that GPA isn't just for jobs. Applications to post-bachelor's programs care the most about GPA. So a bad grade despite learning a lot in a rigorous course can still make it hard to progress as a researcher or any other kind of specialized knowledge seeker.
Further reason to remove grades in university, if anything.
If job prospects are the focus then we should invest in proper trade schools detached from universities that focus on teaching marketable skills.
This is a thing in countries like Germany. My uncle works in maintenance of nuclear reactors there and he went through a trade school that focused on learning the relevant parts of the job.
It's one solution to a problem.
Which is that the results of tests don't strictly measure how well the students understood the subject matter, but are heavily influenced by the quality of the test and the course as a whole.
That is generally hard to measure, and frankly there is little accountability for bad courses. At the worst end you have bad profs who are proud of high failure rates, because they don't see them as a failure to teach but as a seal of quality for how rigorous their standards are and how complex the subject matter they teach is.
Not that grading on a curve solves any of that, but it eases the burden on students.
That won't work at elite schools like Stanford, where a hard class's average is something like 98% and a 94% will get you a B+ because the curve is applied in the opposite direction.
So another strategy to do well might include tempting your classmates to distraction or perhaps offering to "help" them but in fact feed them misinformation?
Got it.
You are typically the average of the people you keep around you. If you feel like you're going to get ahead by tricking your friends/peers then it likely means that you're not going to gain much when compared to the rest of the class (unless you're somehow able to deceive an entire class of 100+ students). On the flip side, if you and all your friends are supportive of each other then you're more likely to succeed when compared to the rest of the class. This does have the opposite effect of making it harder for students that don't have the same support/study groups but it goes completely against the point you're trying to make.
My suspicion is that it's because the feedback loop is so fast. Imagine being tasked with supervising 2 co-workers who gave you 50-100 line diffs to review every minute. The uncanny valley is that the code is rarely good enough to accept blindly, but the response is quick enough that it feels like progress. And perhaps there's a human impulse to respond to the agent? And a 10-person team? In reality those 10 people would review each other's PRs, and in a good organisation you trust each other to gatekeep what gets checked in. The answer sounds like managing agents, but none of the models are good enough to reliably say what's slop and what's not.
There is a real return on investment in co-workers over time, as they get better (most of the time).
Now, I don't mind engaging in a bit of Sisyphean endeavor using an LLM, but remember that the gods were kind enough to give him just one boulder, not 10 juggling balls.
It's less about a direct comparison to people and more what a similar scenario would be in a normal development team (and why we don't put one person solely in charge of review).
This is an advantage of async systems like Jules/Copilot, where you can send off a request and get on with something else. I also wonder whether the response time of CLI agents is short enough that you end up wasting time staring at the loading bar, because context switching between replies is even more expensive.
Yes. The first time I heard/read someone describe this idea of managing parallel agents, my very first thought was that this is only even a thing because the LLM coding tools are still slow enough that you can't really achieve a good flow state with the current state of the art. On the flip side of that, this kind of workflow is only sustainable if the agents stay somewhat slow. Otherwise, if the agents are blocking on your attention, it seems like it would feel very hectic and I could see myself getting burned out pretty quickly from having to spend my whole work time doing a round-robin on iterating each agent forward.
I say that having not tried this workflow at all, so what do I know? I mostly only use Claude Code to bounce questions off of and to ask it to review my work, because I still haven't had much luck getting it to actually write code that is complete and how I like it.
Built-in no, but the Pi Pico W is decent and inexpensive if the form factor isn't an issue. The RP2040/RP2350 are nice chips to work with and documentation is good. I can live with an external module, and it's certified too.
Have you tried it? It's simply not in the same league of battle tested as the ESP one is, and I will happily agree almost everything else about the RP based ecosystem is superior.
Yes, I've used them for ESPHome and other small jobs like lighting controllers, but not for production. They're cheaper than most Arduino or hobbyist breakout boards like the Feather. I can't comment on battle-tested, but I've also bought some pretty shoddy ESP breakouts in the past, and I've had trouble with unstable WiFi performance when I've meshed them. The PIOs are cool, and better documented than BeagleBone/TI (maybe that's improved). The toolchain is also decent.
I would probably go Atmega otherwise. It's rare I need something in the gap between 8-bit and a dedicated Raspberry Pi. And I'll take some rough edges to support a local company (though for transparency I do hold some stock in RPI).
Seagate Expansion drives are in this price range and can be shucked. They're not enterprise drives meant for constant operation (the big ones are Barracudas or maybe Exos), but for a homelab NAS they're very popular.
I've had such a NAS for 8 years (and a smaller Netgear one from maybe 16 years ago), and have yet to have such a disk fail. But you can get unlucky: buying a supposedly new but "refurbished" item via Amazon or the Seagate store (so I hear), or getting the equivalent of the "death star" HDDs, which had a ridiculously high burnout rate (we measured something like >10% of the drives failing every week across a fairly large deployment in the field; a major bummer).
If you use such consumer drives, I strongly suggest making occasional offsite backups of large, mostly static files (movies, for most people I guess), and frequent backups of more volatile directories to an offsite location, maybe encrypted in the cloud.
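A minimal sketch of the "frequent backups of volatile directories" part, with hypothetical paths; a real setup would more likely use rsync/restic/borg and encrypt before shipping offsite:

    import shutil
    import time
    from pathlib import Path

    SRC = Path.home() / "documents"     # the volatile data (hypothetical)
    DST = Path("/mnt/nas/backups")      # or a mounted cloud remote (hypothetical)
    STAMP = DST / ".last_backup"        # mtime of this file marks the previous run

    DST.mkdir(parents=True, exist_ok=True)
    last = STAMP.stat().st_mtime if STAMP.exists() else 0.0
    target = DST / time.strftime("%Y-%m-%d_%H%M")

    # Copy everything modified since the previous run into a timestamped folder.
    for f in SRC.rglob("*"):
        if f.is_file() and f.stat().st_mtime > last:
            out = target / f.relative_to(SRC)
            out.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, out)

    STAMP.touch()                       # new baseline for the next run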
Of course you would stagger the offline backups. But if we are talking about storing e.g. movies, the worst-case scenario is really not so bad (unless you have the last extant copies of some early Dr Who episodes, in which case the BBC would want a word with you).
I've visited New Zealand a few times en route to Antarctica. The only time I've ever needed to take out cash was for the Christchurch bus service. I was in MIQ on the way in, but they gave us free rein on the way out because Antarctica was considered virus-free (and according to Immigration NZ, it counts as the Ross Dependency). There was obviously a lot of push for contactless payments in 2021. I get the impression that the pandemic helped really cement it, although it sounds like the UK, where we've had widespread contactless for almost 20 years.
The pandemic helped the banks push contactless which they love, because it’s not EFTPOS.
EFTPOS is our national point-of-sale system; it has very low or no fees for any party involved. Merchants pay a fixed machine rental per month, which can include unlimited transactions, or may have a per-transaction fee of up to $0.20. Most individuals do not pay a fee for using EFTPOS and there's normally no card fee, though some banks have accounts with fees that carry other benefits (e.g. higher deposit interest rates to encourage saving).
Contactless goes via the standard card network extortion. Since 2022 the interchange rate has been capped by legislation which has helped merchants a lot, but the per transaction fee over the card network is still far higher than EFTPOS.
Contactless EFTPOS does exist in Australia - we share a lot of the underlying tech - but the banks won’t activate it here because they’d lose the interchange fees.
Online EFTPOS is starting to gain market share though, which is nice.
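Back-of-the-envelope on a $100 sale, using the $0.20 EFTPOS figure above and an assumed, purely illustrative 1.5% all-in card-scheme merchant fee (not the actual NZ interchange cap):

    sale = 100.00
    eftpos_fee = 0.20     # flat per-transaction fee quoted above (often $0 on rental plans)
    card_rate = 0.015     # assumed illustrative all-in merchant fee, not the real cap
    print(f"EFTPOS:      ${eftpos_fee:.2f}")
    print(f"Card scheme: ${sale * card_rate:.2f}")  # $1.50 on a $100 sale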
"don't have to screw in every drive" is relative, but at least tool-less drive carriers are a thing now.
A lot of older toploaders from vendors like Dell are not tool-free. If you bought vendor drives and one fails, you RMA it and move on. However if you want to replace failed drives in the field, or want to go it alone from the start with refurbished drives... you'll be doing a lot of screwing. They're quite fragile and the plastic snaps easily. It's pretty tedious work.
I think pre-commit is essential. I enforce conventional commits (plus a hook which limits the commit subject to 50 chars; sketch below) and, for Python, ruff with many options enabled. Perhaps the most important one is enforcing complexity limits. That will catch a lot of basic mistakes. Any sanity checks that you can make deterministic are a good idea. You could even add unit tests to pre-commit, but I think it's fine to have the model run pytest separately.
The models tend to be very good about syntax, but this sort of linting will often catch dead code like unused variables or arguments.
You do need to rule-prompt that the agent may have to run pre-commit multiple times to verify the changes worked, or to re-add files to the commit. Also, frustratingly, you need to be explicit that pre-commit might fail and that it should fix the errors (otherwise it'll sometimes just run it and say "I ran pre-commit!"). For commits there are some other guardrails, like blanket-denying git add <wildcard>.
Claude will sometimes complain via its internal monologue when it fails a ton of linter checks and is forced to write complete docstrings for everything. Sometimes you need to nudge it to not give up, and then it will act excited when the number of errors goes down.
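A commit-msg hook along those lines is short; this is a hypothetical standalone version (pre-commit can also run it as a local hook), assuming the usual conventional-commit types and a 50-char subject limit:

    #!/usr/bin/env python3
    # Hypothetical commit-msg hook: require a conventional-commit prefix and a
    # subject line of at most 50 characters. Git passes the message file as argv[1].
    import re
    import sys

    TYPES = r"(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    PATTERN = re.compile(rf"^{TYPES}(\([\w./-]+\))?!?: .+")

    subject = open(sys.argv[1], encoding="utf-8").readline().rstrip("\n")

    if not PATTERN.match(subject):
        sys.exit(f"commit-msg: subject doesn't follow conventional commits: {subject!r}")
    if len(subject) > 50:
        sys.exit(f"commit-msg: subject is {len(subject)} chars (max 50)")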
Very solid advice. I need to experiment more with the pre-commit stuff, I am a bit tired of reminding the model to actually run tests / checks. They seem to be as lazy about testing as your average junior dev ;)