Things you shouldn't do: Two pointers in one field. (wikipedia.org)
42 points by RiderOfGiraffes on May 16, 2010 | 20 comments


Someday you might have to program in an actual resource-constrained environment, and you'd understand why sometimes you have to do stupid things to meet your constraints.

Use half the memory -> buy a smaller memory chip per unit -> spend less per unit -> profit more per unit.

But maybe you should stick with the Ruby. I hear there's money in that.


> Use half the memory -> buy a smaller memory chip per unit -> spend less per unit -> profit more per unit.

Or: use twice the memory -> buy a bigger memory chip per unit, adding one dollar to the cost -> charge $1 extra for your $300 unit -> spend less money defusing "your product sucks because it crashes all the time" support calls -> profit more per unit.

But maybe you should stick to hand-rolled assembly. I hear there's geek cred in that.

(But actually, there is a middle ground -- something like Atom: http://hackage.haskell.org/package/atom. You get type safety guarantees at compile time, and you get a minimal runtime, lowering hardware requirements.)


You seem to think most computers are complicated, with huge amounts of memory needing dedicated and scalable components.

This is not the case. Most computers are tiny; the kind of "you've only got 1 kB of built-in and non-expandable memory" tiny (and that's if you're lucky). And most of those computers are indeed programmed correctly, despite using a number of classical tricks like XOR linked lists (the programs are small, so really checking them, even when they contain tricks, is doable).


> buy a bigger memory chip per unit, adding one dollar to the cost -> charge $1 extra for your $300 unit

Devices with little enough memory for techniques like this to be useful rarely cost $300 per unit.


There's no reason code using this technique needs to be buggy. You've got a list of things to test for right there.


One of the trickiest pieces of 'resource constrained' code I ever wrote was an index across 100,000 hotels in 64K of RAM, covering a whole bunch of criteria, hundreds of them: from swimming pool presence to dog walking service (and pets permitted or not) and so on.

Obviously, you can't really do that directly, so I first pre-sorted the hotels into large chunks by some common criteria, then sampled random bits from the rest of the criteria to fill the in-memory portion of the index. When a query came in, a mask was made for the various criteria in the same way as during indexing, and that mask was then used to get the indices of hotels that might be right, candidates so to speak. The candidates were then loaded from disk and checked against all the criteria, and an answer was returned to the caller. It worked pretty well; the disk block caching mechanism really saved the day.
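
Schematically it was something like this; a toy reconstruction in C, with all the names and sizes made up:

    #include <stdint.h>

    /* Toy reconstruction: each hotel gets a small fixed-width signature built from
       a handful of sampled criteria bits; the full criteria live on disk. */
    #define SIG_BITS 16

    struct entry {
        uint32_t hotel_id;
        uint16_t sig;        /* sampled bits: pool, pets allowed, dog walking, ... */
    };

    /* Build a query mask exactly the way the index was built. */
    static uint16_t make_sig(const uint8_t criteria[], const int sampled[SIG_BITS]) {
        uint16_t m = 0;
        for (int i = 0; i < SIG_BITS; i++)
            if (criteria[sampled[i]]) m |= (uint16_t)1 << i;
        return m;
    }

    /* A hotel is only a *candidate* if its signature covers every requested bit;
       candidates still get loaded from disk and checked against all the criteria. */
    static int is_candidate(const struct entry *e, uint16_t want) {
        return (e->sig & want) == want;
    }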

This was on a QNX system, before they had a 32-bit / large-model version...

That's when programming is fun!

Of course 64K is still not really 'constrained' ;)


Sounds like you invented something like a Bloom filter, kudos!

http://en.wikipedia.org/wiki/Bloom_filter


Hey cool, thanks for digging that up, it certainly looks similar at first glance.

I can't tell you how many times I've had the feeling that I was finally doing something original, only to find out (usually within five minutes, but this time at least a lot of years later) that it was already old hat when I was a toddler.

Oh well. Keep trying I guess.

One problem with stuff like this is simply finding out what's already out there, given only a problem description.


The name of it was on the tip of my tongue as I read your description, so I googled for "probabilistic constant-space test for membership", and voila, first result.


I bow to your google fu, that would have taken me a week.

At least.

If I had found it at all. This is one of those areas where I can really feel that having English as a first language - or at least a degree in computer science - would have helped tremendously; just knowing the right terminology can be such a huge advantage.

I can parse your description of it, but I could never have come up with it in those terms. I'd have been looking for "compressed index" or "candidate records algorithm" or something like that.


I thought "Bloom filter" too as soon as I read his description. They're used all over information retrieval. It's very common to need to hit the disk to say for sure whether an element satisfies your query, but the vast majority of elements don't. If you can eliminate those seeks entirely with a Bloom filter, your algorithm as a whole will run much faster.


Last time I actually needed to do this, I allocated memory in larger chunks instead, and made a linked list of those chunks.

The number of situations where this particular solution is the best solution is really small!
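
Roughly like this, if I sketch it in C (chunk size picked arbitrarily):

    #include <stddef.h>
    #include <stdlib.h>

    #define CHUNK_CAP 64   /* elements per chunk, picked arbitrarily */

    /* One allocation holds many elements, and the chunks are chained,
       so you pay one next-pointer (and one malloc) per 64 items instead of per item. */
    struct chunk {
        size_t        count;              /* slots in use */
        int           items[CHUNK_CAP];
        struct chunk *next;
    };

    /* Append, allocating a new chunk only when the current tail is full. */
    static struct chunk *push(struct chunk *tail, int value) {
        if (tail == NULL || tail->count == CHUNK_CAP) {
            struct chunk *c = calloc(1, sizeof *c);
            if (c == NULL) abort();
            if (tail != NULL) tail->next = c;
            tail = c;
        }
        tail->items[tail->count++] = value;
        return tail;                      /* the caller keeps the head pointer */
    }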


The name for this is an "Unrolled linked list," in the same semantic fashion as an unrolled loop. Congratulations!


> But maybe you should stick with the Ruby. I hear there's money in that.

Come on, Mike. That sort of snipe just isn't like you.


This also reminds me of a story in the StackOverflow podcast (https://stackoverflow.fogbugz.com/default.asp?W24222), where programs executed as expressive bytecode ended up outperforming machine language programs on a low-memory machine, since the more verbose machine code executable was so much larger that parts of it had to be swapped from disk.

Not really at the same level of bit-tweaking as the XOR lists, but a nice bit of counterintuitiveness nonetheless.


Another reason not to do it is that in the modern era, computed pointers tend to stall pipelines.


If you're programming for an embedded system with severe memory constraints and a simple pipeline, this could be worth doing.


Haha, XOR'd pointers to compress a doubly-linked list down to a single pointer-sized field for both the previous and next references.

Brilliant and deadly. Gotta love it.
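
For reference, the whole trick fits in a few lines of C; a minimal sketch (not production code):

    #include <stdint.h>
    #include <stdio.h>

    /* One link field holds prev XOR next, reinterpreted as integers. */
    struct node {
        int       value;
        uintptr_t link;    /* (uintptr_t)prev ^ (uintptr_t)next */
    };

    static uintptr_t xor_ptr(struct node *a, struct node *b) {
        return (uintptr_t)a ^ (uintptr_t)b;
    }

    /* Walk the list from one end; swapping the arguments walks the other way. */
    static void traverse(struct node *prev, struct node *cur) {
        while (cur != NULL) {
            printf("%d\n", cur->value);
            struct node *next = (struct node *)(cur->link ^ (uintptr_t)prev);
            prev = cur;
            cur  = next;
        }
    }

    int main(void) {
        struct node a = {1, 0}, b = {2, 0}, c = {3, 0};
        a.link = xor_ptr(NULL, &b);   /* the ends treat NULL as the missing neighbour */
        b.link = xor_ptr(&a, &c);
        c.link = xor_ptr(&b, NULL);
        traverse(NULL, &a);           /* 1 2 3 */
        traverse(NULL, &c);           /* 3 2 1 */
        return 0;
    }

The "deadly" part: a plain pointer into the middle of the list is useless without its neighbour, and garbage collectors and debuggers can't see the links at all.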


You can build a circular linked list with this, provided you make "head" a pair of pointers and adjust your thinking appropriately.

The advantages are real, and not just space efficiency. You can traverse forwards and backwards with identical code, potentially saving code space as well as data space. Some other operations are more complicated, but you might not need those.

Don't forget, many embedded processors have moderate ROM and incredibly restricted RAM. Saving RAM at the expense of ROM can be a sensible trade-off.
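
A minimal sketch of that circular variant, reusing the single-link node layout from the XOR-list sketch above (names are mine):

    #include <stdint.h>
    #include <stdio.h>

    struct node { int value; uintptr_t link; };   /* link = (uintptr_t)prev ^ (uintptr_t)next */

    /* Circular variant: no NULL ends, so "head" must be a pair of adjacent nodes. */
    struct head { struct node *a, *b; };          /* b follows a in the "forward" direction */

    /* One lap around the ring, starting at `after`. */
    static void lap(struct node *start, struct node *after) {
        struct node *prev = start, *cur = after;
        do {
            printf("%d\n", cur->value);
            struct node *next = (struct node *)(cur->link ^ (uintptr_t)prev);
            prev = cur;
            cur  = next;
        } while (cur != after);
    }

Forward and backward really are the same function: lap(h.a, h.b) goes one way around, lap(h.b, h.a) goes the other.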


Let's just hope somebody doesn't try to get sneaky and make a circular XOR linked list. It'd be hard to start a traversal from head = )



