Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Ask HN: Do you know any computer language where collections are unified?
6 points by scotty79 on Dec 25, 2021 | hide | past | favorite | 18 comments
In most languages creating array, list, set, mapping, usually requires different code to create them and to access them.

Do you know any language where you create generic collection and just specify what you require of it?

For example, you might specify that you require collection to be indexed with numbers, and have quick access to elements by their index. Or have ability to add element at the end quickly or have the ability to find next element quickly. Or have elements with quick parent access, or quick children access, or quick arbitrary inserts or you want collection to be indexed multidimensionally.

Language itself would pick the right algorithms for your collection for example picking fixed length immutable array (for the purposes of iteration), but also set (for quickly checking presence of element).

Creation, access and writing should be kept as similar as possible with different algorithms.

This system should of course be extensible. So that you could implement fast access to elements by creating and updating various task specific indexes into your data and have them available though built in language syntax.

The only thing that comes to my mind that somewhat attempted this is C++ template library, which offers various collections and iterators while trying to keep the interface mostly algorithm agnostic.



Clojure sequences (https://clojure.org/reference/sequences) provide some of what you are looking for:

"Clojure uses the ISeq interface to allow many data structures to provide access to their elements as sequences.

The seq function yields an implementation of ISeq appropriate to the collection.

Seqs differ from iterators in that they are persistent and immutable, not stateful cursors into a collection.

As such, they are useful for much more than foreach - functions can consume and produce seqs, they are thread safe, they can share structure etc."

Seqs do not handle the "here is my requirement, pick a collection for me" feature you want, but rather "here is a collection, run this algorithm"


Right, seq looks like a single way of accessing various collections when you need to access them in a specific manner.

If I want to access the collection iterating through it in read-only manner I'd just use seq. But I'm not guaranteed that ISeq I got will be fast. It will be as fast as possible for my collection, but what I want is to create a sequence where ISeq is fast, and also for example which I can access and modify by index and also where I can add values in front fast, and also possibly some other requirements.

Basically what I want is the ability to specify which operations I need to be fast at creation time, and get appropriate collection for that. And syntax of those operations should not change if requirements change.

Simpler way might be defining which data structures I want my collections to use, and get fast operations enabled by those data structures through common syntax.

For example:

    var mp = string[] is array, set, stack
    mp[5] = 'hello'
    mp.has('hello')
    mp.pop()
are all fast.


In Rust in theory you could express each capability a collection can have as a trait, impl the relevant traits for each container type, then write generic code that specifies it requires e.g. `NumberIndexable + FastAppend`. But AFAIK there is no way to ask the compiler to find the best available container type satisfying those traits; it would enforce anything you pass into the function satisfies them but wouldn't deduce it for you.

You could definitely do it with C++ template metaprogramming and type traits.


PHP lets you treat arrays and hashes exactly the same (or at least it used to) but I don't know if it switches the underlying data structures/algos based on usage.


I always considered it to be a very good decission in the design of PHP language.


This was a terrible decision, it causes so many functions to have different arguments to handle both data structures. It also slower performance wise


Since it's a scripting language made first and formost for accesibility not for speed it was a great decission. Because you could create nearly all the data structures you needed using just array() and many of them superbly easily.


There's a few different aspects to your question.

> For example, you might specify that you require collection to be indexed with numbers...

You can't hide that, because in at least some of the uses of that collection, you're going to index it by numbers.

Other than that kind of detail, though, I think you can create collections that you can use without knowing the details in Haskell. To a lesser degree, you can in C++.

The last detail is creation. You might be able to create a "collection factory" to create it for you. But I'm not sure that it's any different to say "CollectionFactory.createCollection(FAST_FIND)" than it is to say "new HashMap()". I mean, the former means that you don't have to know that a HashMap is the collection you're looking for, so that's something. It's not all that much, though.


But I want to be able to say:

CollectionFactory.createCollectionFast(FIND<string>, FIND<int>, STACK_ACCESS, KEEP_INSERTION_ORDER, REVERSIBLE)

This might require resulting collection to use multiple data structures in parallel to provide me with all the fast capabilities I listed.


Surprisingly, one "language" actually fits pretty well (most of) your requirements: SQL :)


What I have in mind is a general purpose language that you could build relational database in, that would use the same language syntax for queries that this language uses for simpler collections like array or set.

When you try to use SQL as a general purpose language syntax is not that great and not that unified.


Core Foundation (Apple’s C library) sort of uses this with its “array” type: https://ridiculousfish.com/blog/posts/array.html


Lua and PHP as the biggest ones. They both unified tables and arrays.

But you can do that with every library by yourself.


I'm probably wrong, but I feel like the Scala standard library is pretty close to what you are describing.


This is really super close.

Interface is really consistent, for example List and HashSet both have diff operation.

You could probably write a function that will supply you with the collection best fitting your requirements.

What I'm missing is combining multiple collections into one variable for sets of requirements that no single collection satisfies.

Also I'm not sure if Scala syntax could support other ways of accessing than simple indexing, like for example multi-dimensional indexing and slicing or even slicing by some inequality conditions like SQL does (with use of indexes into data). Basically if you could write LINQ and J in Scala syntax.

Even if not, the Scala collections look amazing and thank you for pointing me in the right direction!


After further inspection of Scala 3 capabilities, especially code quoting and splicing it seems that Scala 3 has all the building blocks necessary, base collections with consistent interface and syntax expressive enough to take it where I want it to be (Linq and J).


Lua metatables might fit the bill.


Lua metatables are quite interesting and you could probably implement a lot of the functionality I described in hacky/magicmethody manner.

I'm now looking at Scala collections library and it seems that large part of what I'm looking for was actually overt design goal of people who made it. They also thought about making customized collections easily with design that enables very aggressive code and interface reuse.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: