Spinning disks. If you want raw file access, your ideal filesystem is a giant ke...

the8472 · on June 14, 2017

ext4 uses hashed B-trees (HTree) for the directory index, so accessing a file by name is just as fast as manually splitting the directory into a prefix-based directory tree.

Searching by filename, on the other hand, can be much faster with a prefix tree, because directories only allow linear scans, not range-based ones.
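
For anyone unfamiliar with the "prefix-based directory tree" idea, here is a rough sketch of how a name could be mapped to a sharded path. The two levels of two-character prefixes and the helper name `sharded_path` are assumptions made up for illustration; they are not part of ext4 or any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps "abcdef.dat" to "<root>/ab/cd/abcdef.dat".
 * Names shorter than 4 characters are placed directly under root.
 * Returns 0 on success, -1 if the result does not fit in out. */
int sharded_path(const char *root, const char *name, char *out, size_t out_size)
{
    assert(root != NULL && name != NULL && out != NULL);
    int n;
    if (strlen(name) < 4)
        n = snprintf(out, out_size, "%s/%s", root, name);
    else
        n = snprintf(out, out_size, "%s/%.2s/%.2s/%s", root, name, name + 2, name);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With this layout, every file whose name starts with "ab" lives under `<root>/ab/`, so a prefix search only has to scan that subtree instead of the whole collection.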

ngrilly · on June 14, 2017

I'm missing something. What difference do you see between "accessing files by name" and "searching by filename"?

the8472 · on June 14, 2017

    int fd = open("/path/to/file", O_RDONLY);  // needs <fcntl.h>
    // do something with file

vs.

    // needs <dirent.h> and <string.h>
    DIR *d = opendir("/path/to");
    struct dirent *dent;
    while ((dent = readdir(d)) != NULL) {
        if (strncmp(dent->d_name, "file", 4) != 0)
            continue;
        // do something with first matched entry
        break;
    }
    closedir(d);

The former is O(log n) on ext4, O(n) on older filesystems that use flat lists.

The latter is always O(n), since readdir offers no way to jump straight to a range of names.
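
For contrast, here is a sketch of what a range-based lookup could look like if the names were available as a sorted in-memory array. That sorted array is an assumption for illustration only (directories do not hand you one), and `lower_bound_prefix` and `count_with_prefix` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first name in a strcmp-sorted array
 * that is >= prefix, found by binary search in O(log n).
 * Returns n if every name sorts before the prefix. */
size_t lower_bound_prefix(const char *const *names, size_t n, const char *prefix)
{
    assert(names != NULL && prefix != NULL);
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(names[mid], prefix) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Counts the names starting with prefix: one binary search, then a walk
 * over only the matching run, so O(log n + matches) instead of O(n). */
size_t count_with_prefix(const char *const *names, size_t n, const char *prefix)
{
    size_t len = strlen(prefix);
    size_t i = lower_bound_prefix(names, n, prefix);
    size_t count = 0;
    while (i < n && strncmp(names[i], prefix, len) == 0) {
        count++;
        i++;
    }
    return count;
}
```

Because all names sharing a prefix sit next to each other in sorted order, the walk stops at the first non-matching name. A prefix-based directory tree gets a similar effect on disk by grouping those names into one subdirectory.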

ngrilly · on June 14, 2017

Understood. Thanks ;-)

cm2187 · on June 14, 2017

OK, but listing 100 directories with 10,000 files each should take the same time as listing one directory with 1,000,000 files (what you are describing is a filesystem with more files vs. fewer files, not one with more subdirectories vs. fewer subdirectories).

hulahoof · on June 14, 2017

The benefit is in accessing single files without having to go through a 1,000,000-entry directory listing.