
I wrote a mostly-clone of Ack in C: https://github.com/ggreer/the_silver_searcher . Output format and most flags are the same. Besides the speed, most users won't notice a difference.

I spared no effort in optimizing. Pthreads, mmap(), Boyer-Moore-Horspool strstr, it's all there. Searching my ~/code (5.2GB of stuff), I get this:

    ag blahblahblah  1.93s user 3.54s system 313% cpu 1.749 total

    ack blahblahblah  9.75s user 2.79s system 98% cpu 12.690 total

Both programs ignore a lot of extraneous files by default (hidden files, binary files, stuff in .gitignore, etc). The real amount of data searched is closer to 500MB.
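
For anyone curious what a Boyer-Moore-Horspool strstr looks like, here is a minimal sketch. It is not ag's actual code: the function name bmh_search and its signature are made up for illustration, and it compares each window with memcmp() rather than scanning from the last byte backwards, which keeps it short but is not how a tuned version would do it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from ag: find the first occurrence of
 * needle (n_len bytes) in haystack (h_len bytes) using the
 * Boyer-Moore-Horspool bad-character shift table.
 * Returns a pointer to the match, or NULL if there is none. */
const char *bmh_search(const char *haystack, size_t h_len,
                       const char *needle, size_t n_len)
{
    size_t skip[UCHAR_MAX + 1];
    size_t i, pos;

    assert(haystack != NULL && needle != NULL);
    if (n_len == 0)
        return haystack;            /* empty needle matches at the start */
    if (n_len > h_len)
        return NULL;

    /* A byte that does not occur in the needle lets us jump n_len bytes. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = n_len;
    /* Bytes in the needle (except its last byte) shift by their distance
     * from the end of the needle. */
    for (i = 0; i + 1 < n_len; i++)
        skip[(unsigned char)needle[i]] = n_len - 1 - i;

    pos = 0;
    while (pos <= h_len - n_len) {
        if (memcmp(haystack + pos, needle, n_len) == 0)
            return haystack + pos;
        /* Shift by the table entry for the haystack byte that is
         * currently aligned with the needle's last byte. */
        pos += skip[(unsigned char)haystack[pos + n_len - 1]];
    }
    return NULL;
}
```

The win over a naive scan comes from the shift table: for long needles most windows are skipped after looking at a single byte.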


Looks good, but from the doc I can't tell if it supports the second most useful feature of ack, that is scoped search:

    ack --ruby --js foo_bar

will search only Ruby and JavaScript files, which means .rb+.erb+.rhtml+.js+...

Also exclusion with --no-* is very useful (especially --no-sql).

This is markedly different from 'simply' ignoring irrelevant files, besides the fact that it does not need a 'project' to work (ack --ruby foo_func $(bundle show bar_gem)).

Even better, it is extensible: I can create --stylesheets covering css+sass+scss+less, or add, say, .builder to --ruby.

(BTW, love the name/command)
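
The scoped search described above boils down to a table from type name to extensions plus an include/exclude check. A rough sketch in C, assuming nothing about how ack or ag implement it: the struct, the function names, and the ".sql" entry are all hypothetical, and the extension lists only repeat the ones mentioned in this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type table. The extension lists are illustrative and are
 * not copied from ack's real definitions; ".sql" is an assumption. */
struct file_type {
    const char *name;
    const char *exts[8];    /* NULL-terminated (unused slots are NULL) */
};

static const struct file_type TYPES[] = {
    { "ruby",        { ".rb", ".erb", ".rhtml", NULL } },
    { "js",          { ".js", NULL } },
    { "sql",         { ".sql", NULL } },
    { "stylesheets", { ".css", ".sass", ".scss", ".less", NULL } },
};

/* Return 1 if path ends in one of the extensions of type_name, else 0. */
int path_has_type(const char *path, const char *type_name)
{
    const char *dot;
    size_t t, e;

    assert(path != NULL && type_name != NULL);
    dot = strrchr(path, '.');
    if (dot == NULL)
        return 0;
    for (t = 0; t < sizeof TYPES / sizeof TYPES[0]; t++) {
        if (strcmp(TYPES[t].name, type_name) != 0)
            continue;
        for (e = 0; TYPES[t].exts[e] != NULL; e++)
            if (strcmp(dot, TYPES[t].exts[e]) == 0)
                return 1;
    }
    return 0;
}

/* Decide whether to search path. include mimics flags like --ruby --js,
 * exclude mimics flags like --no-sql. An empty include list means
 * "every type that is not excluded". */
int should_search(const char *path,
                  const char *const *include, size_t n_include,
                  const char *const *exclude, size_t n_exclude)
{
    size_t i;

    for (i = 0; i < n_exclude; i++)
        if (path_has_type(path, exclude[i]))
            return 0;
    if (n_include == 0)
        return 1;
    for (i = 0; i < n_include; i++)
        if (path_has_type(path, include[i]))
            return 1;
    return 0;
}
```

Adding a user-defined type such as --stylesheets is then just one more row in the table, which is why this works without any notion of a 'project'.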


One caveat: Ack's core strength will always depend on, and evolve with, Perl's regular expression and text processing powers.

So rewriting it in C will fundamentally mean endlessly growing a language that looks similar to the Perl implementation, or a Perl DSL.

Not that it's a bad thing; I find it interesting. I would say you had better start with a specification.


Ag supports the same regexes as Ack. I use the PCRE library. I only call pcre_study once, and I use the new PCRE-JIT[1] on systems where it's available. These tweaks add up to a 3-5x speedup over Ack when regex-matching.

1. http://sljit.sourceforge.net/pcre.html
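
The "only once" part is the general idea of compiling the pattern a single time and reusing it for every file and line. The sketch below shows that pattern with POSIX regcomp()/regexec() as a stand-in, purely so it builds without extra libraries; it is not PCRE, it does not use pcre_study or the JIT, and the matcher struct and function names are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper around a pattern that is compiled exactly once.
 * POSIX regex is used here as a stand-in for PCRE. */
struct matcher {
    regex_t re;
};

/* Compile the pattern once up front. Returns 0 on success, nonzero on
 * failure (the regcomp() error code). */
int matcher_init(struct matcher *m, const char *pattern)
{
    assert(m != NULL && pattern != NULL);
    /* REG_NOSUB: we only need yes/no answers, not capture offsets. */
    return regcomp(&m->re, pattern, REG_EXTENDED | REG_NOSUB);
}

/* Count how many of the n lines match, reusing the compiled pattern
 * instead of recompiling it per line or per file. */
size_t matcher_count(const struct matcher *m,
                     const char *const *lines, size_t n)
{
    size_t i, hits = 0;

    assert(m != NULL);
    for (i = 0; i < n; i++)
        if (regexec(&m->re, lines[i], 0, NULL, 0) == 0)
            hits++;
    return hits;
}

/* Release the compiled pattern. */
void matcher_free(struct matcher *m)
{
    regfree(&m->re);
}
```

With PCRE the shape would be the same: do the expensive compile and study step once at startup, then run only the match call in the hot loop.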


If you use PCRE, you do NOT support the same regexes as Ack.

"Perl Compatible" isn't really Perl compatible, see http://en.wikipedia.org/wiki/PCRE for details.


Yes, there are a few edge cases, but hardly anyone uses those features. In fact, 90% of the time, most people seem to use literal string matching.


This looks fantastic. Could you by any chance update your PPA for quantal in the future?

EDIT: The last precise build works just fine, though.


Thanks so much. I was staying with grep precisely because of Ack's performance and its Perl dependency. Does the silver searcher compile on win32 as well?


Per the README on the github page, instructions for building ag for Windows are here:

https://github.com/ggreer/the_silver_searcher/wiki/Windows

The author forewarns that "[i]t's complicated".


Since I added pthreads, there's no chance that it builds on Windows anymore. I don't have a Windows machine or VM to test stuff out on. Patches welcome, though!


Did you benchmark read() vs mmap()? Most tools seem to go with read() for grep-like I/O patterns.

In fact, it looks like GNU grep has a --mmap switch, and on my Ubuntu system it's a little bit faster than the default in the simple case. But -i makes mmap slower. Maybe GNU grep just avoids mmap because of error handling (you get a segfault/bus error instead of an I/O error return when things go wrong).
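
To make the comparison concrete, here is a small sketch of the two I/O styles doing the same job (counting occurrences of one byte in a file). The function names and the 64KB buffer size are arbitrary choices for illustration, not anything from grep or ag, and the read() loop does not retry on EINTR.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count occurrences of byte c between p and end. */
static long count_bytes(const char *p, const char *end, char c)
{
    long total = 0;
    while ((p = memchr(p, c, (size_t)(end - p))) != NULL) {
        total++;
        p++;
    }
    return total;
}

/* read() version: errors come back as a -1 return value. */
long count_with_read(const char *path, char c)
{
    char buf[65536];
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += count_bytes(buf, buf + n, c);
    close(fd);
    return n < 0 ? -1 : total;
}

/* mmap() version: no copy into a user buffer, but an I/O failure while
 * touching the mapping shows up as a signal, not as a return value. */
long count_with_mmap(const char *path, char c)
{
    struct stat st;
    long total;
    char *map;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return -1;
    total = count_bytes(map, map + st.st_size, c);
    munmap(map, (size_t)st.st_size);
    return total;
}
```

Timing both functions on the same large, already-cached file is a fair first benchmark; results will depend on the OS, file size, and whether the cache is warm.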



