[Geany-devel] Find in files - Re: Patches required by gproject

Nick Treleaven nick.treleaven at xxxxx
Tue Jun 22 11:46:58 UTC 2010


On Mon, 21 Jun 2010 20:53:58 +0200
Jiří Techet <techet at gmail.com> wrote:

> >> Having a filetype pattern in the find in files dialog could be useful.
> >>  Note that --include is a GNU grep extension, so a blank file pattern
> >> should be the default and should not generate the option to grep so as
> >> to maintain portability.
> >
> > Geany currently passes a list of all files and directories to Grep. I
> 
> Does it? To me it looks like it only calls grep in the given directory.

If the -r (GNU extension) recursive option is set, it just passes '.'
as the path instead of any filenames.

Without -r, I'm not sure how to call Grep without passing a list of
files. Maybe I'm missing something, but that was why I implemented
passing all filenames. (See how search_get_argv() is used).

> > think it may be best if Geany does the filtering (and hence also the
> > recursing). Also we may want to always filter out hidden files and
> > broken links.
> 
> There is one problem here - the command line may be too long. POSIX
> only guarantees that ARG_MAX is at least 4096 bytes, which is
> definitely too little for thousands of files. There are three ways to
> solve this:
> 
> 1. Call grep separately for every single file. This is too slow. I
> tested something like that some time ago and for about 10000 source
> files it takes 30 seconds just to execute grep so many times. To find
> e.g. "torvalds" in all *.c;*.h files of linux kernel using -r and
> --include it takes only 2 seconds (the first search is always slow
> because the files have to be read from disk, but any subsequent search
> is really fast as the files are cached by the OS). The speed is very
> important for me.
> 
> 2. Use xargs - this introduces one more external dependency for geany
> so it probably isn't the preferable solution.
> 
> 3. Implement some alternative to xargs and call grep repeatedly, each
> time with as many files as can be passed on the command line.
> 
> Only (3) seems to be a reasonable solution, but it means some extra
> work. Right now I find it easiest to implement the filtering with
> --include: if no pattern or a * pattern is specified, --include will be
> omitted (as Lex suggested), and no error will be reported if grep
> doesn't support it.
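
For reference, option (3) could look roughly like the sketch below. It
is only an illustration in Python, not Geany code: batch_filenames(),
arg_size() and the limit parameter are hypothetical names, and the size
estimate is an assumption, since it ignores the environment and
per-argument pointer overhead that also count against the real ARG_MAX.

```python
import os
from typing import Iterable, Iterator, List


def arg_size(arg: str) -> int:
    """Bytes one argument occupies: its encoded length plus the NUL terminator."""
    return len(os.fsencode(arg)) + 1


def batch_filenames(fixed_args: List[str], filenames: Iterable[str],
                    limit: int) -> Iterator[List[str]]:
    """Yield complete argv lists, each at most `limit` bytes by our estimate.

    A filename that does not fit even on its own is still emitted alone,
    so no file is silently dropped (that one call may then fail).
    """
    base = sum(arg_size(a) for a in fixed_args)
    batch: List[str] = []
    used = base
    for name in filenames:
        size = arg_size(name)
        # Flush the current batch before it would exceed the budget.
        if batch and used + size > limit:
            yield fixed_args + batch
            batch, used = [], base
        batch.append(name)
        used += size
    if batch:
        yield fixed_args + batch
```

Each yielded list would be run as one grep process. A caller would
probably pick a conservative budget, for example half of
os.sysconf("SC_ARG_MAX"), to leave room for the environment.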

AFAICT, you still have to pass filenames even when using --include
(when non-recursive).
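
To make the two cases concrete, here is a rough sketch in Python. It is
not what search_get_argv() does; build_grep_argv() and its parameters
are hypothetical. It assumes the non-recursive case filters the file
list itself, so --include (a GNU extension) is only generated together
with -r, and is left out entirely for a blank or * pattern.

```python
import fnmatch
import os
from typing import List, Sequence


def build_grep_argv(pattern: str, recursive: bool,
                    include_globs: Sequence[str],
                    filenames: Sequence[str]) -> List[str]:
    """Build a grep command line for recursive or non-recursive search."""
    argv = ["grep", "-n", "-e", pattern]
    # A blank or "*" glob means "all files": generate no filter for it.
    globs = [g for g in include_globs if g and g != "*"]
    if recursive:
        # GNU grep walks the tree itself; '.' replaces the file list.
        argv.append("-r")
        argv += ["--include=" + g for g in globs]
        argv.append(".")
    else:
        # No recursion: filenames must be passed explicitly, so the
        # caller-side filter stands in for --include.
        argv.append("--")
        for name in filenames:
            base = os.path.basename(name)
            if not globs or any(fnmatch.fnmatchcase(base, g) for g in globs):
                argv.append(name)
    return argv
```

The non-recursive branch is the one that can run into the argument
length limit, since every matching filename ends up on the command line.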

We've not had complaints about the argument length limit, so personally
I'm not too bothered about it yet. Also, modern systems may raise the
limit well beyond the POSIX minimum, but obviously we can't rely on
that. Maybe wait and see if it's a problem?

Regards,
Nick


