On Tue, 22 Jun 2010 15:19:02 +0200 Jiří Techet techet@gmail.com wrote:
I think it may be best if Geany does the filtering (and hence also the recursing). Also we may want to always filter out hidden files and broken links.
There is one problem here: the command line may be too long. POSIX only guarantees that ARG_MAX is at least 4096 bytes, which is definitely too little for thousands of files. There are three ways to solve this:
...
- Implement some alternative to xargs and call grep repeatedly, each time with only
as many files as can be passed on the command line.
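
A minimal sketch of that xargs-like batching, written in Python for brevity rather than as Geany code. The function name `batch_paths` and the `max_bytes`/`fixed_bytes` parameters are hypothetical, and the size accounting is simplified: a real ARG_MAX budget also has to cover the environment and per-argument overhead.

```python
def batch_paths(paths, max_bytes, fixed_bytes=0):
    """Yield lists of paths whose combined size stays within max_bytes.

    Each path is counted as its encoded length plus one NUL terminator.
    fixed_bytes reserves room for the command name and its options.
    A path that is too long on its own is still yielded alone, not dropped.
    """
    batch, used = [], fixed_bytes
    for path in paths:
        cost = len(path.encode()) + 1
        # Start a new batch when adding this path would exceed the budget.
        if batch and used + cost > max_bytes:
            yield batch
            batch, used = [], fixed_bytes
        batch.append(path)
        used += cost
    if batch:
        yield batch
```

Each yielded batch would then be passed to one grep invocation, and the results of all invocations concatenated.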
Only (3) seems to be a reasonable solution, but it means some extra work. Right now the easiest approach I see is to use --include
- if no pattern or a * pattern is specified, --include will be omitted
(as Lex suggested), so no error is reported if grep doesn't support it.
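
The rule above could be sketched as follows. This is an illustration in Python, not Geany's actual code: `build_grep_argv` is a hypothetical helper, the `-r -n` options are just an example option set, and treating the pattern field as a whitespace-separated list is an assumption.

```python
def build_grep_argv(search_text, file_patterns, directory):
    """Return an argv list for a recursive grep.

    --include is added only when a real file pattern was given, so an
    empty pattern or "*" yields a command that any grep with -r accepts.
    """
    argv = ["grep", "-r", "-n"]
    patterns = file_patterns.split()
    if patterns and "*" not in patterns:
        # --include is a GNU grep extension; it may be repeated,
        # once per file name pattern.
        for pattern in patterns:
            argv.append("--include=" + pattern)
    # -e protects a search text starting with "-"; "--" ends the options.
    argv += ["-e", search_text, "--", directory]
    return argv
```

Passing the result as an argument vector (rather than a shell string) means the `*` in `--include=*.c` reaches grep unexpanded.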
AFAICT, you still have to pass filenames even when using --include (when non-recursive).
Personally we've not had complaints about the argument length limit, so I'm not too bothered about it yet. Also modern systems may extend the POSIX limit, but obviously we can't rely on that. Maybe wait and see if it's a problem?
The problem already exists for big projects. You don't see it when you search a single directory, since the number of files in one directory usually isn't that large, but if you recursively search the whole project directory structure you may get a lot of source files. At work I use SLES9, which has ARG_MAX=131072, and the project has about 8000 source files. If you assume an average path length of about 40 characters (it can be more in projects with a deep directory structure), you get 8000*40=320000 bytes, which is well over the limit.
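
The estimate above can be reproduced with a few lines of Python. The helper name `estimated_argv_bytes` is made up for illustration, and the figure is a lower bound: it ignores the NUL terminator of each argument and the space taken by the environment.

```python
def estimated_argv_bytes(file_count, avg_path_len):
    """Rough lower bound on the bytes needed to pass all paths as arguments."""
    return file_count * avg_path_len


SLES9_ARG_MAX = 131072  # limit quoted in the message above

needed = estimated_argv_bytes(8000, 40)
print(needed, needed > SLES9_ARG_MAX)  # prints "320000 True"
```

On a POSIX system, `os.sysconf("SC_ARG_MAX")` reports the local limit for comparison.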
OK, so we should keep the current behaviour of not passing filenames for a recursive grep.
Personally I find using the GNU extensions much safer. After looking at grep's git history, I found that --include was introduced in 2001 (GTK 2 didn't exist at that time and Linux kernel 2.4 had only just been released). On the other hand, the ARG_MAX limit was increased quite recently, in kernel 2.6.23 in 2007 (see http://www.in-ulm.de/~mascheck/various/argmax/ for the limits on other platforms), and older kernels are still in use. I'm also not aware of any non-GNU implementations of grep that might not implement --include.
To me it looks much easier to use what's already present in grep right now; if this isn't satisfactory, it could be reimplemented the way you propose in the future. In addition, if no patterns are specified, Geany will behave the same way it has until now, so there is a sane fallback for people with a very old grep.
Let's use --include then if the user ticks a checkbox. That way a non-GNU system will still be able to do a non-recursive all-files search.
Regards, Nick