Fixes #1069 by implementing the suggestions there.
See commit messages for details. tl;dr: the goto-symbols popup will show more of the paths, but as little as possible. The common prefix is stripped and the longest common sub-path is ellipsized.
Example: Assume the popup would show utils.h twice (/home/kugel/geany.git/src/utils.h and /home/kugel/geany.git/build/dest/include/geany/utils.h.
The popup would show: src/utils.h build/dest/include/geany/utils.h
Additionally, as per @elextr suggestion and for a frequent use-case of mine, the ellipsis is introduced for long common substrings, which I often have due to having the same code base checked out multiple times (my workflow at work requires this).
So, /home/kugel/checkout_a/path/to/project/src/main.c and /home/kugel/checkout_b/path/to/project/src/main.c shows as: checkout_a/.../main.c checkout_b/.../main.c You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany/pull/1445
-- Commit Summary --
* utils: add functions to process file name list * symbols: provide a bit more path information in the goto-symbol popup. * gtkdoc: add support for array annotions * api: export new utils_strv_shorten_file_list() function
-- File Changes --
M doc/Doxyfile.in (2) M scripts/gen-api-gtkdoc.py (7) M src/plugindata.h (2) M src/symbols.c (13) M src/utils.c (193) M src/utils.h (2)
-- Patch Links --
https://github.com/geany/geany/pull/1445.patch https://github.com/geany/geany/pull/1445.diff
elextr approved this pull request.
@kugel- sorry I give up reviewing this, can't figure out the G* macro shit. Will approve so someone else can review it.
- */
+static gchar *utils_strv_find_common_prefix(gchar **strv, size_t num) +{ + gchar *prefix, **ptr; + + if (!NZV(strv)) + return NULL; + + if (num == 0) + num = g_strv_length(strv); + + prefix = g_strdup(strv[0]); + + for (gint i = 0; prefix[i]; i++) + { + foreach_strv(ptr, &strv[1])
why is this here?
@@ -2031,6 +2031,199 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely automatically determined if passed a GStrv.
what you actually mean is a null terminated pointer array, GStrv is just a char**.
if (prefix[i] == '\0')
+ break; + } + return prefix; +} + +/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely automatically determined if passed a GStrv. + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if @a strv is a @c GStrv + * + * @return The common prefix that is part of all strings. + */ +gchar *utils_strv_find_lcs(gchar **strv, size_t num)
whats the difference between this and the previous function
codebrainz commented on this pull request.
- */
+static gchar *utils_strv_find_common_prefix(gchar **strv, size_t num) +{ + gchar *prefix, **ptr; + + if (!NZV(strv)) + return NULL; + + if (num == 0) + num = g_strv_length(strv); + + prefix = g_strdup(strv[0]); + + for (gint i = 0; prefix[i]; i++) + { + foreach_strv(ptr, &strv[1])
Guessing it's accidental since it's indented wrong and missing braces, and `strv` can be less than two elements long and not necessarily nul-terminated.
A bit off-topic but what does this weird macro do compared to the normal C code `for (GStrv ptr = strv+1; *ptr; ptr++)`?
can't figure out the G* macro shit.
which G* macro shit? `foreach_strv` is a Geany macro. If it's hated it should be deprecated and/or removed.
kugel- commented on this pull request.
- */
+static gchar *utils_strv_find_common_prefix(gchar **strv, size_t num) +{ + gchar *prefix, **ptr; + + if (!NZV(strv)) + return NULL; + + if (num == 0) + num = g_strv_length(strv); + + prefix = g_strdup(strv[0]); + + for (gint i = 0; prefix[i]; i++) + { + foreach_strv(ptr, &strv[1])
Err, this line should be simply deleted.My mistake.
kugel- commented on this pull request.
if (prefix[i] == '\0')
+ break; + } + return prefix; +} + +/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely automatically determined if passed a GStrv. + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if @a strv is a @c GStrv + * + * @return The common prefix that is part of all strings. + */ +gchar *utils_strv_find_lcs(gchar **strv, size_t num)
Oops, I coped the docs but then didn't change it. Will fix.
kugel- commented on this pull request.
@@ -2031,6 +2031,199 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely automatically determined if passed a GStrv.
I was under the impression that GStrv also implied NULL termination since it's expected by `g_strv_length` and `g_strv_contains`. I will change the comment.
@kugel- pushed 1 commit.
74eb115 Fix one oops and a couple of mistakes in comments, found by review.
@kugel- to be clear, nothing I said is a general criticism of the purpose your PR or the implementation (apart from the specific comments you have already fixed).
I just found it difficult to understand it, not aided by lower case macros that "capture" following syntax. Also as you pointed out above, not only is it lower case, it is an unprefixed Geany local macro. So I was not confident I understood the implementation, and couldn't justify spending more time on it. Thats why I approved my review, so someone else could review it and not need to override my review.
Yes I believe macros like foreach_strv and foreach_document etc should be removed, but thats not part of this PR, will open another issue for discussions about that.
@kugel- pushed 1 commit.
c096d46 api: pass gtkdoc annotation parameter as-is
@elextr @b4n @codebrainz Can someone please review this?
@elextr @b4n @codebrainz ping
I gave this PR a brief spin. Linux and Windows, GTK+2 and GTK+3, a few realistic and contrived examples.
-----------------------------------------------
**Bug:** on Windows, if one of the pathnames to be rendered in the popup is too long, it is considered empty. This happens before the removal of common prefixes, so it looks like this:
![geany-goto-popup-cut](https://user-images.githubusercontent.com/300211/27780482-d975321e-5fd1-11e7...)
This first line “: 1” actually refers to a very long pathname. If I click it, Geany successfully jumps to the symbol in that pathname, but also prints this to the console:
Geany: utils_tidy_path: assertion 'g_path_is_absolute(filename)' failed
So what is a “very long pathname”? Here is an example:
C:\Users\Vasiliy\Downloads\Первая папка\В ней много интересного\Приколов не счесть\Всякое разное 1234567890\Ещё больше 1234567890\Последний уровень вло\Ор выше гор.c
As you can see, it’s 165 characters long, but in UTF-8 it would be 261 bytes long. Now this is the minimum length where the problem happens. If I remove any **1 Cyrillic** or any **2 Latin** characters from anywhere in this pathname, it is rendered as expected.
I could not reproduce this problem with ASCII-only pathnames, probably because Windows prevented me from creating a pathname longer than 255 characters.
I could not reproduce this problem on Linux.
-----------------------------------------------
When common substrings are ellipsized, Geany now prints messages to the console, like this:
``` p 0 Вторая папка/.../Тест.c p 1 Первая папка/.../Тест.c ```
I think this should be either removed or made more understandable.
@vfaronov Many thanks for testing.
Regarding the bug. I assume PATH_MAX is just 260 on Windows. I've read http://insanecoding.blogspot.de/2007/11/pathmax-simply-isnt.html and it appears PATH_MAX is pretty broken in general.
The tag data contains the full path name. Apparently it contains something weird (non-null) if it exceeds PATH_MAX, while from the code I'd think it contains NULL (but that would crash Geany). To be sure, please check with the following diff.
In any case, this is a separate issue. We have our own `realpath()` on windows which doesn't handle this case. The the underlying data this PR uses is already wrong. You should see the error message about `utils_tidy_path()` even without my patch.
``` diff --git a/src/symbols.c b/src/symbols.c index f4f259b72..874666399 100644 --- a/src/symbols.c +++ b/src/symbols.c @@ -1933,8 +1933,12 @@ static void show_goto_popup(GeanyDocument *doc, GPtrArray *tags, gboolean have_b /* If popup would show multiple files presend a smart file list that allows * to easily distinguish the files while avoiding the file paths in their entirety */ file_names = g_new(gchar *, tags->len); + int n = g_random_int()%100; foreach_ptr_array(tmtag, i, tags) + { file_names[i] = tmtag->file->file_name; + printf("goto_popup %d: %s", n, tmtag->file->file_name ? tmtag->file->file_name : NULL); + } short_names = utils_strv_shorten_file_list(file_names, tags->len); g_free(file_names);
```
To be sure, please check with the following diff.
I get an empty string for the path (`goto_popup 24: `).
You should see the error message about `utils_tidy_path()` even without my patch.
I do. In fact, I can’t even open two different files with broken paths like that, as if Geany considers them to be the same file.
@vfaronov Can you please open a new issue for this? Maybe we can find a solution indepedently.
@kugel- [Done.](https://github.com/geany/geany/issues/1534)
@kugel- pushed 1 commit.
47f4392 Remove extraneous debug output.
@elextr @b4n @codebrainz please review and merge.
@elextr @b4n @codebrainz ping
elextr commented on this pull request.
menu = gtk_menu_new();
+ /* If popup would show multiple files presend a smart file list that allows
`presend` sb `present`?
@kugel- I got the same reaction looking at it again as last time, so I am never gonna "review" it, and I don't use the gotos much so it isn't going to get tested much here either, but given that @vfaronov has tested, it I might be willing to commit it so it can be tested by others. The only question is for windows, can it at least show the filename if the path is too long, not just `: 1`, then at least its not worse than it is now :)
@elextr For windows, the issue has nothing to do with my change. The tags contain the wrong path (or no path), even without my change. The display issue is present in master, too.
@kugel- ok, if the windows problem is an existing issue, then if nobody objects loudly over the next week, then I will be willing to commit it if you fix the conflicts.
@elextr the only conflict is about the API version increment. I thought it was generally practice that this kind of conflict is handled upon merge (since otherwise PRs have to be constantly updated against API version changes)
I'm in the process of reviewing this, and think that maybe the code could be a little simpler. I'll post a review and suggestions in the upcoming days.
@b4n JUST sneaked inside the week, sorry @kugel- no commit.
Am 16. September 2017 11:10:42 MESZ schrieb elextr notifications@github.com:
@b4n JUST sneaked inside the week, sorry @kugel- no commit.
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/geany/geany/pull/1445#issuecomment-329956618
No problem. An in-depth review is always welcome.
@b4n What's your status?
@b4n ping
@b4n ping
@b4n ping
@b4n ping again...you are active on many other PRs, yet you ignore this one even though you said promised to review. Makes me sad.
I'm sorry. I don't have much time lately, and the reason I'm active one other ones and this one is unfortunately that this one is trickier and thus requires more time commitment than more trivial issues. To try and get it moving again even with limited time, I'll drop some alternative code I wrote a while back. I wanted to post it integrated and with proper rationale, but well, you're still waiting. So just dropping it is probably better than nothing.
https://gist.github.com/b4n/0f715c19239f501200cfeaefa5a6979c * IMO (at time of writing I thought it was, at least) easier to understand * Probably more resources-greedy (likely more memory, likely more CPU), but I don't think it matters at this level * Fixes quite a few bugs in shortening. I unfortunately forgot what they all were and would need to find them all again, but IIRC the implementation proposed in the PR can count and/or remove sequences not entirely composed of full directories (meaning it could remove half a directory, or nothing at all when it should, with specially crafted path lists sharing longer substrings outside of directory portions or alike). You probably can try the same kind of things I did using a similar sample program as the one in my proposed implementation.
@b4n to be honest, I prefer my version. I find it easier to grasp because it's split into 3 strv-related utility functions that have well defined and separated functionality (the functions should be useful on their own). I don't so how your code is really any shorter (and neither is worryingly long, each is well below 200 LOCs with 20-30 of them being comments). And it's more efficient. I'm surprised, as you usually seem care about efficiency? Efficiency is normally not a big deal for me, but I care if the more efficient version is already written (or if the change is trivial and/or obvious but that doesn't apply here).
So I would rather fix the bugs you mention and proceed with my version. What do you think?
Well, yeah I don't like inefficient code when it can easily be improved in that aspect. OTOH, I prefer easy to read/maintain code.
And I'll have to check again, maybe I was tired, but I found your code oddly hard to follow, and tricky to fix the issues I found. At the time I found that working on path components directly made everything simpler at many level.
And for the matter of otherwise useful reusable functions, yes, but for example I think I remember one bug is that the longest common substring is *not* necessarily what you want here, like in */a/b/c/d/longilename* vs. */z/b/c/y/longfilename*: *longfilename* is obviously the longest substring, but you actually want */b/c/* -- so IIRC your code will just not shorten this because it'll find *longfilename* as the substring, but will rightly ignore it because it the basename. So to fix this the function should be updated to find the longest substring *surrounded by separators*, becoming a lot less useful for other callers.
However yes, if you can fix the issues I'll try and give it a new shot and propose some changes instead of a totally different approach.
b4n commented on this pull request.
small stuff I noticed first time I went over, if it can be useful. Not a complete review.
+{
+ gchar *prefix, **ptr; + + if (!NZV(strv)) + return NULL; + + if (num == 0) + num = g_strv_length(strv); + + prefix = g_strdup(strv[0]); + + for (gint i = 0; prefix[i]; i++) + { + for (gint j = 1; j < num; j++) + { + gchar *s = strv[j];
`const` would make sense, the string should not be modified
+ if (num == 0) + num = g_strv_length(strv); + + /* sub is the working area where substrings from first are copied to */ + sub = g_malloc(len+1); + lcs = g_strdup(""); + foreach_str(_sub, first) + { + gsize chars_left = len - (_sub - first); + /* No point in continuing if the remainder is too short */ + if (max > chars_left) + break; + for (n_chars = 1; n_chars <= chars_left; n_chars++) + { + /* strlcpy() ftw! */
[`g_strlcpy()`](https://developer.gnome.org/glib/stable/glib-String-Utility-Functions.html#g...) is there for you if you like.
if (prefix[i] == '\0')
+ break; + } + return prefix; +} + +/* * Returns the longest common substring in a list of strings. + * + * The size of the list may be given explicitely, but defaults to @c g_strv_length(strv). + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if it's terminated by @c NULL. + * + * @return The common prefix that is part of all strings. + */ +gchar *utils_strv_find_lcs(gchar **strv, size_t num)
this should be static
+/** Transform file names in a list to be shorter.
+ * + * This function takes a list of file names (porbably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL. + * @return @transfer{full} A newly-allocated array of transformed paths strings, terminated by + @c NULL. Use @c g_strfreev() to free it. + * + * @since 1.31 (API 232)
do we need it in the API right away? I mean, why not, but if there's no real use (and although it seems nice, it is fairly specific still) I'd rather wait a little.
+GEANY_API_SYMBOL
+gchar **utils_strv_shorten_file_list(gchar **file_names, size_t num) +{ + gint i, j; + gchar *prefix, *substring, *name, *sep, **s; + TMTag *tmtag; + gchar **names; + gsize len; + + /* The return value shall have exactly the same size as the input. If the input is a + * GStrv (last element is NULL), the output will follow suit. */ + if (!num) + num = g_strv_length(file_names); + /* Always include a terminating NULL, enables easy freeing with g_strfreev() */ + names = g_new(gchar *, num + 1); + names[num] = 0;
should be `NULL`
kugel- commented on this pull request.
+/** Transform file names in a list to be shorter.
+ * + * This function takes a list of file names (porbably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL. + * @return @transfer{full} A newly-allocated array of transformed paths strings, terminated by + @c NULL. Use @c g_strfreev() to free it. + * + * @since 1.31 (API 232)
I need this function in peasy right away. I thought I mentioned that, didn't I?
@kugel- pushed 1 commit.
1654042 Refactoring and review comments
I hope the code is clearer now and the bug you mentioned should be fixed
Looking at the table below, is this expected, or is it a bug?
path | current result | what I expected ---- | ---- | ---- `/aaa/bbb/cc/ccccc/ddddd` | `cc/.../ddddd` | `.../ddddd` `/aaa/bbb/xxx/yyy/cc/ccccc/ddddd` | `xxx/yyy/cc/.../ddddd` | `xxx/yyy/.../ddddd`
Also, your code doesn't seem to support Windows path separators. I'm not 100% sure we need it in Geany here, but it would probably be good to have support for it if it's supposed to be a generic utility.
Looking at the table below, is this expected, or is it a bug?
it's debatable. From the plain algorithm, your expectation is right and the result is buggy. But one can argue that the actual result may be less confusing to the user (ellipsizing is clearer when the 3 dots are encosed by words. I've never seen a ellipsizing that cuts of the entire front).
Also, your code doesn't seem to support Windows path separators
What do you mean? I used G_DIR_SEPARATOR everywhere. I didn't test on Windows but it should work.
@b4n friendly ping :-)
(ellipsizing is clearer when the 3 dots are encosed by words. I've never seen a ellipsizing that cuts of the entire front).
I agree with @kugel- here, and would have expected what @b4n showed as "current result" anyway.
@kugel- there still are problems shortening some paths, like e.g. */src/a/app-1.2.3/src/lib/module/source.c* and */src/b/app-2.2.3/src/module/source.c*:
path | current result | my expectation ------|--------------------|------------------ `/src/a/app-1.2.3/src/lib/module/source.c` | `a/app-1.2.3/src/lib/module/source.c` | `a/app-1.2.3/src/lib/.../source.c` `/src/b/app-2.2.3/src/module/source.c` | `b/app-2.2.3/src/module/source.c` | `b/app-2.2.3/src/.../source.c`
path current result my expectation |/src/a/app-1.2.3/src/lib/module/source.c| |a/app-1.2.3/src/lib/module/source.c| |a/app-1.2.3/src/lib/.../source.c| |/src/b/app-2.2.3/src/module/source.c| |b/app-2.2.3/src/module/source.c| |b/app-2.2.3/src/.../source.c|
Damn. What happens is that the code determines `.2.3/src/` to be the longest common substring (longer that the expected `/module/`). Then it reduces it to directory components, giving only `src`. This is shorter than the threshold of 5 chars, so in the end nothing is ellipsized. It works if you change /module/ to /moduleXX/.
Fixing this isn't trivial, and without proper unit tests there is a risk to break previously working cases.
I suggest to accept the behavior for now (and merge) and I'll follow up with a fix + unit tests. But I would hate if this doesn't make it into the release just because of select (edge) cases because the status quo is just plain unusable for me. In fact I've been using this patch at work for ages where I have real-world cases and I never came across issues like this.
Thanks for looking into it again!
Fixing this isn't trivial
That's why I suggested in an earlier comment that the fact `utils_strv_find_lcs()` looks generic is cute, but in practice this function does *not* do what `utils_strv_shorten_file_list()` needs.
and without proper unit tests there is a risk to break previously working cases.
Yeah, we should probably add tests for this. Fortunately, it's probably one of the few things that should be easy enough to add a test for, because `utils.c` don't usually have dependencies on internal Geany state, which means it can likely be tested by a minimal test program linking libgeany.
I suggest to accept the behavior for now (and merge) and I'll follow up with a fix + unit tests. But I would hate if this doesn't make it into the release just because of select (edge) cases because the status quo is just plain unusable for me. In fact I've been using this patch at work for ages where I have real-world cases and I never came across issues like this.
Fair enough; I must admit I specifically crafted the test to fail, looking at the weakness of the implementation :)
This said, note that I have another implementation (see a message above) that doesn't have the issue. It has arguably worse performance, and can probably use some improvement in other area, but I didn't find a bug in its results yet.
b4n requested changes on this pull request.
@@ -1927,15 +1927,23 @@ static void show_goto_popup(GeanyDocument *doc, GPtrArray *tags, gboolean have_b
GdkEventButton *button_event = NULL; TMTag *tmtag; guint i; - + gchar **short_names, **file_names, **p;
`p` is not used anywhere.
@@ -2031,6 +2031,209 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely, but defaults to @c g_strv_length(strv). + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if it's terminated by @c NULL.
GLib APIs generally use `-1` to mean this, and although it has the problem of unsigned here (thus would "bug" on a `G_SIZE_MAX`-long input not `NULL`-terminated) it has the nice property of allowing 0 as a normal value which is often handy because then the `strv` wouldn't be used at all. It's currently private so I don't mind much though.
- @return The common prefix that is part of all strings (maybe empty), or NULL if an empty list
+ * was passed in. + */ +static gchar *utils_strv_find_common_prefix(gchar **strv, gsize num) +{ + gchar *prefix; + + if (!NZV(strv)) + return NULL; + + if (num == 0) + num = g_strv_length(strv); + + prefix = g_strdup(strv[0]); + + for (gint i = 0; prefix[i]; i++)
`guint` or `gsize` here.
- for (gint i = 0; prefix[i]; i++)
+ { + for (gsize j = 1; j < num; j++) + { + gchar *s = strv[j]; + if (s[i] != prefix[i]) + { + /* terminate prefix on first mismatch and return */ + prefix[i] = '\0'; + break; + } + } + if (prefix[i] == '\0') + break; + } + return prefix;
What about avoiding the initial copy and simply do: ```c for (gsize i = 0; strv[0][i]; i++) { for (gsize j = 1; j < num; j++) { if (strv[j][i] != strv[0][i]) { /* return prefix on first mismatch */ return g_strndup(strv[0], i); } } } return g_strdup(strv[0]); ``` ?
+/** Transform file names in a list to be shorter.
+ * + * This function takes a list of file names (porbably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL. + * @return @transfer{full} A newly-allocated array of transformed paths strings, terminated by + @c NULL. Use @c g_strfreev() to free it. + * + * @since 1.31 (API 232)
OK then
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
Same comment for the value meaning "compute it". It's especially problematic here as you annotated `file_names` with `array_length=num`; so this would suggest a caller could do: ```c gsize array_len = 0; gchar **array = g_malloc(sizeof *array * array_len); gchar **short = utils_strv_shorten_file_list(array, array_len); ``` This doesn't seem far-fetched for auto-generated code like Vala, and AFAIK there's nothing telling the GIR consumer it's incorrect. The reason it's a big problem is that [`g_strv_length()` does not allow a `NULL` parameter](https://developer.gnome.org/glib/stable/glib-String-Utility-Functions.html#g...). But I could imagine other reasons why it wouldn't be convenient, like for processing only a part of an already-existing strv: the caller would have to carefully special-case processing only 0 elements not to get unexpected behavior. Same, it can be handy that 0 leads to no access of the array pointer at all, because it's a fairly innocent case of not initializing any members, as the caller can expect there will be no access -- and then again, have to special-case 0.
IMO, it'd be a lot easier to use `-1` as the special value like many other functions around. I'd be fine for it to sill be a `gsize` and then asking for a `(gsize) -1` value for the default.
+ /* The return value shall have exactly the same size as the input. If the input is a + * GStrv (last element is NULL), the output will follow suit. */ + if (!num) + num = g_strv_length(file_names); + /* Always include a terminating NULL, enables easy freeing with g_strfreev() */ + names = g_new0(gchar *, num + 1); + + prefix = utils_strv_find_common_prefix(file_names, num); + /* First: determine the common prefix, that will be stripped. + * Don't strip single-letter prefixes, such as '/' */ + prefix_len = 0; + if (NZV(prefix) && prefix[1]) + { + /* Only strip directory components, include trailing '/' */ + start = strrchr(prefix, G_DIR_SEPARATOR);
As mentioned (not clearly, I agree) in a comment, IIUC this doesn't provide full Windows support, because IIUC Windows allows both `/` and `\` as path separators. I guess we'll most likely get `\`-separated paths, but I'm not sure. Also being an API function suggests it's not specific for a particular subset of paths.
The same comment applies thorough the function.
menu = gtk_menu_new();
+ /* If popup would show multiple files presend a smart file list that allows
@elextr seems to indeed have spotted a small typo :)
kugel- commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
g_malloc() doesn't accept 0 (returns NULL) so I don't think such 0-sized array (0 elemets and not NULL-terminated) can ever exist.
kugel- commented on this pull request.
@@ -2031,6 +2031,209 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely, but defaults to @c g_strv_length(strv). + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if it's terminated by @c NULL.
I can change if you prefer. But I couldn't find an example in the glib documentation.
kugel- commented on this pull request.
+ /* The return value shall have exactly the same size as the input. If the input is a + * GStrv (last element is NULL), the output will follow suit. */ + if (!num) + num = g_strv_length(file_names); + /* Always include a terminating NULL, enables easy freeing with g_strfreev() */ + names = g_new0(gchar *, num + 1); + + prefix = utils_strv_find_common_prefix(file_names, num); + /* First: determine the common prefix, that will be stripped. + * Don't strip single-letter prefixes, such as '/' */ + prefix_len = 0; + if (NZV(prefix) && prefix[1]) + { + /* Only strip directory components, include trailing '/' */ + start = strrchr(prefix, G_DIR_SEPARATOR);
Meh, what's the point G_DIR_SEPARATOR if you need to handle both. Also, what about legitimate paths that contain the other slash (e.g. a file can contain the \ character on linux, not sure if the same is true for / on Windows).
Are you sure we don't have problems with this elsewhere inside Geany? I.e. is this something we support or not?
b4n commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
The thing is that even in that case as said it's a problem for this function, as it'll pass on a NULL strv to `g_strv_len()`.
But although `malloc()` returning NULL on 0 length is perfectly normal, it's not my point; the thing is that 0 is a perfectly valid, and not that unlikely value for a meaningful length (see my other examples).
b4n commented on this pull request.
@@ -2031,6 +2031,209 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely, but defaults to @c g_strv_length(strv). + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be 0 if it's terminated by @c NULL.
I'm not sure there is one for strvs, but there are quite some for strings, the first that comes to mind right now is [`g_utf8_strlen()`](https://developer.gnome.org/glib/stable/glib-Unicode-Manipulation.html#g-utf...)
b4n commented on this pull request.
+ /* The return value shall have exactly the same size as the input. If the input is a + * GStrv (last element is NULL), the output will follow suit. */ + if (!num) + num = g_strv_length(file_names); + /* Always include a terminating NULL, enables easy freeing with g_strfreev() */ + names = g_new0(gchar *, num + 1); + + prefix = utils_strv_find_common_prefix(file_names, num); + /* First: determine the common prefix, that will be stripped. + * Don't strip single-letter prefixes, such as '/' */ + prefix_len = 0; + if (NZV(prefix) && prefix[1]) + { + /* Only strip directory components, include trailing '/' */ + start = strrchr(prefix, G_DIR_SEPARATOR);
e.g. a file can contain the \ character on linux
Yes, handling of `` has to be special-cased for Windows
not sure if the same is true for / on Windows
I don't think so, IIUC both `` and `/` are separators. One simple non-optimized way to match it can be `"/" G_DIR_SEPARATOR_S` (e.g. preprocessor concatenation of either `"//"` or `"/"`) together with things like `strpbrk()`.
Are you sure we don't have problems with this elsewhere inside Geany? I.e. is this something we support or not?
I'm not entirely sure, but I think so. However, we're probably normalizing the path at some point, which means we might not encounter the issue with paths we handle anyway. So this might not be so important in the end.
kugel- commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
I can change it, no problem. FWIW: I'd argue that passing NULL should be an error (might add an g_return_val_if_fail() since it's an API function).
b4n commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
IMO it should not be an error if you allow a non-`NULL`-terminated array, because as you said yourself, allocating a 0-length array will give you a `NULL` pointer -- which is fine as you shouldn't be accessing anything past its 0th element.
codebrainz commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
Probably a stupid question (I didn't look at the calling code), but couldn't it use a ready-made collection type like GArray or GPtrArray instead of passing separate arguments? If the parameter isn't modified and the caller does have separate pointer and length, they could just stuff them in a GArray on the stack and pass by const pointer. Or the function could mutate the array in place for better performance. Would also allow to deduplicate the input array without handing back an array that has to be counted by walking through it. Seems cleaner and more efficient than using a plain C array and separate length, but maybe I'm brainwashed by C++ by now, so feel free to ignore :)
kugel- commented on this pull request.
- return lcs;
+} + + +/** Transform file names in a list to be shorter. + * + * This function takes a list of file names (probably with absolute paths), and + * transforms the paths such that they are short but still unique. This is intended + * for dialogs which present the file list to the user, where the base name may result + * in duplicates (showing the full path might be inappropriate). + * + * The algorthm strips the common prefix (e-g. the user's home directory) and + * replaces the longest common substring with an ellipsis ("..."). + * + * @param file_names @array{length=num} The list of strings to process. + * @param num The number of strings contained in @a file_names. Can be 0 if it's terminated by @c NULL.
strv is very gir friendly, and since the function would decompose a GPtrArray to handle its raw strv it might as well get the strv directly.
kugel- commented on this pull request.
+ /* The return value shall have exactly the same size as the input. If the input is a + * GStrv (last element is NULL), the output will follow suit. */ + if (!num) + num = g_strv_length(file_names); + /* Always include a terminating NULL, enables easy freeing with g_strfreev() */ + names = g_new0(gchar *, num + 1); + + prefix = utils_strv_find_common_prefix(file_names, num); + /* First: determine the common prefix, that will be stripped. + * Don't strip single-letter prefixes, such as '/' */ + prefix_len = 0; + if (NZV(prefix) && prefix[1]) + { + /* Only strip directory components, include trailing '/' */ + start = strrchr(prefix, G_DIR_SEPARATOR);
So this might not be so important in the end.
Ok, so I don't plan changes for now in this regard. Speak up if you chnage your mind.
@kugel- pushed 1 commit.
0789cf2 Changes for review comments
b4n requested changes on this pull request.
Still annoying :)
Otherwise looks good.
You also should resolve the conflict, but apart from that I think we're good do go.
@@ -2031,6 +2031,208 @@ gchar **utils_strv_join(gchar **first, gchar **second)
return strv; }
+/* * Returns the common prefix in a list of strings. + * + * The size of the list may be given explicitely, but defaults to @c g_strv_length(strv). + * + * @param strv The list of strings to process. + * @param num The number of strings contained in @a strv. Can be -1 if it's terminated by @c NULL. + * + * @return The common prefix that is part of all strings (maybe empty), or NULL if an empty list + * was passed in. + */ +static gchar *utils_strv_find_common_prefix(gchar **strv, gssize num) +{ + gchar *prefix;
unused now
``` (geany:25173): Geany-CRITICAL **: utils_strv_shorten_file_list: assertion 'num != 0 && file_names == NULL' failed Erreur de segmentation ``` oops! Just tried to go to a location
b4n commented on this pull request.
- @return @transfer{full} A newly-allocated array of transformed paths strings, terminated by
+ @c NULL. Use @c g_strfreev() to free it. + * + * @since 1.31 (API 232) + */ +GEANY_API_SYMBOL +gchar **utils_strv_shorten_file_list(gchar **file_names, gssize num) +{ + gsize i; + gchar *prefix, *substring, *lcs; + gchar *start, *end; + gchar **names; + gsize prefix_len, lcs_len; + + /* We don't dereference file_names if num == 0. */ + g_return_val_if_fail(num != 0 && file_names == NULL, NULL);
This should be `num == 0 || file_names != NULL`, as it's basically an assertion, and should list conditions expected to pass.
@kugel- just in case you didn't see my review here like on the other PR.
@kugel- pushed 1 commit.
c599a1e fix inverted logic in assertion
I rebased and imply removed the API bump that was causing the conflict. I hope that's what you wanted?
@kugel- See https://github.com/kugel-/geany/pull/6
@b4n are we good to go?
Merged #1445 into master.
Yep, thanks :)
Awesome, thanks!
github-comments@lists.geany.org