My Geany crashes on the following multiline regexp (while searching in the find/replace dialogue): `^(\d+);(\d+);(\d+);SCT\n(.*?\n)*?\1;47429007;\3`
Geany version 1.30.1 (built from source on 2017-06-25 with GTK 2.24.30, GLib 2.48.2)
Doesn't crash for me with 1.32 built earlier this month. If it's not a fixed bug, maybe it depends on the contents of the file?
You're right, it depends on the content of the file.
@Chayyoo then you need to post a gist with a small file where it does happen because, as @codebrainz said, we can't reproduce it with any of our files.
The file consists of sorted lines, each with three numbers separated by semicolons and followed by two or three capital letters. The first number is repeated on several consecutive lines, typically 1 to 8 times. Here's an extract:

10000006;116680003;29857009;SCT
10000006;116680003;9972008;SCT
10000006;363698007;51185008;SCT
10000006;47429007;22253000;BT
10001005;116676008;409774005;SCT
10001005;116680003;87628006;SCT
10001005;116680003;91302008;SCT
10001005;246075003;409822003;SCT
10001005;370135005;441862004;SCT
10001005;47429007;23583003;BT
10002003;116680003;116175006;SCT
10002003;260507000;309795001;SCT
10002003;260686004;129304002;SCT
10002003;405813007;414003;SCT
10003008;116680003;106234000;SCT

The original file had about a million such lines. I have now discovered that the crash on my PC occurs from somewhere between 5300 and 5400 lines onwards. Below 5300 lines the regex works as expected.

To narrow things down, the following regex does NOT crash, not even on a million lines:

`^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)\1`

But this one does:

`^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1`
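Since the crash appears to depend on the number of lines rather than on any particular line, a test file of a chosen size can be produced by repeating the extract above. The following is only a minimal sketch: the helper names `build_lines` and `write_test_file` are made up for illustration, and cycling the extract does not preserve the sort order of the original file, which is assumed not to matter here.

```python
import itertools

# The 15-line extract quoted in the report.
SAMPLE = """\
10000006;116680003;29857009;SCT
10000006;116680003;9972008;SCT
10000006;363698007;51185008;SCT
10000006;47429007;22253000;BT
10001005;116676008;409774005;SCT
10001005;116680003;87628006;SCT
10001005;116680003;91302008;SCT
10001005;246075003;409822003;SCT
10001005;370135005;441862004;SCT
10001005;47429007;23583003;BT
10002003;116680003;116175006;SCT
10002003;260507000;309795001;SCT
10002003;260686004;129304002;SCT
10002003;405813007;414003;SCT
10003008;116680003;106234000;SCT
""".splitlines()


def build_lines(n_lines, sample=SAMPLE):
    """Return exactly n_lines lines by cycling through the sample (hypothetical helper)."""
    return list(itertools.islice(itertools.cycle(sample), n_lines))


def write_test_file(path, n_lines):
    """Write n_lines newline-terminated lines to path (hypothetical helper)."""
    with open(path, "w", encoding="ascii", newline="\n") as fh:
        fh.write("\n".join(build_lines(n_lines)) + "\n")
```

Calling `write_test_file` with a path of your choice and, say, 5300 and then 5400 lines would give two files on either side of the threshold reported above.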
@Chayyoo that doesn't really help to track it down. Can you run Geany under a debugger and post a backtrace after the crash (hope you built it with -g :)
I ran Geany from the command line and it crashes on the regex with "segmentation fault (memory dump made)". I have gdb on my system but I've never used it. Can you give me a command to do what you want (something like gdb geany .... I presume)?
> Can you give me a command to do what you want (something like gdb geany .... I presume)?
See here under "Getting a backtrace": http://www.geany.org/Support/Bugs
Note that you may need to keep pressing return to get the whole backtrace.
When I run "gdb geany", Geany doesn't seem to start (no window). Gdb just says:

Reading symbols from geany...done.
(gdb)

and it stays that way no matter how many returns I press.

I also tried to start Geany the regular way and then attach gdb to the process number with gdb -p 14825, but then the Geany window becomes blank.

As I said, I'm totally unfamiliar with gdb.
As the instructions linked above say, type "run -v" at the gdb prompt.
I attach the output of strace (last part) and ltrace (last lines of the file) after the crash. Can that help?

I also verified that the crash is not due to a particular pattern in the input file between lines 5500 (where it doesn't crash) and 5600 (where it crashes). Apparently, only size matters.
> I attach the output of strace (last part) and ltrace (last lines of the file) after the crash. Can that help? I also verified that the crash is not due to a particular pattern in the input file between lines 5500 (where it doesn't crash) and 5600 (where it crashes). Apparently, only size matters.
Actually you didn't attach it or github ignored the attachments, but that probably won't help so don't waste time trying another way, just follow the instructions to run gdb and get the backtrace:
```
gdb geany
```
at the `(gdb)` prompt:
```
run -v
```
do whatever crashes it and after it crashes and returns to the gdb prompt:
```
bt
```
and press return while it says `return to continue`.
Paste the output.
Here's the end of it:

#16362 0x00007ffff12a751c in ?? () from /lib/x86_64-linux-gnu/libpcre.so.3
#16363 0x00007ffff12b15f0 in ?? () from /lib/x86_64-linux-gnu/libpcre.so.3
#16364 0x00007ffff12a751c in ?? () from /lib/x86_64-linux-gnu/libpcre.so.3
#16365 0x00007ffff12b6903 in pcre_exec () from /lib/x86_64-linux-gnu/libpcre.so.3
#16366 0x00007ffff53fc32b in g_match_info_next () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16367 0x00007ffff53fdbcf in g_regex_match_full () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16368 0x00007ffff79b2f97 in find_regex (sci=sci@entry=0xa68400, pos=pos@entry=0, regex=regex@entry=0xa87920, multiline=multiline@entry=16, match=match@entry=0xda9f40) at search.c:1944
#16369 0x00007ffff79b6f84 in search_find_next (sci=0xa68400, str=str@entry=0xdac470 "^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1", flags=flags@entry=(GEANY_FIND_REGEXP | GEANY_FIND_MULTILINE), match_=match_@entry=0x0) at search.c:2044
#16370 0x00007ffff797b17e in document_find_text (doc=doc@entry=0xbb5560, text=text@entry=0xdac470 "^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1", original_text=original_text@entry=0xda6660 "^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1", flags=flags@entry=(GEANY_FIND_REGEXP | GEANY_FIND_MULTILINE), search_backwards=search_backwards@entry=0, match_=match_@entry=0x0, scroll=1, parent=0x73b9b0) at document.c:2345
#16371 0x00007ffff79b33e0 in on_replace_dialog_response (dialog=0x73b9b0, response=1, user_data=<optimized out>) at search.c:1509
#16372 0x00007ffff58c7518 in g_cclosure_marshal_VOID__ENUMv () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16373 0x00007ffff58c51d4 in ?? () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16374 0x00007ffff58df9a6 in g_signal_emit_valist () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16375 0x00007ffff58e008f in g_signal_emit () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16376 0x00007ffff58c51d4 in ?? () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16377 0x00007ffff58df9a6 in g_signal_emit_valist () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16378 0x00007ffff58e008f in g_signal_emit () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16379 0x00007ffff6d85f35 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16380 0x00007ffff58c4fa5 in g_closure_invoke () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16381 0x00007ffff58d6afc in ?? () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16382 0x00007ffff58dfd5c in g_signal_emit_valist () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16383 0x00007ffff58e008f in g_signal_emit () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16384 0x00007ffff6d84e79 in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16385 0x00007ffff6e2baec in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16386 0x00007ffff58c4fa5 in g_closure_invoke () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16387 0x00007ffff58d756e in ?? () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16388 0x00007ffff58df7f9 in g_signal_emit_valist () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16389 0x00007ffff58e008f in g_signal_emit () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16390 0x00007ffff6f4393c in ?? () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16391 0x00007ffff6e2a284 in gtk_propagate_event () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16392 0x00007ffff6e2a63b in gtk_main_do_event () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16393 0x00007ffff6a9ec8c in ?? () from /usr/lib/x86_64-linux-gnu/libgdk-x11-2.0.so.0
#16394 0x00007ffff53ea197 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16395 0x00007ffff53ea3f0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16396 0x00007ffff53ea712 in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16397 0x00007ffff6e29697 in gtk_main () from /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#16398 0x00007ffff799f527 in main_lib (argc=1, argv=0x7fffffffdf68) at libmain.c:1233
#16399 0x00007ffff7364830 in __libc_start_main (main=0x4005a0 <main>, argc=2, argv=0x7fffffffdf68, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdf58) at ../csu/libc-start.c:291
#16400 0x00000000004005d9 in _start ()

Will that do?
OK, that shows that the crash is inside the PCRE library that GLib uses for regular expression handling. But what is the actual crash? Is it a segmentation violation? If it depends on size, are you running out of memory?
It's a segmentation fault ("core dumped"). How can I tell whether that's caused by running out of memory?
I can reproduce the issue with your sample repeated until it reaches 1010445 lines, and with the pattern `^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1`. The crash seems to be due to excessive recursion (leading to a stack overflow) inside libpcre, over which we have no control. However, your pattern is likely to trigger this, as you allow any number of optional lines between two lines that start with the same number; this basically means that the regex can capture the whole file, and has to be matched against the whole file if there is no match. It's sad to ever find a program crashing, but in this case there isn't much we can do, and your regular expression is fairly dangerous per se, in terms of performance and memory usage at the very least. Most distros' libpcre builds use recursion because it's faster (or so they say), but that can lead to unavoidable crashes in extreme cases like this.
BTW, I'm not sure it's what you want, but if you choose to be ungreedy on the number of lines allowed between the start and end, it is a lot less likely to crash (`^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*?\1`), but it still will if there is no match for one of the first lines.
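To make the greedy versus ungreedy difference concrete, here is a small sketch using Python's `re` module on the first five lines of the extract. It only illustrates how far each variant reaches; Python's engine is not libpcre, so it says nothing about the crash itself. The helper name `lines_spanned` is made up for this example.

```python
import re

# First five lines of the extract from the report, newline-terminated.
SAMPLE = (
    "10000006;116680003;29857009;SCT\n"
    "10000006;116680003;9972008;SCT\n"
    "10000006;363698007;51185008;SCT\n"
    "10000006;47429007;22253000;BT\n"
    "10001005;116676008;409774005;SCT\n"
)

# Greedy: take as many intermediate lines as possible before \1.
GREEDY = re.compile(r"^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*\1", re.MULTILINE)
# Ungreedy: take as few intermediate lines as possible before \1.
LAZY = re.compile(r"^(\d+?);(\d+?);(\d+?);[BCST]+\n(.*?\n)*?\1", re.MULTILINE)


def lines_spanned(pattern, text):
    """Return the number of newlines inside the first match, or None (hypothetical helper)."""
    match = pattern.search(text)
    return None if match is None else match.group(0).count("\n")


print("greedy:", lines_spanned(GREEDY, SAMPLE))
print("lazy:  ", lines_spanned(LAZY, SAMPLE))
```

The greedy variant first consumes every remaining line and then backtracks to the last line starting with the same number, while the ungreedy one stops at the very next such line. When no later line matches, both end up trying every possible number of intermediate lines.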
IMO, this is a case of "wontfix", both because we can't do anything about it, and because it's caused by a pathological match, which is exactly what is asked for.
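If the goal of the pattern from the original report is to find an `SCT` line that is followed, somewhere later, by a line with the same first and third numbers and `47429007` as the second number, a single pass over the lines avoids the multiline regex entirely. This is only a sketch under that assumption; the function name `find_pairs` and its `target` parameter are hypothetical, and it compares whole fields, whereas the regex's trailing `\3` would also accept a longer third number that merely starts with the same digits.

```python
def find_pairs(lines, target="47429007"):
    """Yield (start, end) line indexes for an SCT line and a later matching target line.

    Hypothetical helper: `start` is the earliest SCT line whose first and third
    fields equal those of line `end`, whose second field is `target`.
    """
    first_sct = {}  # (first field, third field) -> index of the earliest SCT line
    for index, line in enumerate(lines):
        parts = line.rstrip("\n").split(";")
        if len(parts) != 4:
            continue  # skip lines that do not have the expected four fields
        first, second, third, tag = parts
        # Check for a match before registering the current line, so that a
        # line can never be paired with itself.
        if second == target and (first, third) in first_sct:
            yield first_sct[(first, third)], index
        if tag == "SCT":
            first_sct.setdefault((first, third), index)
```

Each line is examined once, so the cost grows linearly with the file size and no deep recursion is involved. The dictionary grows with the number of distinct number pairs; since the file is sorted on the first number, it could be cleared whenever that number changes.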
Closed #1586.
github-comments@lists.geany.org