Indentation using regex (was [PATCH 14/19] Rewrite tab switching queue)

> Hi,
> I know this is a more than one year old discussion, but I was messing
> with a very similar thing a few minutes ago when wanting to add
> autoindent support for SH.
> On 16/09/2010 22:17, Jiří Techet wrote:
>> On Thu, Sep 16, 2010 at 19:27, Thomas Martitz
>> <thomas.martitz at student.htw-berlin.de> wrote:
>>>  On 16.09.2010 02:23, Lex Trotman wrote:
>>>> Hi Jiri,
>>>> I couldn't get this to work at all, it printed "calling indent this
>>>> line" all the time but didn't indent :-(
>>>> I only had half an hour so I couldn't investigate much.
>>> I have the same experience. Auto-indentation doesn't seem to work anymore
>>> (e.g. when hitting enter on a line that ends with {, or when typing
>>> }).
>> I have just re-tested it again and it works on my machine (I have
>> forgotten one trace in the code - that's what you see in the console).
>> A quick question: have you read the commit log?
>>     This patch makes it possible to specify several regex patterns for every
>>     filetype which determine under what condition the indentation is performed.
>>     The pattern variables are specified under the [settings] section of the
>>     given filetype and their value is the regex to be used. The variables are
>>     as follows:
>>     * indent_this_line_regex - the match is performed after every keystroke
>>       and if the regex matches, the indentation is performed on the current
>>       line
>>     * indent_next_line_regex - the match is performed only when enter is
>>       pressed. The indentation is applied on the next line
>>     * unindent_this_line_regex - like indent_this_line_regex but
>>       unindents instead
>>     * unindent_next_line_regex - like indent_next_line_regex but
>>       unindents instead
>>     Comments and strings are detected from the lexer so these can be ignored
>>     inside the patterns. For instance these are very basic rules for GNU
>>     indent style:
>>     indent_next_line_regex=^.*\\{[[:blank:]]*$
>>     unindent_this_line_regex=^[[:blank:]]*\\}$
>>     indent_this_line_regex=^[[:blank:]]+\\{$
>>     unindent_next_line_regex=^[[:blank:]]*\\}[[:blank:]]*$
>>     By commenting-out the last two lines you get ANSI indentation style.
>>     If you replace \\{ and \\} with begin and end, respectively, you
>>     get analogous rules for Pascal. Notice the double-escaping of { and } -
>>     the first escape sequence is for the keyfile ini format (so for the
>>     regex itself \\ becomes \).
>> This means that in order to make it work e.g. for C, you have to edit
>> ~/.config/geany/filedefs/filetypes.c
>> (or the corresponding file under /usr/local/share/geany) and add
>> indent_next_line_regex=^.*\\{[[:blank:]]*$
>> unindent_this_line_regex=^[[:blank:]]*\\}$
>> indent_this_line_regex=^[[:blank:]]+\\{$
>> unindent_next_line_regex=^[[:blank:]]*\\}[[:blank:]]*$
>> under the [settings] section (+ restart geany). Please let me know if
>> it works (but also in the opposite case ;-).
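To make the double escaping and the four rules concrete, here is a minimal
Python sketch (not Geany code). It assumes two simplifications: the keyfile
unescaping is reduced to turning \\ into \, and since Python's re module has
no POSIX classes, [[:blank:]] is approximated as [ \t]. The helper names
(keyfile_unescape, to_python_regex, matching_rules) are made up for
illustration.

```python
import re

# Values exactly as they would be written in the filetype file
# (still keyfile-escaped, hence the doubled backslashes).
KEYFILE_VALUES = {
    "indent_next_line_regex": r"^.*\\{[[:blank:]]*$",
    "unindent_this_line_regex": r"^[[:blank:]]*\\}$",
    "indent_this_line_regex": r"^[[:blank:]]+\\{$",
    "unindent_next_line_regex": r"^[[:blank:]]*\\}[[:blank:]]*$",
}


def keyfile_unescape(value):
    """Simplified keyfile unescaping: only the \\\\ -> \\ case is handled."""
    return value.replace("\\\\", "\\")


def to_python_regex(pattern):
    """Approximate the POSIX class [[:blank:]] with [ \\t] for Python's re."""
    return pattern.replace("[[:blank:]]", "[ \\t]")


RULES = {
    name: re.compile(to_python_regex(keyfile_unescape(value)))
    for name, value in KEYFILE_VALUES.items()
}


def matching_rules(line):
    """Return the sorted names of all rules whose regex matches the line."""
    return sorted(name for name, rx in RULES.items() if rx.search(line))


for sample in ["if (foo) {", "    {", "}", "foo();"]:
    print(repr(sample), matching_rules(sample))
```

After unescaping, the first rule becomes ^.*\{[[:blank:]]*$, which is what
the regex engine actually sees.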
> First, note that I wasn't able to find the patch, so I'm only guessing
> from reading the thread and from my own (much less complete) attempt.

It doesn't apply to current trunk anyway - I've discarded it because
it would need a major update but, more importantly, there was one
limitation I wanted to address.

The problem was that a single line could be indented several times
by a single regular expression. Imagine code like

if (foo)
....if (bar)

Now imagine that you put the cursor just behind the } following baz();
and press backspace to delete it. If you then press } at this position
again, the line gets unindented a second time, so you get

if (foo)
....if (bar)

From the editors I tested, it seems they remember for which line
indentation was performed and don't perform it again. This means
having a per-line flag which indicates this state (e.g.
NOT_INDENTABLE, INDENTABLE). When a file is loaded, all non-empty
lines would be marked as NOT_INDENTABLE and every empty line as
INDENTABLE (newly created empty lines too). A line would stay in the
INDENTABLE state even after you start typing some text. The INDENTABLE
state would change to NOT_INDENTABLE under each of the following
conditions:

1. The line matches a regex - in this case it is indented and shouldn't
be indented any more
2. The line is not empty, matches a regex and enter is pressed (you
have finished editing the line and don't want it to be indented when
you modify it later)

Now this approach requires some per-line data to be stored - is it
possible with Scintilla?
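A rough Python sketch of the two-state idea, just to check the logic. It
says nothing about how Scintilla would store the flag: the Buffer class is
a hypothetical stand-in that keeps the flags in a parallel list. Further
assumptions: indentation is 4 spaces, whitespace-only lines count as
empty, only the unindent-on-} rule is modelled, and enter finalizes any
non-empty line (a simplified reading of condition 2).

```python
import re
from enum import Enum

INDENT_WIDTH = 4  # assumption: indentation is 4 spaces, no tabs

# Same idea as unindent_this_line_regex from the commit log.
UNINDENT_THIS_LINE = re.compile(r"^[ \t]*\}$")


class LineState(Enum):
    NOT_INDENTABLE = 0
    INDENTABLE = 1


def unindent(line):
    """Remove up to INDENT_WIDTH leading spaces from the line."""
    body = line.lstrip(" ")
    indent = len(line) - len(body)
    return " " * max(0, indent - INDENT_WIDTH) + body


class Buffer:
    """Hypothetical stand-in for the editor: lines plus one flag per line."""

    def __init__(self, text):
        self.lines = text.split("\n")
        # On load: empty (here: whitespace-only) lines are INDENTABLE.
        self.states = [
            LineState.NOT_INDENTABLE if line.strip() else LineState.INDENTABLE
            for line in self.lines
        ]

    def type_char(self, n, ch):
        self.lines[n] += ch
        if (self.states[n] is LineState.INDENTABLE
                and UNINDENT_THIS_LINE.search(self.lines[n])):
            # Condition 1: the regex fired, so this line is done.
            self.lines[n] = unindent(self.lines[n])
            self.states[n] = LineState.NOT_INDENTABLE

    def backspace(self, n):
        self.lines[n] = self.lines[n][:-1]  # the flag is left unchanged

    def press_enter(self, n):
        if self.lines[n].strip():
            # Condition 2 (simplified): leaving a non-empty line finalizes it.
            self.states[n] = LineState.NOT_INDENTABLE
        # The new line copies the leading spaces and starts INDENTABLE.
        body = self.lines[n].lstrip(" ")
        indent = self.lines[n][:len(self.lines[n]) - len(body)]
        self.lines.insert(n + 1, indent)
        self.states.insert(n + 1, LineState.INDENTABLE)


buf = Buffer("        baz();\n        ")
buf.type_char(1, "}")  # unindented once, line becomes NOT_INDENTABLE
buf.backspace(1)
buf.type_char(1, "}")  # matches again, but is not unindented a second time
```

With the flag, deleting and retyping the } leaves the line where it was
instead of moving it one more level to the left.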

> So.  This looks pretty good for line-based indents (but not brace match
> :(), but I ran into a really annoying problem with SH:
> SH should indent after "then", "do", etc. and unindent "fi", "esac",
> etc.  The problem is that you expect the "fi" line to be unindented
> (e.g. use unindent_this_line), but if you type "file" for example, it'd
> wrongly unindent that line too!
> I thought about unindenting the previous line when entering the \n, but
> this isn't a real solution either since re-adding a newline after a well
> indented line would unindent it again.  Crap.
> So I haven't yet found a sensible solution for this problem -- which
> wouldn't apply for '}' since it's very unlikely it's part of a bigger
> word -- and would like to know if anybody has super clever ideas, or how
> other editors you know handle this.

So I've just tried TextMate and it has failed in the same way.

But I think it could be solved by introducing one more flag -
INDENTED. When you are on an INDENTABLE line and it matches a regex,
it would be indented and its flag would be changed to INDENTED. Now
the difference is that for INDENTED lines the regex would be
re-evaluated every time you type a letter and if it stops matching,
the indent would be undone and the INDENTED state would be changed back
to INDENTABLE. For completeness, I think an UNINDENTED state would also
have to be introduced to perform the right undo action for the unindent.
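Here is how the extended scheme could look as a small Python sketch,
applied to the "fi" vs. "file" case. Everything in it is an assumption for
illustration: the Line class, the SH_UNINDENT pattern, 4-space indentation,
and the choice to remember the original indent so the undo restores it
exactly. It is not based on the discarded patch.

```python
import re
from enum import Enum

INDENT_WIDTH = 4  # assumption: indentation is 4 spaces, no tabs

# Hypothetical unindent_this_line rule for sh.
SH_UNINDENT = re.compile(r"^[ \t]*(fi|done|esac)$")


class LineState(Enum):
    NOT_INDENTABLE = "not indentable"
    INDENTABLE = "indentable"
    INDENTED = "indented"
    UNINDENTED = "unindented"


def split_indent(text):
    """Return (number of leading spaces, rest of the line)."""
    body = text.lstrip(" ")
    return len(text) - len(body), body


class Line:
    def __init__(self, text, indent_rx=None, unindent_rx=None):
        self.text = text
        self.indent_rx = indent_rx
        self.unindent_rx = unindent_rx
        self.saved_indent = 0
        self.state = (LineState.NOT_INDENTABLE if text.strip()
                      else LineState.INDENTABLE)

    def _apply(self, delta, new_state):
        indent, body = split_indent(self.text)
        self.saved_indent = indent  # remembered for a later undo
        self.text = " " * max(0, indent + delta) + body
        self.state = new_state

    def _undo(self):
        _, body = split_indent(self.text)
        self.text = " " * self.saved_indent + body
        self.state = LineState.INDENTABLE

    def type_char(self, ch):
        self.text += ch
        if self.state is LineState.INDENTABLE:
            if self.indent_rx and self.indent_rx.search(self.text):
                self._apply(+INDENT_WIDTH, LineState.INDENTED)
            elif self.unindent_rx and self.unindent_rx.search(self.text):
                self._apply(-INDENT_WIDTH, LineState.UNINDENTED)
        elif self.state is LineState.INDENTED:
            if not self.indent_rx.search(self.text):
                self._undo()  # regex stopped matching: take the indent back
        elif self.state is LineState.UNINDENTED:
            if not self.unindent_rx.search(self.text):
                self._undo()  # regex stopped matching: take the unindent back

    def press_enter(self):
        if self.text.strip():
            self.state = LineState.NOT_INDENTABLE  # editing is finished


line = Line("    ", unindent_rx=SH_UNINDENT)
for ch in "fi":
    line.type_char(ch)   # text is now "fi", state UNINDENTED
for ch in "le":
    line.type_char(ch)   # text is now "    file", state INDENTABLE again
```

Typing "fi" followed by enter keeps the unindent, while continuing to
"file" restores the original indentation.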

> This said, I really like the idea of configurable indentation rules that
> could handle languages like SH, Pascal, Ruby, Ada, etc. without the need
> to hard-code it.

Would be great if someone has time to implement it ;-).


