Re: [Geany-devel] Indentation using regex (was [PATCH 14/19] Rewrite tab switching queue)

6 Dec 2011

On Tue, Dec 6, 2011 at 07:20, Lex Trotman <elextr@gmail.com> wrote:
...
[...]
...
First, note that I wasn't able to find the patch, so I'm only guessing
from reading the thread and from my own (much less complete) attempt.
I'm afraid that if I had the patch it is on my broken hard drive :-S
And anyway we never got it to work satisfactorily.
...
So.  This looks pretty good for line-based indents (but not brace match
:(), but I ran into a really annoying problem with SH:
SH should indent after "then", "do", etc. and unindent "fi", "esac",
etc.  The problem is that you expect the "fi" line to be unindented
(e.g. use unindent_this_line), but if you type "file" for example, it'd
wrongly unindent that line too!
I thought about unindenting the previous line when entering the \n, but
this isn't a real solution either since re-adding a newline after a well
indented line would unindent it again.  Crap.
So I haven't yet found a sensible solution for this problem -- which
wouldn't apply for '}' since it's very unlikely it's part of a bigger
word -- and would like to know if anybody got super clever ideas, or how
other editors you know handle this.
This said, I really like the idea of configurable indentation rules that
could handle languages like SH, Pascal, Ruby, Ada, etc. without the need
to hard-code it.
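To make the "fi" versus "file" problem above concrete, here is a minimal Python sketch. The two patterns and the helper matches_while_typing are hypothetical illustrations, not anything from Geany or from the patch. It assumes the editor re-checks the "unindent this line" rule after every keystroke, and lists the typed prefixes at which the rule would fire.

```python
import re

# Hypothetical "unindent this line" rule: line starts with a closing keyword.
NAIVE = re.compile(r"^\s*(fi|esac|done)")
# Same rule, but the keyword must be a complete word.
WHOLE_WORD = re.compile(r"^\s*(fi|esac|done)\b")


def matches_while_typing(pattern, text):
    """Return every prefix of `text` at which `pattern` matches,
    simulating a check after each typed character."""
    return [text[:i] for i in range(1, len(text) + 1)
            if pattern.search(text[:i])]


print(matches_while_typing(NAIVE, "file"))       # ['fi', 'fil', 'file']
print(matches_while_typing(WHOLE_WORD, "file"))  # ['fi']
```

A word boundary stops the rule from firing on the finished word "file", but it still fires at the moment only "fi" has been typed, so under this assumption the line would already have been unindented by then. That is the part no regex alone seems to solve.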
WARNING, complex topic, big post :)
Quick summary of ones I know:
Emacs has language specific elisp, for C:
"It analyzes the line to determine its syntactic symbol(s) (the kind
of language construct it's looking at) and its anchor position (the
position earlier in the file that CC Mode will indent the line
relative to). The anchor position might be the location of an opening
brace in the previous line, for example. See Syntactic Analysis.
It looks up the syntactic symbol(s) in the configuration to get the
corresponding offset(s). The symbol +, which means “indent this line
one more level” is a typical offset. CC Mode then applies these
offset(s) to the anchor position, giving the indentation for the line.
The different sorts of offsets are described in c-offsets-alist. "
And it admits that even then it gets it wrong sometimes :(
Eclipse and Netbeans also use parser results for the indent guidance.
I don't think parsing the source for indent guidance is in the Geany
light and small spirit, so I reject that.
Instead I propose the following "correct most of the time" but simple
option based on a combination of Jiri's and Emacs' methods:

Each line N has an initial indentation which is the indentation of
line N-1 plus the increments/decrements for all matches to "indent
next line" regexes that occur in line N-1.  (Note that each regex has
a signed count of columns to indent/exdent.)
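The rule above could be sketched roughly as follows in Python. The rule table NEXT_LINE_RULES, the width of 4 columns and the function names are assumptions made up for illustration; the sketch also assumes indentation is made of spaces only.

```python
import re

# Hypothetical "indent next line" rules for SH: (regex, signed column count).
NEXT_LINE_RULES = [
    (re.compile(r"\b(then|do|else)\b"), +4),
    (re.compile(r"\{\s*$"), +4),
]


def leading_columns(line):
    """Width of the leading run of spaces (tabs are ignored in this sketch)."""
    return len(line) - len(line.lstrip(" "))


def initial_indent(prev_line):
    """Indentation of line N: the indentation of line N-1 plus the signed
    column count of every rule match found in line N-1."""
    indent = leading_columns(prev_line)
    for pattern, delta in NEXT_LINE_RULES:
        matches = sum(1 for _ in pattern.finditer(prev_line))
        indent += delta * matches
    return max(indent, 0)  # never go left of column 0


print(initial_indent("    if [ -f x ]; then"))  # 8
print(initial_indent("echo hi"))                # 0
```

Only line N-1 is ever inspected here, so nothing earlier in the file is read or rewritten.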
Maybe I don't understand it correctly but does this mean that if you
open an existing file, you'd re-indent it completely based on the
regexes? I don't think this is a good idea because this could lead to
whitespace change in every line when you edit just a single line.
Or does it mean to have these indent numbers just internally and use
them only when auto-indentation is done? I often work with files
edited by many people over many years which have inconsistent indents.
Imagine the correct indent size is 4 but someone used just 2 indents
in the outer "if" block. If I insert new "if" inside this block, the
indent size will be 6 because of the incorrect outer indent. This is
exactly why I used the "delta" indent solution to be locally correct
and have minimal impact on (and be minimally affected by) the rest of
the code.
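The arithmetic of that example, under one reading of it, looks like this in Python. The names delta_indent and absolute_indent are hypothetical, and the "absolute" variant (nesting depth times indent size) is only an assumed contrast, not something proposed in this thread.

```python
INDENT_WIDTH = 4  # the file's nominal indent size


def delta_indent(prev_line_indent):
    """Delta approach: one step relative to whatever the previous line uses."""
    return prev_line_indent + INDENT_WIDTH


def absolute_indent(nesting_depth):
    """Assumed contrast: indent computed from nesting depth alone."""
    return nesting_depth * INDENT_WIDTH


outer_body = 2  # the outer "if" body was indented by 2 instead of 4
print(delta_indent(outer_body))  # 6: one step deeper than its neighbours
print(absolute_indent(2))        # 8: ignores the neighbouring lines
```

The delta result of 6 is not what a consistently indented file would have, but it sits exactly one level inside the surrounding lines, which is the "locally correct" behaviour described above.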
One more thing - with global indents you have to be sure that the
regexes catch all the indentation cases (without false positives)
otherwise one error will affect the indentation everywhere in the rest
of the file. You can do crazy stuff with some languages so I can
imagine such a thing can happen easily (single line with end of
multi-line comment followed by end block followed by another comment).
With delta indentation it's much less critical - the indent may be
incorrect for the next line but this won't affect the rest of the file
in any negative way. Moreover, you usually don't do things like the
comment example when you write the code and when you need
auto-indentation; you usually add them afterwards when no
autoindentation is needed.
Final remark - better not to auto-indent at all than to indent
incorrectly. There's nothing worse than an editor (actually anything)
which tries to be smart in an annoying way.
Cheers,
Jiri
