Lexilla is C++, and it's been my experience that nearly all of a C++ codebase gets paged into the working set, because constructors are spread through the code and are called when global objects, such as lexers, are created.
I haven't examined the Linux shared library closely, but on Windows I have checked the map and also used the debugger to see how memory is used.
Lexilla doesn't create lexers itself at load time. It creates `LexerModule` objects, which are simple and similar to C structs. They have simple initializers, which means that the `LexerModule`s won't be in a read-only segment. The initialization code for all the lexer modules should be collected into one initialization segment or similar (likely `.init_array` on Linux) so that relatively few pages are loaded at startup. To check this, put breakpoints on the two `LexerModule` constructors and look down the stack for the calling code.
Other module-level data in lexers is mostly simpler and is initialized without calling any C++ constructors.
If an application asks for many lexers to be created at startup (perhaps to access metadata), then there will be more overhead, but that is the application's choice.