[Geany-Users] Regular expression, for Unicode characters

James Ginns starvagrant at xxxxx
Tue Aug 2 09:03:16 UTC 2016


Regular Expressions are a tad difficult to master.

Basic question: you're using lazy quantifiers (the *? forms) on purpose,
right? Just checking.
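
For reference, here is a small sketch of what lazy versus greedy changes.
It uses Python's standard re module rather than Geany's GLib engine (an
assumption made only so the example is runnable), but the two engines
treat .* and .*? the same way. The sample string is made up.

```python
import re

# Two paragraphs on one line, so "." can reach from the first into the second.
html = "<p>ONE</p><p>two</p>"

# Greedy: .* runs to the end of the input, then backtracks to the LAST </p>.
greedy = re.search(r"<p>.*</p>", html).group(0)

# Lazy: .*? grows one character at a time and stops at the FIRST </p>.
lazy = re.search(r"<p>.*?</p>", html).group(0)

print(greedy)  # <p>ONE</p><p>two</p>
print(lazy)    # <p>ONE</p>
```

So for matching one paragraph at a time, lazy is what you want here.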

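So, a dissection. The regex engine (don't know what you're using) should 
hit \W*? and look for as few non-word characters as possible (in some 
instances zero). Then it will look for ONE character in the character 
class [\p{Lu}], the Unicode property for upper-case letters. Note the 
backslash: without it, [[p{Lu}]] is just a class of the literal 
characters [, p, {, L, u and }, followed by a literal ]. Then it will 
look for zero or more instances of \p{Lu} or a non-word character, until 
it gets to the closing tag. Keep in mind that digits are word characters, 
so \W will not match the "123" in your third heading. Since you're only 
looking for a single capital letter, why not try: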

<p(>.*?\p{Lu}.*?</p>)

Or better yet, since you're only replacing the p tag with p class="bold", 
why not just capture the initial p tag:

(<p>).*?\p{Lu}.*?</p>
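
A rough sketch of the capture-and-replace idea, in Python rather than in
Geany's Find/Replace dialog. Two assumptions to flag: Python's standard re
module has no \p{Lu}, so the upper-case test is swapped for a callback
using str.isupper(), which is Unicode-aware; and the function name
embolden is made up for illustration. The test is also stricter than
"contains one capital letter": a paragraph qualifies only if it has at
least one letter and none of its letters is lower case.

```python
import re

# One <p>...</p> block; DOTALL lets the body span several lines.
PARAGRAPH = re.compile(r"(<p>)(.*?)(</p>)", re.DOTALL)


def embolden(html: str) -> str:
    """Add class="bold" to paragraphs whose letters are all upper case."""

    def swap(match: re.Match) -> str:
        text = match.group(2)
        # str.isupper() is True only if there is at least one cased
        # character and no lower-case one; digits, spaces and punctuation
        # are ignored, and it works for Cyrillic as well as Latin.
        if text.isupper():
            return '<p class="bold">' + text + match.group(3)
        # Anything else is returned unchanged.
        return match.group(0)

    return PARAGRAPH.sub(swap, html)


sample = "<p>   </p>\n<p>  ЗАГОЛОВОК - PART II, 123 </p>\n<p>Usu ea euismod.</p>"
print(embolden(sample))
```

Only the middle paragraph of the sample gets the class. Note that the
replacement has to put the paragraph body and the closing tag back, since
they are part of the match.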

Hope that gives you some starting ideas.

On 07/31/2016 08:19 AM, Vesta wrote:
> Can anyone show what the regular expression should look like for this particular case?
>
> This does not work either:
>
> <p(>\W*?[[p{Lu}]][[p{Lu}]\W]*?</p>)
>
> Regards,
> Vesta
>
>> Sent: Sunday, July 31, 2016 at 3:32 PM
>> From: "Lex Trotman" <elextr at gmail.com>
>> To: "Geany general discussion list" <users at lists.geany.org>
>> Subject: Re: [Geany-Users] Regular expression, for Unicode characters
>>
>> Geany uses the Glib regex library whose syntax is described at
>> https://developer.gnome.org/glib/stable/glib-regex-syntax.html
>>
>> Cheers
>> Lex
>>
>> 2016-07-31 22:03 GMT+10:00 Vesta <laguna-mc at mail.com>:
>>> How do I create a regular expression to match all UPPER CASE text within paragraph tags, and replace those <p> tags with <p class="bold">?
>>>
>>>      <p>                                                   </p>
>>>      <p>                      USU EA EUISMOD HONESTATIS DETERRUISSET.</p>
>>>      <p>Qualisque mnesarchum no nam, usu cu fastidii delicata. Eu mei nonumy libris, quas movet vivendo vim at. Prima epicuri conceptam pro ad, in suas nonumes similique duo. Qui mundi essent complectitur eu. Ei laudem veritus democritum vis, te ferri appareat eos. Ceteros pertinacia ea eum, quo integre theophrastus ex, eum et sint omnes detracto. </p>
>>>      <p>Usu ea euismod honestatis deterruisset. Ne quo malis meliore, duo viris liberavisse no, mea an vide mutat quodsi. Vis an vidit debitis, et noster aliquam pri, case iudicabit te sea. </p>
>>>      <p>                                                                             </p>
>>>      <p>                       CU CONGUE IRIURE SCAEVOLA   --
>>>         UT DOMING IRACUNDIA. </p>
>>>      <p>                                  DICO TEMPOR HABEMUS - PART II, 123 </p>
>>>      <p>Homero everti ei nam. An liber euripidis vis, pericula persecuti deseruisse ad mea. Dicant offendit sea et, per esse timeam deserunt ut. In pri enim sadipscing, ei movet soleat suavitate vim. Mea et omnesque phaedrum, paulo luptatum concludaturque vim ea. -- LIBER. </p>
>>>
>>> I want to apply the class to
>>>
>>> <p class="bold">                      USU EA EUISMOD HONESTATIS DETERRUISSET.</p>
>>> <p class="bold">                      CU CONGUE IRIURE SCAEVOLA   --
>>>         UT DOMING IRACUNDIA. </p>
>>> <p class="bold">                                DICO TEMPOR HABEMUS -PART II, 123 </p>
>>>
>>> I need a Unicode solution for Cyrillic text. This does not work:
>>>
>>> Find what: <p(>\W*?[[:upper:]][[:upper:]\W]*?</p>)
>>> Replace with: <p class="bold"\1
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.geany.org
>>> https://lists.geany.org/cgi-bin/mailman/listinfo/users


