How can I create a regular expression to match all-UPPER-CASE text within paragraph tags, and replace those <p> tags with <p class="bold">?
<p> </p> <p> USU EA EUISMOD HONESTATIS DETERRUISSET.</p> <p>Qualisque mnesarchum no nam, usu cu fastidii delicata. Eu mei nonumy libris, quas movet vivendo vim at. Prima epicuri conceptam pro ad, in suas nonumes similique duo. Qui mundi essent complectitur eu. Ei laudem veritus democritum vis, te ferri appareat eos. Ceteros pertinacia ea eum, quo integre theophrastus ex, eum et sint omnes detracto. </p> <p>Usu ea euismod honestatis deterruisset. Ne quo malis meliore, duo viris liberavisse no, mea an vide mutat quodsi. Vis an vidit debitis, et noster aliquam pri, case iudicabit te sea. </p> <p> </p> <p> CU CONGUE IRIURE SCAEVOLA -- UT DOMING IRACUNDIA. </p> <p> DICO TEMPOR HABEMUS - PART II, 123 </p> <p>Homero everti ei nam. An liber euripidis vis, pericula persecuti deseruisse ad mea. Dicant offendit sea et, per esse timeam deserunt ut. In pri enim sadipscing, ei movet soleat suavitate vim. Mea et omnesque phaedrum, paulo luptatum concludaturque vim ea. -- LIBER. </p>
I want to apply the class to:
<p class="bold"> USU EA EUISMOD HONESTATIS DETERRUISSET.</p> <p class="bold"> CU CONGUE IRIURE SCAEVOLA -- UT DOMING IRACUNDIA. </p> <p class="bold"> DICO TEMPOR HABEMUS - PART II, 123 </p>
I need a Unicode solution for Cyrillic text. This does not work:
Find what: <p(>\W*?[[:upper:]][[:upper:]\W]*?</p>)
Replace with: <p class="bold"\1
Geany uses the GLib regex library, whose syntax is described at https://developer.gnome.org/glib/stable/glib-regex-syntax.html
Cheers, Lex
2016-07-31 22:03 GMT+10:00 Vesta laguna-mc@mail.com:
[…]
_______________________________________________
Users mailing list Users@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/users
Can anyone show what the regular expression should look like for this particular case?
This does not work either:
<p(>\W*?[[p{Lu}]][[p{Lu}]\W]*?</p>)
Regards, Vesta
Sent: Sunday, July 31, 2016 at 3:32 PM From: "Lex Trotman" elextr@gmail.com To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Regular expression, for Unicode characters
[…]
Regular Expressions are a tad difficult to master.
Basic question: you're using lazy modifiers on purpose, right? Just checking.
So, a dissection. The regex engine (I don't know which one you're using) should hit \W*? and look for as few non-word characters as possible (in some instances zero). Then it will look for ONE character in the character class [p{Lu}] (Unicode?). Then it will look for zero or more instances of [p{Lu}] or a non-word character. This continues until it gets to the closing tag. Since you're only looking for a single capital letter, why not try:
<p(>.*?[[p{Lu}]].*?</p>)
Or better yet, since you're only replacing the p tag with p class="bold" why not just capture the initial p tag:
(<p>).*?[[p{Lu}]].*?</p>
Hope that gives you some starting ideas.
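The lazy-versus-greedy behaviour described in the dissection can be checked with a small Python sketch. This is an illustration only: Python's `re` is not the engine Geany uses, `[A-Z]` is an ASCII-only stand-in for an uppercase class, and the sample string and variable names are made up for the example.

```python
import re

sample = "<p>  -- ABC</p>"

# With nothing required afterwards, the lazy form is satisfied with zero
# characters, while the greedy form takes the whole run of non-word characters.
lazy_alone = re.match(r"<p>(\W*?)", sample).group(1)
greedy_alone = re.match(r"<p>(\W*)", sample).group(1)

# Once an uppercase letter must follow, both forms have to consume the
# same run of non-word characters in order to reach it.
lazy_then_upper = re.match(r"<p>(\W*?)[A-Z]", sample).group(1)
greedy_then_upper = re.match(r"<p>(\W*)[A-Z]", sample).group(1)

print(repr(lazy_alone), repr(greedy_alone))
print(repr(lazy_then_upper), repr(greedy_then_upper))
```

So the lazy modifier only changes the result when the rest of the pattern could already match after a shorter run.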
On 07/31/2016 08:19 AM, Vesta wrote:
[…]
<p(>\W*?[[p{Lu}]][[p{Lu}]\W]*?</p>)
I found this regex for Unicode (Perl) somewhere and tried to modify it, but it does not work.
I have Geany 1.23.1. I browsed its regex syntax documentation, but there are no examples.
The text I want to parse has multiple spaces inside the paragraph tags. Sometimes the upper-case text inside a paragraph is mixed with lower-case characters or words; those paragraphs need to be omitted. So we need to match, and apply the bold class to, only paragraphs containing all upper-case text, as in my examples.
I tried both regexes, but they do not work:
<p(>.*?[[p{Lu}]].*?</p>)
(<p>).*?[[p{Lu}]].*?</p>
Vesta
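One possible reason none of these patterns matches, assuming they were typed exactly as shown: in PCRE-style syntax a Unicode property is written with a backslash (\p{Lu}), so [[p{Lu}]] without it is just a set of literal characters followed by a literal "]". The Python sketch below only illustrates that parsing; Python's `re` is not GLib, it has no \p{Lu} at all, and the variable names are hypothetical.

```python
import re

# Without a backslash, "[[p{Lu}]]" is a set of the literal characters
# [ p { L u } followed by a literal "]". The inner "[" is escaped here only
# so that Python's `re` does not warn about a nested set; the meaning is
# the same.
literal_set = re.compile(r"[\[p{Lu}]\]")

heading = "<p> USU EA EUISMOD HONESTATIS DETERRUISSET.</p>"

# Nothing in the heading is one of those characters followed by "]".
no_match = literal_set.search(heading)

# A string that does contain a set member followed by "]" matches.
odd_match = literal_set.search("see item u] here")

print(no_match)
print(odd_match.group(0))
```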
Sent: Tuesday, August 02, 2016 at 12:03 PM From: "James Ginns" starvagrant@yahoo.com To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Regular expression, for Unicode characters
[…]
Hmm. Could you be more specific then? When you say it doesn't work, what kinds of lines is it missing and what kinds of lines is it catching? You could use a tool like regexpal to see what is and isn't matching. From the lack of descriptiveness in your message you might have just forgotten a semicolon for all anyone knows.
On 08/02/2016 06:59 AM, Vesta wrote:
[…]
I don't know why it does not work. Both regexes simply don't match anything. Screenshots are below.
https://s31.postimg.org/myq22vtln/Screenshot_from_2016_08_02_23_17_55.png
https://s32.postimg.org/ktjn7ywp1/Screenshot_from_2016_08_02_23_19_15.png
Sent: Tuesday, August 02, 2016 at 4:58 PM From: "James Ginns" starvagrant@yahoo.com To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Regular expression, for Unicode characters
[…]
The text has multiple whitespace characters between words within <p> </p> and <h2> </h2> tags.
How can I find runs of multiple whitespace characters and replace them with a single space?
Regards, Alex
On 05/08/2016 at 14:10, Vesta wrote:
The text has multiple whitespace characters between words within <p> </p> and <h2> </h2> tags.
How can I find runs of multiple whitespace characters and replace them with a single space?
Learn regexes? :) For basic stuff like that they aren't so complex, and they are very powerful. Though here you could also just replace two spaces with one, repeatedly, until there is nothing left to replace.
Regards, Colomban
PS: [[:space:]]+
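Both suggestions can be sketched in Python. This is only an illustration, not what Geany runs: Python's `re` module has no [[:space:]] class, so \s stands in for it, and the function names are made up for the example.

```python
import re


def collapse_by_repeated_replace(text: str) -> str:
    """Replace two spaces with one until no double space is left."""
    while "  " in text:
        text = text.replace("  ", " ")
    return text


def collapse_with_regex(text: str) -> str:
    """Replace every run of whitespace with a single space.

    The \\s class is a stand-in for [[:space:]], so tabs and newlines
    are collapsed as well.
    """
    return re.sub(r"\s+", " ", text)


sample = "<p>first    line   text</p>"
print(collapse_by_repeated_replace(sample))
print(collapse_with_regex(sample))
```

The loop only touches the space character, while the regex version treats any whitespace, including line breaks, as part of a run.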
Thank you for the support. Regex is quite tricky; however, if there is no other way, regex is the only solution.
[[:space:]]+
There is one small issue with this: it also removes the line break between </p> and <p> when a paragraph begins on a new line, i.e.
<p> first line text </p>
<p> second line text </p>
so the paragraphs merge into one line:
<p> first line text </p> <p> second line text </p>
The same happens for headers and paragraphs:
<h1> text </h2>
<p> text </p>
becomes
<h1> text </h2> <p> text </p>
How can I avoid this?
Best Regards, Alex
Sent: Friday, August 05, 2016 at 3:12 PM From: "Colomban Wendling" lists.ban@herbesfolles.org To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Remove Extra Whitespace from text
[…]
On 05/08/2016 at 22:31, Vesta wrote:
[…]
[[:space:]]+
There is one small issue with this: it also removes the space between </p> and <p> when a paragraph begins on a new line, i.e. […]
How can I avoid this?
Don't match newlines. " +" (without the quotes) is likely enough.
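The difference between the two patterns can be seen in a short Python sketch. It is an approximation: \s is used as a stand-in for [[:space:]], which Python's `re` does not support, and the sample text is invented.

```python
import re

text = "<p>first    line</p>\n<p>second   line</p>"

# Stand-in for [[:space:]]+ : \s also matches the newline, so the two
# paragraphs end up on one line.
all_whitespace = re.sub(r"\s+", " ", text)

# " +" only matches runs of the space character, so the newline survives.
spaces_only = re.sub(r" +", " ", text)

print(all_whitespace)
print(spaces_only)
```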
On 31/07/2016 at 15:19, Vesta wrote:
Can anyone show what the regular expression should look like for this particular case?
This will work:
(<p)(>[^[:lower:]]*[[:upper:]][^[:lower:]]*</p>)
It matches anything *but* lowercase characters, then one uppercase character, then anything *but* lowercase characters. Using "not lowercase" is useful to allow punctuation and digits.
If you're interested in supporting uppercase <p> tags, you'll need to make the quantifiers ungreedy too:
(<[pP])(>[^[:lower:]]*?[[:upper:]][^[:lower:]]*?</[pP]>)
Cheers, Colomban
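Why the ungreedy quantifiers matter for uppercase tags can be sketched in Python. With lowercase tags, the "p" in </p> is itself a lowercase character, so the "not lowercase" run cannot cross a closing tag; with </P> it can. The sketch uses ASCII-only stand-ins ([^a-z] and [A-Z]) because Python's `re` has no POSIX bracket classes, so it does not cover Cyrillic, and all names in it are made up.

```python
import re

# ASCII-only stand-ins: [^a-z] for [^[:lower:]], [A-Z] for [[:upper:]].
GREEDY = r"(<[pP])(>[^a-z]*[A-Z][^a-z]*</[pP]>)"
UNGREEDY = r"(<[pP])(>[^a-z]*?[A-Z][^a-z]*?</[pP]>)"

upper_tags = "<P> ONE </P> <P> TWO </P>"
lower_tags = "<p> ONE </p> <p> TWO </p>"

# Greedy quantifiers run across "</P> <P>" because nothing there is lowercase.
greedy_upper = [m.group(0) for m in re.finditer(GREEDY, upper_tags)]

# Ungreedy quantifiers stop at the first closing tag.
ungreedy_upper = [m.group(0) for m in re.finditer(UNGREEDY, upper_tags)]

# With lowercase tags the greedy form is already bounded by the "p" in </p>.
greedy_lower = [m.group(0) for m in re.finditer(GREEDY, lower_tags)]

print(greedy_upper)
print(ungreedy_upper)
print(greedy_lower)
```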
Regex works fine -- Thank you.
B.Regards, Alex
Sent: Tuesday, August 02, 2016 at 3:17 PM From: "Colomban Wendling" lists.ban@herbesfolles.org To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Regular expression, for Unicode characters
[…]
One question: how do I replace <p> with <p class="bold"> in all matched lines?
\1 class="bold"\2
How can I alter this to apply <h2> </h2> tags in place of the <p> </p> tags?
Best Regards, Vesta
Sent: Wednesday, August 03, 2016 at 2:16 AM From: "Colomban Wendling" lists.ban@herbesfolles.org To: "Geany general discussion list" users@lists.geany.org Subject: Re: [Geany-Users] Regular expression, for Unicode characters
On 03/08/2016 at 00:49, Vesta wrote:
One question: how do I replace <p> with <p class="bold"> in all matched lines?
\1 class="bold"\2
or alter the RE to whatever capture you like best
Cheers, Colomban
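Outside Geany, the same two-group replacement can be approximated in Python. Python's `re` supports neither [[:upper:]] nor \p{Lu}, so this sketch swaps in a different technique: it captures each paragraph and tests the body with `str.isupper()` and `str.islower()`, which are Unicode-aware and therefore handle Cyrillic. The function names are hypothetical, and this is not the regex from the thread.

```python
import re


def is_all_uppercase(body: str) -> bool:
    """At least one uppercase letter and no lowercase letters (Unicode-aware)."""
    return any(ch.isupper() for ch in body) and not any(ch.islower() for ch in body)


def bold_uppercase_paragraphs(html: str) -> str:
    """Add class="bold" to <p> tags whose content is all uppercase."""

    def add_class(match):
        # Group 3 is the text between ">" and "</p>".
        if is_all_uppercase(match.group(3)):
            # Same shape as the replacement \1 class="bold"\2
            return match.group(1) + ' class="bold"' + match.group(2)
        return match.group(0)

    # Group 1 is "<p", group 2 is everything from ">" through "</p>".
    return re.sub(r"(<p)(>(.*?)</p>)", add_class, html, flags=re.DOTALL)


sample = "<p> ПРИВЕТ, МИР - 123 </p> <p>Привет, мир</p> <p> </p>"
print(bold_uppercase_paragraphs(sample))
```

Empty paragraphs are left alone because they contain no uppercase letter, and digits and punctuation are allowed because they are neither upper nor lower case.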
On 03/08/2016 at 02:13, Vesta wrote:
\1 class="bold"\2
How can I alter this to apply <h2> </h2> tags in place of the <p> </p> tags?
You should try and understand the regex instead of using it as a mere magic solution.
…
But here you go:
(<p>)([^[:lower:]]*[[:upper:]][^[:lower:]]*)(</p>)
<h2>\2</h2>
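How the three capture groups and the \2 replacement fit together can be tried in Python. As before this is a sketch under assumptions: [^a-z] and [A-Z] are ASCII-only stand-ins for the POSIX classes, which Python's `re` lacks, and the sample text is invented.

```python
import re

# Group 1 is the opening tag, group 2 the all-uppercase body,
# group 3 the closing tag. ASCII-only stand-ins for the POSIX classes.
PATTERN = r"(<p>)([^a-z]*[A-Z][^a-z]*)(</p>)"

# Only the body (group 2) is reused; the tags are written out anew.
REPLACEMENT = r"<h2>\2</h2>"

text = "<p> TITLE 1 </p>\n<p>Body text.</p>"
result = re.sub(PATTERN, REPLACEMENT, text)
print(result)
```

Groups 1 and 3 are captured but never referenced, so the old tags are simply dropped from the output.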