Temporarily add ' (single quote) to WORDCHARS so that words containing a single quote (such as "doesn't" and similar English contractions) are parsed as whole words.
Closes #320. You can view, comment on, or merge this pull request online at:
https://github.com/geany/geany-plugins/pull/322
-- Commit Summary --
* Improve detection of English contractions and other use of single quotes
-- File Changes --
M spellcheck/src/speller.c (21)
-- Patch Links --
https://github.com/geany/geany-plugins/pull/322.patch
https://github.com/geany/geany-plugins/pull/322.diff
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322
g_return_val_if_fail(sc_speller_dict != NULL, 0);
g_return_val_if_fail(doc != NULL, 0);
g_return_val_if_fail(line != NULL, 0);
+ /* add ' (single quote) temporarily to wordchars
+  * to be able to check for "doesn't", "isn't" and similar */
+ wordchars_len = scintilla_send_message(doc->editor->sci, SCI_GETWORDCHARS, 0, 0);
+ wordchars_orig = g_malloc0(wordchars_len + 1);
+ scintilla_send_message(doc->editor->sci, SCI_GETWORDCHARS, 0, (sptr_t)wordchars_orig);
+ if (! strchr(wordchars_orig, '\''))
+ {
+     GString *wordchars_new = g_string_new(wordchars_orig);
Depending on whether it's a hot spot, you could also simply append the `'` to the original string and truncate the last byte afterward, instead of copying the string.
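
For illustration, the append-and-truncate idea could look roughly like the sketch below. It is not the code from the patch: `add_quote_in_place` and `remove_added_quote` are hypothetical helper names, and the sketch assumes the caller allocated the buffer with one spare byte (the patch as quoted allocates `wordchars_len + 1`, so it would need one byte more for this to work).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends a single quote to buf in place unless one
 * is already present. capacity is the total size of buf in bytes; the
 * caller must have reserved one spare byte beyond the string and its
 * terminator. Returns 1 if the quote was appended, 0 otherwise. */
static int add_quote_in_place(char *buf, size_t capacity)
{
	size_t len;

	assert(buf != NULL);
	if (strchr(buf, '\'') != NULL)
		return 0; /* already a word character, nothing to do */
	len = strlen(buf);
	if (len + 2 > capacity)
		return 0; /* no room for the quote: leave buf untouched */
	buf[len] = '\'';
	buf[len + 1] = '\0';
	return 1;
}

/* Hypothetical helper: undoes add_quote_in_place() by truncating the
 * last byte, but only if a quote was actually appended. */
static void remove_added_quote(char *buf, int was_added)
{
	size_t len;

	assert(buf != NULL);
	len = strlen(buf);
	if (was_added && len > 0)
		buf[len - 1] = '\0';
}
```

The return value of `add_quote_in_place` is what the caller would pass back to `remove_added_quote`, so a quote that was already part of the original word characters is never stripped.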
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322/files#r48687409
LGTM, but what if people use `’` and stuff like that? I guess fully correct word boundary recognition requires a clever library like Pango or whatever, and it's tricky; so I guess adding `'` is good enough, at least for now.
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322#issuecomment-168427901
Yeah, nice idea. What exactly do you mean by "whether it's a hot spot"?
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322/files#r48687486
Yes, it will fail when anything other than a single quote is used for contractions or short forms. Though, to my knowledge, that would not be correct English anymore. Also, in English as well as in German, and maybe in other languages too, the single quote is used for things like "John Doe's shoes", and I guess using any character other than a single quote is not valid there.
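
To make the effect on contractions and possessives concrete, here is a minimal standalone sketch. It does not use Scintilla; the `LETTERS` sets and the `word_length` helper are made up for illustration and only mimic the idea of a word-character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative word-character sets; these are assumptions for the
 * example, not the values Scintilla or Geany actually use. */
#define LETTERS "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
#define LETTERS_AND_QUOTE LETTERS "'"

/* Returns the length of the word starting at text, i.e. the longest
 * prefix made up only of bytes listed in wordchars. */
static size_t word_length(const char *text, const char *wordchars)
{
	assert(text != NULL && wordchars != NULL);
	return strspn(text, wordchars);
}
```

With `LETTERS`, `word_length("doesn't work", LETTERS)` stops at the quote and yields 5 ("doesn"), while `LETTERS_AND_QUOTE` yields 7 ("doesn't"). Likewise "Doe's shoes" gives 3 without the quote and 5 with it. One consequence in this sketch is that a word wrapped in single quotes keeps them, so a caller would have to strip leading and trailing quotes separately.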
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322#issuecomment-168428317
I mean whether the code is "hot" (called often), so basically whether optimizing this is worth the (trivial) added subtlety.
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322/files#r48697465
`’` is the typographic [apostrophe](https://en.wikipedia.org/wiki/Apostrophe), and is technically valid (and even more accurate than the typewriter one, `'`). But in practice it's less likely to be encountered in Geany, I guess, because `'` is more common outside typesetting programs.
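
If both forms ever needed to be recognized, one possible approach is a byte-level check like the sketch below. It is only an illustration and not part of this pull request: `apostrophe_length` is a hypothetical helper, and it assumes the text is UTF-8, where the typographic apostrophe U+2019 is encoded as the three bytes E2 80 99.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes of the apostrophe
 * starting at s: 1 for the ASCII apostrophe U+0027, 3 for the
 * typographic apostrophe U+2019 (UTF-8 bytes E2 80 99), or 0 if s
 * does not start with either. s must be NUL-terminated. */
static size_t apostrophe_length(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;

	assert(s != NULL);
	if (p[0] == '\'')
		return 1;
	/* The && chain stops at the first mismatch, so a NUL terminator
	 * is never read past. */
	if (p[0] == 0xE2 && p[1] == 0x80 && p[2] == 0x99)
		return 3;
	return 0;
}
```

A multi-byte character like this cannot be handled by simply appending one byte to a word-character string, which is part of why the single-byte `'` is the simpler case.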
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322#issuecomment-168542742
Merged #322.
--- Reply to this email directly or view it on GitHub: https://github.com/geany/geany-plugins/pull/322#event-506887868
github-comments@lists.geany.org