Over the last few years, I have had issues with application servers running the Tiny spellchecking service using a large amount of CPU and even hanging. In the end, I disabled spellchecking in the Tiny Editors’ config.js:
SharedDirectory/customization/javascript/tiny/editors/connections/config.js
...
// Set to false to disable Tiny's spell checking service in TinyMCE and Textbox.io.
spellingServiceEnabled: false,
...
I worked with HCL and Tiny Support on these issues, and they provided updates during the last year. According to them, the problem should have been fixed as of TinyMCE 5.9.
Now, after updating to the current editor version, TinyMCE 5.10.2, we decided to re-enable the spellchecker, and for the first few days it looked like the issue was really resolved. Sadly, after about a week, the first application server, the one hosting the spelling service, started to use 800% CPU.
In the application server logs, we found messages like:
So first, we see debug messages even though no trace is enabled, and at the top of the image, we see a request that ran for over 1000 ms.
Support sent me the steps to disable the debug messages:
- Create a file called /opt/ephox/logback.xml.
The important part is the log level on line 9 of that file: it is set to DEBUG for TinyMCE 5.10.2, but WARN or ERROR will prevent these log messages.
- Add a custom JVM property (Server > Server Types > WebSphere Application Servers > server name > Process Definition > Java Virtual Machine > Custom Properties) on the application server where you installed the spellchecker:
logback.configurationFile: /opt/ephox/logback.xml
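For orientation, here is a rough sketch of what a logback file with a WARN root level can look like. This is not the file Tiny Support sent me: the appender name, the pattern, and the helper function `write_logback_config` are my own assumptions, and only the general logback layout (a root logger with a level attribute) is standard.

```shell
#!/usr/bin/env bash
# Sketch only: writes a minimal logback configuration whose root level is WARN.
# Appender name and pattern are assumptions, not the file provided by Support.
write_logback_config() {
  local target="$1"
  cat > "$target" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d [%thread] %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <!-- WARN (or ERROR) keeps the verbose request logging out of SystemOut.log -->
  <root level="WARN">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
EOF
}
```

Calling `write_logback_config /opt/ephox/logback.xml` would create the file at the path used in the JVM property above. The application server has to be restarted before the new level is picked up.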
After this, the performance was slightly better, but still not good.
Today, I got the following update from Tiny:
Broadly, we believe that the WinterTree spelling library is having problems with long words with possible hyphens, especially in German. In this case, we recommend trying the Hunspell library instead.
We can see that the problem language is always German, and the number of characters is higher than 20. Due to implementation aspects with how WinterTree’s spelling engine works, these cases can be particularly problematic.
The most egregious offender is:
Took 25270 milliseconds.
Which meant that it took over 25 seconds to generate suggestions for 1 word in a document. As you can imagine, when this starts happening, sending lots of words becomes a problem. However, there aren’t many words that take more than 1 second to generate, because this is the entire list in the logs sent to us.
In general, you could likely avoid this behavior by using Hunspell libraries, particularly for German. Here is our documentation about adding Hunspell dictionaries to Spellchecker Pro. You likely have specific separate instructions for setting up Hunspell, but it will be effectively the same under the hood, as it’s a server-only setting.
https://www.tiny.cloud/docs/tinymce/6/self-hosting-hunspell/
I could stop here and point you to support, but I ran into some issues while activating Hunspell.
First, the webpage says, “Tiny provides two downloadable bundles of Hunspell dictionaries,” which I couldn’t find. So I searched for other download options. The best match was the set of dictionaries included with LibreOffice: https://github.com/libreoffice/dictionaries , but the folder structure and naming do not match what Tiny expects.
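#!/usr/bin/env bash
# Fetch the LibreOffice dictionaries and copy them into the layout Tiny expects:
# /opt/ephox/hunspell-dictionaries/<lang>/<lang>.aff and <lang>.dic
git clone https://github.com/LibreOffice/dictionaries.git /tmp/dictionaries
for i in af_ZA da de_DE en_AU en_CA en_GB en_US es fr hu it_IT nb_NO nl_NL nn pl pt_BR pt_PT sv_FI sv_SE ; do
  mkdir -p "/opt/ephox/hunspell-dictionaries/$i"
  # Quote the patterns so the shell does not expand them before find sees them
  find /tmp/dictionaries -iname "$i*.aff" -exec cp {} "/opt/ephox/hunspell-dictionaries/$i/$i.aff" \;
  find /tmp/dictionaries -iname "$i*.dic" -exec cp {} "/opt/ephox/hunspell-dictionaries/$i/$i.dic" \;
done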
This script creates the expected folder structure and copies the dictionaries to the right place.
tree /opt/ephox/hunspell-dictionaries/
/opt/ephox/hunspell-dictionaries/
├── af_ZA
│ ├── af_ZA.aff
│ └── af_ZA.dic
├── da
│ ├── da.aff
│ └── da.dic
├── de_DE
│ ├── de_DE.aff
│ └── de_DE.dic
├── en_AU
│ ├── en_AU.aff
│ └── en_AU.dic
├── en_CA
│ ├── en_CA.aff
│ └── en_CA.dic
├── en_GB
│ ├── en_GB.aff
│ └── en_GB.dic
├── en_US
│ ├── en_US.aff
│ └── en_US.dic
├── es
│ ├── es.aff
│ └── es.dic
├── fr
│ ├── fr.aff
│ └── fr.dic
├── hu
│ ├── hu.aff
│ └── hu.dic
├── it_IT
│ ├── it_IT.aff
│ └── it_IT.dic
├── nb_NO
│ ├── nb_NO.aff
│ └── nb_NO.dic
├── nl_NL
│ ├── nl_NL.aff
│ └── nl_NL.dic
├── nn
│ ├── nn.aff
│ └── nn.dic
├── pl
│ ├── pl.aff
│ └── pl.dic
├── pt_BR
│ ├── pt_BR.aff
│ └── pt_BR.dic
├── pt_PT
│ ├── pt_PT.aff
│ └── pt_PT.dic
├── sv_FI
│ ├── sv_FI.aff
│ └── sv_FI.dic
└── sv_SE
├── sv_SE.aff
└── sv_SE.dic
19 directories, 38 files
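Because find silently copies nothing when a pattern does not match, it is worth checking that every language folder really contains both files. A small sketch for that check, where the function name `check_hunspell_dirs` is just my own helper and not part of any Tiny tooling:

```shell
#!/usr/bin/env bash
# Print every <lang>/<lang>.aff or <lang>/<lang>.dic that is missing or empty.
# Returns non-zero if at least one file is missing.
check_hunspell_dirs() {
  local root="$1" missing=0 dir lang ext
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # no subfolders at all
    lang="$(basename "$dir")"
    for ext in aff dic; do
      if [ ! -s "$dir$lang.$ext" ]; then
        echo "missing or empty: $dir$lang.$ext"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

Running `check_hunspell_dirs /opt/ephox/hunspell-dictionaries` should print nothing and return 0 when the structure matches the tree output above.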
Now we have to enable the Hunspell dictionaries in /opt/ephox/application.conf and restart the spellchecking service.
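As a sketch of the setting involved: to my understanding of the Tiny documentation linked above, the dictionary folder is configured with a `hunspell-dictionary-path` key inside `ephox { spelling { ... } }`. Treat the key name as an assumption and verify it against the documentation for your version. The helper `print_hunspell_conf` below is hypothetical and only prints the fragment, so that it can be merged by hand into an existing application.conf instead of overwriting it.

```shell
#!/usr/bin/env bash
# Print a HOCON fragment pointing the spelling service at a Hunspell folder.
# Assumption: the key is ephox.spelling.hunspell-dictionary-path (check Tiny's docs).
print_hunspell_conf() {
  local dict_dir="$1"
  cat <<EOF
ephox {
  spelling {
    hunspell-dictionary-path = "$dict_dir"
  }
}
EOF
}
```

For example, `print_hunspell_conf /opt/ephox/hunspell-dictionaries` prints the block for the folder created by the script above. If application.conf already has an `ephox { ... }` section, the `spelling` part belongs inside it.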
Don’t forget to re-enable spell checking in SharedDirectory/customization/javascript/tiny/editors/connections/config.js:
...
// Set to false to disable Tiny's spell checking service in TinyMCE and Textbox.io.
spellingServiceEnabled: true,
...
Results
I tested with WinterTree (default) and Hunspell.
Testing some long words with WinterTree
[7/12/22 17:35:35:152 UTC] 00000132 SystemOut O 2022-07-12 17:35:35.152Z [ioapp-compute-1] INFO ironbark - request [ uuid-47ac0625-f6dc-4876-8127-59b50595cd0f ] Response => Status: 200 OK (12 ms)
[7/12/22 17:35:35:212 UTC] 00000139 SystemOut O 2022-07-12 17:35:35.212Z [ioapp-compute-4] DEBUG ironbark - request [ uuid-ac3ac5bf-eb12-4e72-a98b-a9c93f288093 ] Spellall (100.0 % - 1 / 1 incorrect)
[7/12/22 17:35:35:212 UTC] 00000139 SystemOut O 2022-07-12 17:35:35.212Z [ioapp-compute-4] DEBUG ironbark - request [ uuid-ac3ac5bf-eb12-4e72-a98b-a9c93f288093 ] Spellall (1 words) (BEGIN)
[7/12/22 17:35:38:865 UTC] 00000139 SystemOut O 2022-07-12 17:35:38.865Z [ioapp-compute-4] WARN ironbark -
request [ uuid-ac3ac5bf-eb12-4e72-a98b-a9c93f288093 ] PERFORMANCE_ALERT: word took longer than 1000 milliseconds. Took 3652 milliseconds.
* Language: de
* Number of characters: 48
* Number of hyphens: 0
* Number of apostrophes: 0
* Number of suggestions generated: 16
[7/12/22 17:35:38:865 UTC] 00000139 SystemOut O 2022-07-12 17:35:38.865Z [ioapp-compute-4] DEBUG ironbark - request [ uuid-ac3ac5bf-eb12-4e72-a98b-a9c93f288093 ] Spellall (1 words) (END)
[7/12/22 17:35:38:866 UTC] 00000139 SystemOut O 2022-07-12 17:35:38.866Z [ioapp-compute-4] INFO ironbark - request [ uuid-9347efc7-7705-4bcb-911c-1506d1d3b90a ] Response => Status: 200 OK (3726 ms)
We see that the single word took 3,652 ms (3,726 ms for the whole request) and that the word was 48 characters long.
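To find all such words in a larger log, the PERFORMANCE_ALERT entries can be pulled out with awk. This is a minimal sketch that assumes the alert format shown above (the "Took N milliseconds" line followed by the "* Language" and "* Number of characters" lines); the function name `slow_words` is my own.

```shell
#!/usr/bin/env bash
# List PERFORMANCE_ALERT entries as "<milliseconds> <language> <characters>",
# slowest first. Assumes the multi-line alert format shown in the log above.
slow_words() {
  awk '
    /PERFORMANCE_ALERT/ {
      ms = ""
      if (match($0, /Took [0-9]+ milliseconds/)) {
        split(substr($0, RSTART, RLENGTH), parts, " ")
        ms = parts[2]
      }
      next
    }
    /^[[:space:]]*\* Language:/ { lang = $3 }
    /^[[:space:]]*\* Number of characters:/ {
      if (ms != "") print ms, lang, $5
      ms = ""; lang = ""
    }
  ' "$1" | sort -rn
}
```

Usage would be `slow_words SystemOut.log`, with the path to the log of the application server that hosts the spelling service.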
Testing the same with Hunspell enabled
[7/12/22 20:10:12:798 UTC] 00000134 SystemOut O 2022-07-12 20:10:12.798Z [ioapp-compute-4] DEBUG ironbark - request [ uuid-0958072a-bf4c-4cb6-8acd-e2e7e8fb2870 ] Spellall (7 words) (BEGIN)
[7/12/22 20:10:12:798 UTC] 00000134 SystemOut O 2022-07-12 20:10:12.798Z [ioapp-compute-4] DEBUG ironbark - request [ uuid-0958072a-bf4c-4cb6-8acd-e2e7e8fb2870 ] Spellall (7 words) (END)
[7/12/22 20:10:12:800 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.800Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-199c0261-36e8-4173-807a-13a4a8ebce6b ] Spellall (0.0 % - 0 / 1 incorrect)
[7/12/22 20:10:12:801 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.800Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-199c0261-36e8-4173-807a-13a4a8ebce6b ] Spellall (1 words) (BEGIN)
[7/12/22 20:10:12:801 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.801Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-199c0261-36e8-4173-807a-13a4a8ebce6b ] Spellall (1 words) (END)
[7/12/22 20:10:12:801 UTC] 00000134 SystemOut O 2022-07-12 20:10:12.801Z [ioapp-compute-4] INFO ironbark - request [ uuid-4655f8a9-a466-4ad4-8874-d91f5fc8fc9b ] Response => Status: 200 OK (18 ms)
[7/12/22 20:10:12:801 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.801Z [ioapp-compute-1] INFO ironbark - request [ uuid-9117e582-90f5-4246-bd17-56d00c12b975 ] Request => POST /tiny-spelling/2/suggestions
[7/12/22 20:10:12:803 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.803Z [ioapp-compute-2] INFO ironbark - request [ uuid-938f2c71-701b-4312-9183-426b81829297 ] Response => Status: 200 OK (16 ms)
[7/12/22 20:10:12:803 UTC] 00000133 SystemOut O 2022-07-12 20:10:12.803Z [ioapp-compute-3] INFO ironbark - request [ uuid-67b5ef69-2079-43eb-908f-bd0017f715e2 ] Request => POST /tiny-spelling/2/suggestions
[7/12/22 20:10:12:808 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.808Z [ioapp-compute-1] DEBUG ironbark - request [ uuid-202c5596-0fae-4cc4-8793-7feb458b3b0c ] Incoming suggestions-V2 request for: 1 word(s) in language: de from API Key: none
[7/12/22 20:10:12:811 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.811Z [ioapp-compute-1] DEBUG ironbark - request [ uuid-202c5596-0fae-4cc4-8793-7feb458b3b0c ] Spellall (0.0 % - 0 / 1 incorrect)
[7/12/22 20:10:12:811 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.811Z [ioapp-compute-1] DEBUG ironbark - request [ uuid-202c5596-0fae-4cc4-8793-7feb458b3b0c ] Spellall (1 words) (BEGIN)
[7/12/22 20:10:12:812 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.811Z [ioapp-compute-1] DEBUG ironbark - request [ uuid-202c5596-0fae-4cc4-8793-7feb458b3b0c ] Spellall (1 words) (END)
[7/12/22 20:10:12:814 UTC] 0000012f SystemOut O 2022-07-12 20:10:12.814Z [ioapp-compute-1] INFO ironbark - request [ uuid-9117e582-90f5-4246-bd17-56d00c12b975 ] Response => Status: 200 OK (13 ms)
[7/12/22 20:10:12:817 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.817Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-8f7cc6c4-60f7-4535-ad4a-04a49aa4b389 ] Incoming suggestions-V2 request for: 1 word(s) in language: de from API Key: none
[7/12/22 20:10:12:819 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.819Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-8f7cc6c4-60f7-4535-ad4a-04a49aa4b389 ] Spellall (0.0 % - 0 / 1 incorrect)
[7/12/22 20:10:12:819 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.819Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-8f7cc6c4-60f7-4535-ad4a-04a49aa4b389 ] Spellall (1 words) (BEGIN)
[7/12/22 20:10:12:819 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.819Z [ioapp-compute-2] DEBUG ironbark - request [ uuid-8f7cc6c4-60f7-4535-ad4a-04a49aa4b389 ] Spellall (1 words) (END)
[7/12/22 20:10:12:822 UTC] 00000132 SystemOut O 2022-07-12 20:10:12.821Z [ioapp-compute-2] INFO ironbark - request [ uuid-67b5ef69-2079-43eb-908f-bd0017f715e2 ] Response => Status: 200 OK (18 ms)
[7/12/22 20:10:12:854 UTC] 00000133 SystemOut O 2022-07-12 20:10:12.854Z [ioapp-compute-3] INFO ironbark - request [ uuid-0b4d740e-a06b-4a1a-a75e-e6a680a2d41d ] Request => POST /tiny-spelling/2/suggestions
[7/12/22 20:10:12:860 UTC] 00000135 SystemOut O 2022-07-12 20:10:12.860Z [ioapp-compute-5] DEBUG ironbark - request [ uuid-15c6b13d-a3f7-4fbb-8b43-6bf6a6074b26 ] Incoming suggestions-V2 request for: 1 word(s) in language: de from API Key: none
[7/12/22 20:10:12:862 UTC] 00000135 SystemOut O 2022-07-12 20:10:12.862Z [ioapp-compute-5] DEBUG ironbark - request [ uuid-15c6b13d-a3f7-4fbb-8b43-6bf6a6074b26 ] Spellall (0.0 % - 0 / 1 incorrect)
[7/12/22 20:10:12:862 UTC] 00000135 SystemOut O 2022-07-12 20:10:12.862Z [ioapp-compute-5] DEBUG ironbark - request [ uuid-15c6b13d-a3f7-4fbb-8b43-6bf6a6074b26 ] Spellall (1 words) (BEGIN)
[7/12/22 20:10:12:862 UTC] 00000135 SystemOut O 2022-07-12 20:10:12.862Z [ioapp-compute-5] DEBUG ironbark - request [ uuid-15c6b13d-a3f7-4fbb-8b43-6bf6a6074b26 ] Spellall (1 words) (END)
[7/12/22 20:10:12:864 UTC] 00000135 SystemOut O 2022-07-12 20:10:12.864Z [ioapp-compute-5] INFO ironbark - request [ uuid-0b4d740e-a06b-4a1a-a75e-e6a680a2d41d ] Response => Status: 200 OK (10 ms)
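Instead of reading the response times line by line, they can be summarized. A small sketch, assuming the "Response => Status: ... (N ms)" format from the logs above; `response_stats` is a helper name I made up.

```shell
#!/usr/bin/env bash
# Summarize the response times logged as "Response => Status: 200 OK (N ms)".
response_stats() {
  sed -n 's/.*Response => Status: [0-9]* [A-Za-z ]*(\([0-9]*\) ms).*/\1/p' "$1" |
    awk '
      { n++; sum += $1; if ($1 > max) max = $1 }
      END { if (n) printf "requests=%d max_ms=%d avg_ms=%.1f\n", n, max, sum / n }
    '
}
```

Running it once on a log from the WinterTree test and once on a log from the Hunspell test gives two lines that are easy to compare.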
So for German spellchecking, it appears that Hunspell works faster and gives suggestions even for long words. No high CPU usage or waiting messages have appeared so far. I never thought about these long German words until I read the answer from Tiny Support. If your users write documents in Connections in German, I would suggest you change the spellchecker too.