Why surrounding text is the worst feature in the Linux input method world

This is mainly a complaint about how messy this feature is and why no one can use it reliably.

To give some background, surrounding text is a feature that lets an application tell the input method which characters are around the cursor, and lets the input method directly change the text around the cursor.

For example, in an input box, you have some text like this.

With surrounding text, the application is able to notify the input method of the context around the cursor.

For example, in this case, the input method will receive the text “I like typing.”, an anchor of 8, and a cursor of 10. The anchor is the starting offset of the selection, and the cursor is the end of the selection. If there is no selection, the anchor is equal to the cursor.
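To make the anchor and cursor semantics concrete, here is a minimal sketch in Python. The `SurroundingText` class and its method names are made up for illustration and are not part of any toolkit API, and the offsets are assumed to be counted in Unicode code points (which is how Python indexes strings).

```python
from dataclasses import dataclass


@dataclass
class SurroundingText:
    """Hypothetical container for what the application reports."""
    text: str
    anchor: int  # start of the selection, in code points (assumption)
    cursor: int  # end of the selection, in code points (assumption)

    def has_selection(self) -> bool:
        # No selection means anchor and cursor point at the same place.
        return self.anchor != self.cursor

    def selected(self) -> str:
        # The anchor may be after the cursor if the user selected backwards.
        start, end = sorted((self.anchor, self.cursor))
        return self.text[start:end]


st = SurroundingText("I like typing.", anchor=8, cursor=10)
print(st.selected())  # prints "yp"
```

With these numbers the selection covers the two characters between offsets 8 and 10.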

Now you may want to ask, isn’t it a costly thing to do? The answer is YES. Imagine you have a crazy long line in the editor, and whenever the text changes, it needs to be sent over to the input method. Usually, the input method just applies a maximum size.
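One way to apply such a maximum size is to keep only a window of text around the cursor and rebase the offsets. The sketch below is only an illustration: the function name, the default of 64 characters, and the choice to clamp an anchor that falls outside the window are all assumptions, not what any particular input method does.

```python
def clip_surrounding(text, anchor, cursor, max_chars=64):
    """Keep at most max_chars characters around the cursor.

    Returns (clipped_text, new_anchor, new_cursor), with offsets
    counted in code points relative to the clipped text.
    """
    half = max_chars // 2
    start = max(0, cursor - half)
    end = min(len(text), start + max_chars)
    # If we hit the end of the text, extend the window backwards.
    start = max(0, end - max_chars)
    clipped = text[start:end]
    new_cursor = cursor - start
    # Assumption: an anchor outside the window is clamped to its edge,
    # so a very long selection is reported only partially.
    new_anchor = min(max(anchor - start, 0), len(clipped))
    return clipped, new_anchor, new_cursor


long_line = "".join(chr(ord("a") + i % 26) for i in range(1000))
clipped, anchor, cursor = clip_surrounding(long_line, 500, 500)
print(len(clipped), anchor, cursor)  # prints "64 32 32"
```

The trade-off is that the input method never sees text outside the window, so anything that needs more context has to cope with a truncated view.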

Next, we need to talk about the messiest part of this feature: the API. Here is a list of what people are doing with it.

  1. Gtk native API: set_surrounding_text / delete_surrounding_text. Offsets are counted in Unicode (UCS-4) characters. delete_surrounding_text uses (offset, length) to define the range.
  2. Qt native API: offsets are counted in UTF-16 code units. Deleting surrounding text also uses (offset, length) to define the range, but it excludes the currently selected text when applying the offset and length.
  3. Wayland protocol text-input-v1 / zwp_input_method_v1: similar to Qt, but offsets are counted in UTF-8 bytes.
  4. Wayland protocol text-input-v2 / text-input-v3 / zwp_input_method_v2: delete_surrounding_text uses (before, after) to describe the range, which basically means some additional characters before and after the selection. Offsets are also counted in UTF-8 bytes.
  5. Gtk implementation of text-input-v3 (?!): does not follow (4). It just uses the received UTF-8 offset as a Unicode (UCS-4) character offset, which is actually a bug. It also does not actively send updates of the surrounding text, which makes it useless.
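The practical consequence of the list above is that the same position in the same text has a different numeric offset depending on the unit. The following sketch shows the conversions in plain Python. All function names are hypothetical, no real toolkit or protocol call is involved, and the (offset, length) to (before, after) conversion assumes there is no selection, so it does not model the Qt selection-exclusion rule.

```python
def to_utf8_offset(text, char_offset):
    """Code point offset -> UTF-8 byte offset."""
    return len(text[:char_offset].encode("utf-8"))


def to_utf16_offset(text, char_offset):
    """Code point offset -> UTF-16 code unit offset."""
    return len(text[:char_offset].encode("utf-16-le")) // 2


def from_utf8_offset(text, byte_offset):
    """UTF-8 byte offset -> code point offset.

    Raises UnicodeDecodeError if the byte offset falls in the
    middle of a multi-byte character.
    """
    return len(text.encode("utf-8")[:byte_offset].decode("utf-8"))


def offset_length_to_before_after(text, cursor, offset, length):
    """Cursor-relative (offset, length) in code points ->
    (before, after) in UTF-8 bytes around the cursor.

    Assumes no selection, and that the range contains the cursor;
    a (before, after) pair cannot describe any other range.
    """
    start = cursor + offset
    end = start + length
    if not (0 <= start <= cursor <= end <= len(text)):
        raise ValueError("range must contain the cursor and stay inside the text")
    before = len(text[start:cursor].encode("utf-8"))
    after = len(text[cursor:end].encode("utf-8"))
    return before, after


sample = "日本語😀abc"
# The position right after the emoji is code point offset 4.
print(to_utf8_offset(sample, 4))   # prints 13
print(to_utf16_offset(sample, 4))  # prints 5
# Delete the emoji and the "a" that follows it.
print(offset_length_to_before_after(sample, 4, -1, 2))  # prints (4, 1)
```

Treating 13 as a character offset, as in item (5), would point past the end of this 7-character string, which is exactly the kind of bug that shows up as soon as the text is not ASCII.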

Also, people do not seem to have a clear definition of whether surrounding text should include the preedit text, which is a pure headache to deal with.

Not to mention that non-native widgets implemented with Gtk/Qt are very likely to implement it the wrong way. Also, XIM does not support it. And terminal applications that do not support it still have to claim that they support surrounding text, due to the lack of a way to notify the application.

So now, people are more likely to stick to a limited set of surrounding text features.

  1. Use it as auxiliary data, like the primary selection, just to learn what text is being selected.
  2. Delete surrounding text only when it is extremely reliable, e.g. delete 1 character before the cursor.
  3. When implementing a feature that requires full-featured surrounding text, make the feature optional and always provide an easy alternative for users who do not use it.
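As a rough illustration of the "only when it is extremely reliable" rule, an input method could sanity-check the reported surrounding text before asking for a deletion, and fall back to something else when the check fails. This is only a sketch under assumptions: the function name is hypothetical, offsets are in code points, and what the fallback is (for example, not deleting at all) is left to the caller.

```python
def safe_delete_before_cursor(text, anchor, cursor, expected):
    """Return a cursor-relative (offset, length) that deletes `expected`
    right before the cursor, or None if the request looks unreliable.
    """
    if anchor != cursor:
        return None  # a selection is active, ranges get ambiguous
    if not 0 <= cursor <= len(text):
        return None  # the reported cursor is out of range
    n = len(expected)
    if n == 0 or cursor < n:
        return None  # nothing to delete, or not enough text
    if text[cursor - n:cursor] != expected:
        return None  # the text is not what we think we typed
    return (-n, n)


print(safe_delete_before_cursor("I like typing.", 8, 8, "t"))   # prints (-1, 1)
print(safe_delete_before_cursor("I like typing.", 8, 10, "t"))  # prints None
```

Returning `None` instead of guessing keeps the feature optional, in the spirit of item (3).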
