As a Linux application developer, you might not be aware that supporting input methods (also called Input Method Editors, usually abbreviated as IME) on Linux can require some effort.
What is an input method and why should I care about it?
Even if you are not aware of it, you are probably already using one in daily life. For example, the virtual keyboard on your smartphone is a form of input method. You may have noticed that the virtual keyboard lets you type something and offers a list of words based on what you have partially typed. That is a very simple use case of an input method. For CJKV (Chinese, Japanese, Korean, Vietnamese) users, however, an input method is necessary to type their own language properly. Imagine this: a physical keyboard has only 26 English letter keys, so how could you type thousands of different Chinese characters with it? The answer is to use a mapping from a sequence of keys to certain characters. To make it easy to memorize, such a mapping is usually similar to what is called transliteration, or it directly uses an existing romanization system.
For example, the most popular way for typing Chinese is Hanyu Pinyin.
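To make the idea of a key-sequence mapping concrete, here is a minimal sketch. The table, the function names, and the "first candidate wins" default are all made up for illustration; a real pinyin engine uses a large dictionary and ranking, which this does not attempt.

```python
# Toy pinyin table: each romanized syllable maps to several candidate characters.
# The table is tiny and hand-written, purely for illustration.
PINYIN_TABLE = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
}


def candidates(keys):
    """Return the candidate characters for a typed key sequence (empty if unknown)."""
    return PINYIN_TABLE.get(keys.lower(), [])


def commit(keys, choice=0):
    """Pick one candidate; fall back to the raw keys when nothing matches."""
    options = candidates(keys)
    return options[choice] if options else keys


if __name__ == "__main__":
    # Typing n-i then h-a-o and accepting the first candidate each time.
    print(commit("ni") + commit("hao"))
```

The list returned by candidates() is what an input method would show in its popup, and the user's selection is what finally reaches the application.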
What do I need to do to support input methods?
The state-of-the-art input method frameworks on Linux are all client-server based. The client is your application, and the server is the input method server. Usually there is also a third daemon process that works as a broker, passing messages between the application and the input method server.
1. Which GUI toolkit to use?
Gtk & Qt
If you are using Gtk or Qt, there is good news for you: there is usually nothing you need to do to support input methods. These toolkits provide a generic abstraction, and sometimes even an extensible plugin system (as in Gtk and Qt), that hides all the complexity of the communication between the input method server and the application.
The built-in widgets provided by Gtk or Qt already handle everything needed for input methods. Unless you are implementing a fully custom widget, you do not need to use any input method API. If you do need a custom widget, which sometimes happens, you can use the API provided by the toolkit to implement it.
Here are some pointers to the toolkit API:
Gtk: gtk_im_multicontext_new, GtkIMContext
Qt: https://doc.qt.io/qt-6/qinputmethod.html https://doc.qt.io/qt-6/qinputmethodevent.html
The best documentation on how to use these APIs is the implementation of the built-in widgets.
SDL & winit
If you are using SDL or Rust’s winit, which do have some input method support but lack built-in widgets (there might be third-party libraries based on them, which I have no knowledge of), you will need to do some manual work with their IME APIs, using their documentation or demos as a guide.
Refer to their official documentation and examples for reference:
Xlib & XCB
Xlib has built-in XIM protocol support, which you may access via Xlib APIs. I found a good article about how to add input method support with Xlib at:
As for XCB, you will need to use a third-party library. I wrote one for XCB for both server and client side XIM. If you need a demo of it, you can find one at:
Someone also wrote a Rust binding for it, which is used by wezterm, a real-world project. Some demo code can be found at:
If you are writing a native Wayland application from scratch with wayland-client, you will first want to pick a client-side input method protocol. The only commonly well-supported one (GNOME, KWin, wlroots, etc., but not Weston, just FYI) is:
2. How to write one with the APIs above?
If you use a toolkit whose widgets already support input methods well, you can skip this and call it a day. But if you need low-level interaction with the input method, or are just interested in how this works, read on. Usually it involves the following steps:
- Create a connection to the input method service.
- Tell the input method that you want to communicate with it.
- Keyboard events are forwarded to the input method.
- The input method decides how each key event is handled.
- Receive input method events that carry text you need to show or commit to the application.
- Tell the input method you are done with text input.
- Close the connection when your application ends or the relevant widget is destroyed.
The first step sometimes consists of two steps: (a) create the connection, and (b) create a server-side object that represents a micro focus of your application. This object is usually referred to as an “Input Context”. The toolkit may hide this complexity behind its own API.
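The steps above can be sketched as a small toy model. Nothing here is a real API: ToyInputMethod, ToyInputContext, and the "space commits the upper-cased keys" rule are hypothetical stand-ins, chosen only to show the order of the calls and who consumes each key.

```python
class ToyInputMethod:
    """Hypothetical stand-in for an input method server (not a real API)."""

    def __init__(self):
        self.focused = None          # the input context that currently has focus

    def create_context(self, on_commit):
        # Step 1b: create the server-side "input context" for one text widget.
        return ToyInputContext(self, on_commit)


class ToyInputContext:
    def __init__(self, server, on_commit):
        self.server = server
        self.on_commit = on_commit   # application callback that receives committed text
        self.composing = ""          # keys currently held back by the input method

    def focus_in(self):
        # Step 2: tell the input method this context wants text input.
        self.server.focused = self

    def process_key(self, key):
        # Steps 3-5: forward the key; return True if the input method consumed it.
        if self.server.focused is not self:
            return False
        if len(key) == 1 and key.isalpha():
            self.composing += key
            return True
        if key == " " and self.composing:
            self.on_commit(self.composing.upper())   # toy "conversion"
            self.composing = ""
            return True
        return False                 # unhandled keys go back to the application

    def focus_out(self):
        # Step 6: text input is over for now; drop any unfinished composition.
        self.composing = ""
        if self.server.focused is self:
            self.server.focused = None

    def destroy(self):
        # Step 7: release the context when the widget goes away.
        self.focus_out()


if __name__ == "__main__":
    committed = []
    im = ToyInputMethod()                        # step 1a: "connect"
    ctx = im.create_context(committed.append)
    ctx.focus_in()
    for key in ["n", "i", " "]:
        ctx.process_key(key)
    ctx.destroy()
    print(committed)
```

The important part is the return value of process_key(): when it is True the application must not handle the key itself, which is the same contract a real filter call gives you.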
Take Xlib as an example:
- Create the connection: XOpenIM
- Create the input context: XCreateIC
- Tell the input method that your application wants to use text input with it: XSetICFocus
- Forward key events to the input method: XFilterEvent
- Get committed text: Xutf8LookupString (or XmbLookupString / XwcLookupString; plain XLookupString does not take an input context)
- When your widget/window loses focus: XUnsetICFocus
- Clean up: XDestroyIC, XCloseIM.
Take wayland-client + text-input-v3 as an example:
- Get the global singleton object from the registry: zwp_text_input_manager_v3
- Call zwp_text_input_manager_v3.get_text_input
- Call zwp_text_input_v3.enable
- Key events are forwarded to the input method by the compositor; nothing related to keyboard events needs to be done on the client side.
- Get committed text: zwp_text_input_v3.commit_string
- Call zwp_text_input_v3.disable
- Destroy the relevant Wayland proxy objects.
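One detail that is easy to miss on the receiving side: as far as I understand the text-input-v3 protocol, preedit_string and commit_string only update pending state, and the client applies it when the done event arrives. Below is a rough sketch of that pattern. It is not real Wayland binding code; the class and the on_* handler names are hypothetical, and delete_surrounding_text and serial checking are left out.

```python
class ToyTextInputV3Handler:
    """Toy receiver for text-input-v3 style events (not real Wayland bindings)."""

    def __init__(self):
        self.text = ""                 # committed text owned by the widget
        self.preedit = ""              # composing text currently displayed
        self._pending_commit = ""
        self._pending_preedit = ""

    def on_preedit_string(self, text):
        # Only record the new preedit; nothing is shown yet.
        self._pending_preedit = text or ""

    def on_commit_string(self, text):
        # Only record the text to commit; nothing is inserted yet.
        self._pending_commit = text or ""

    def on_done(self):
        # Apply the pending state in one go: commit first, then the new preedit.
        self.text += self._pending_commit
        self.preedit = self._pending_preedit
        # Pending state goes back to its initial (empty) values.
        self._pending_commit = ""
        self._pending_preedit = ""


if __name__ == "__main__":
    handler = ToyTextInputV3Handler()
    handler.on_preedit_string("ni")
    handler.on_done()                  # widget now shows the preedit "ni"
    handler.on_commit_string("你")
    handler.on_done()                  # preedit is gone, "你" is committed
    print(handler.text, repr(handler.preedit))
```

Applying everything on done keeps the widget from flickering through half-updated states when a commit and a new preedit arrive together.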
And as always, read the examples provided by the toolkit to get a better idea.
3. Some other concepts besides committing text
Supporting input methods is not only about forwarding key events and getting text from the input method. Some further interaction between the application and the input method is important for a better user experience.
Preedit is a piece of text displayed by the application that represents the composing state. In the screenshot at the beginning of this article, the “underline” text is the “preedit”. Preedit contains the text and, optionally, some formatting information to show richer detail.
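As a rough illustration of how a widget might draw preedit, the sketch below treats it as a list of (text, style) segments inserted at the cursor. The segment format, the style names, and the bracket notation standing in for underline and highlight are all my own assumptions for a plain-text demo, not any toolkit's representation.

```python
# Plain-text stand-ins for visual styles: [..] for underline, {..} for highlight.
STYLE_MARKS = {"underline": ("[", "]"), "highlight": ("{", "}")}


def render(committed, cursor, preedit_segments):
    """Return the line as displayed: committed text, preedit at the cursor, then '|'."""
    shown = ""
    for text, style in preedit_segments:
        opening, closing = STYLE_MARKS.get(style, ("", ""))
        shown += opening + text + closing
    # The preedit is drawn inside the text but is not part of the committed content.
    return committed[:cursor] + shown + "|" + committed[cursor:]


if __name__ == "__main__":
    print(render("你", 1, [("hao", "underline")]))
    print(render("你", 1, []))
```

Note that the committed string never changes while composing; only the rendering does, which is why the preedit can be thrown away or replaced at any time.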
Surrounding text is optional information that the application can provide to the input method. It contains the text around the cursor, along with where the cursor and the user selection are. The input method may use this information to provide better predictions. For example, suppose your text box has “I love |” ( | is the cursor). With surrounding text, the input method knows that there is already “I love ” in the box and may predict your next word as “you”, so you don’t need to type “y-o-u” but can just select it from the predictions.
Surrounding text is not supported by XIM. Also, not all applications can provide valid surrounding text information; terminal apps, for example, cannot.
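Here is a small sketch of both sides of that exchange, using the “I love |” example. The function names, the 20-character window, and the one-entry prediction table are hypothetical. Offsets here are Python character indices; real protocols may count bytes instead, so treat that as a simplification.

```python
def surrounding_text(text, cursor, anchor=None, window=20):
    """Application side: return (snippet, cursor, anchor) trimmed around the cursor."""
    if anchor is None:
        anchor = cursor                       # no selection: anchor equals cursor
    start = max(0, min(cursor, anchor) - window)
    end = min(len(text), max(cursor, anchor) + window)
    # Offsets are re-based so they index into the snippet, not the full text.
    return text[start:end], cursor - start, anchor - start


# Toy prediction table, purely for illustration.
NEXT_WORD = {"love": ["you"]}


def predict(snippet, cursor):
    """Input method side: guess the next word from the word before the cursor."""
    words = snippet[:cursor].rstrip().split(" ")
    return NEXT_WORD.get(words[-1], [])


if __name__ == "__main__":
    snippet, cursor, anchor = surrounding_text("I love ", 7)
    print(predict(snippet, cursor))
```

Sending only a window around the cursor, rather than the whole document, keeps the messages small and avoids leaking more text than the input method needs.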
Reporting cursor position on the window
Many input method engines need to show a popup window to display some information. To let the input method place that window right at the position of the (blinking) cursor, the application needs to tell the input method where the cursor is.
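A minimal sketch of what "where the cursor is" can mean in practice: a rectangle in window coordinates. The fixed-width font metrics, the Rect type, and the placement rule in popup_position() are assumptions made for this example; the second function only imitates what an input method might do with the rectangle it receives.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: int
    y: int
    width: int
    height: int


def cursor_rect(widget_x, widget_y, column, line, char_width=8, line_height=16):
    """Application side: cursor rectangle in window coordinates (fixed-width font)."""
    return Rect(widget_x + column * char_width,
                widget_y + line * line_height,
                1,
                line_height)


def popup_position(rect, popup_height, window_height):
    """Input method side (toy rule): show the popup below the cursor, else above it."""
    below = rect.y + rect.height
    if below + popup_height <= window_height:
        return rect.x, below
    return rect.x, rect.y - popup_height


if __name__ == "__main__":
    rect = cursor_rect(widget_x=10, widget_y=20, column=5, line=2)
    print(rect, popup_position(rect, popup_height=100, window_height=600))
```

Reporting a rectangle rather than a single point matters because the popup has to avoid covering the line being typed, so it needs the cursor height as well.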
Notifying state changes that happen on the application side
For example, even if the user is in the middle of composing something, they may still click somewhere else in the text box with the mouse, or the text content may be changed programmatically by the app’s own logic. When such things happen, the application may need to notify the input method that the state needs a “reset”. Usually this is also called “reset” in the relevant API.
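The sketch below shows one way a custom text field could trigger such a reset. ToyTextField and its reset callback are hypothetical, and discarding the preedit is just one possible policy; some applications commit the unfinished text instead.

```python
class ToyTextField:
    """Toy text field that asks the input method to reset when composing is interrupted."""

    def __init__(self, reset_input_method):
        self.text = ""
        self.cursor = 0
        self.preedit = ""
        self._reset_input_method = reset_input_method   # stands in for the API's "reset"

    def set_preedit(self, preedit):
        self.preedit = preedit

    def click(self, position):
        # The user moved the cursor with the mouse in the middle of composing.
        self._abandon_composition()
        self.cursor = max(0, min(position, len(self.text)))

    def set_text(self, text):
        # The application changed the content programmatically.
        self._abandon_composition()
        self.text = text
        self.cursor = len(text)

    def _abandon_composition(self):
        # Only bother the input method if there is composing state to throw away.
        if self.preedit:
            self.preedit = ""
            self._reset_input_method()


if __name__ == "__main__":
    calls = []
    field = ToyTextField(lambda: calls.append("reset"))
    field.set_text("hello")
    field.set_preedit("ni")
    field.click(2)
    print(calls, field.cursor, repr(field.preedit))
```

Without the reset, the input method would still believe “ni” is being composed and could later commit text at a position the user has already left.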
What about Wayland and not needing to handle key events? As I understand it, when an input method connection is requested, the Wayland compositor takes care of everything and sends the results/processed events to the application. But what if we make a game and need to handle w a s d? I see: we register with the input method server when the user needs to type text, and unregister when the user is expected to play the game. It is somewhat odd.
@Sławomir Lach, you don’t need to consider this if your game doesn’t require text typing.
But for example, if your game supports “chat”, or “allows users to type their own name”, you may want to consider this.
Also, it won’t affect any of your existing code logic. Here’s how it works: the game just remains as is until the user needs to start typing text, e.g. they press Enter and a text box shows up. Then you call an API such as SDL_StartTextInput. Under the text-input model, key events are first forwarded to the compositor, and the input method may start sending you some specialized events; you just need some new code to handle them properly. When the text box is gone, you call SDL_StopTextInput and everything returns to normal. You don’t really want to have this enabled all the time, because it may introduce key delay: key events take a longer trip when the input method is enabled.
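The flow described above can be sketched with a toy event router. This is not SDL code: ToyGame and its methods are hypothetical, and start_text_input() / stop_text_input() merely play the role that SDL_StartTextInput / SDL_StopTextInput play in a real program.

```python
class ToyGame:
    """Toy model: keys drive the game unless a chat box has text input enabled."""

    def __init__(self):
        self.text_input_active = False
        self.moves = []          # w/a/s/d style key presses handled by the game
        self.chat = ""           # text delivered by the input method

    def start_text_input(self):
        self.text_input_active = True

    def stop_text_input(self):
        self.text_input_active = False

    def on_key(self, key):
        if key == "enter":
            # Enter opens the chat box, or closes it again.
            if self.text_input_active:
                self.stop_text_input()
            else:
                self.start_text_input()
        elif not self.text_input_active:
            self.moves.append(key)       # normal game logic, unchanged
        # While text input is active, keys belong to the input method,
        # which answers with on_text() events instead.

    def on_text(self, text):
        if self.text_input_active:
            self.chat += text


if __name__ == "__main__":
    game = ToyGame()
    for key in ["w", "a", "enter", "n"]:
        game.on_key(key)
    game.on_text("你好")
    game.on_key("enter")
    game.on_key("d")
    print(game.moves, game.chat)
```

The existing movement code path is untouched; the only additions are the two toggle calls and a handler for the text events.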
The “request a connection” concept doesn’t really mean you open a socket connection or something like that under all circumstances. Sometimes it just means “initialize a global object”, or acquiring a global object from the compositor.