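Having derived the formula above, why deliberately write it in the form log1p(exp10(a-b))? Because the standard library provides such a function, log1p10exp, which computes this value in one step. For better precision, you can choose between a - b and b - a depending on which of a and b is larger. Constants such as log(10) (natural logarithm) can be precomputed, which reduces the number of math library calls to one.
You may ask where the weights w and (1 - w) from the original formula went. The answer is simple: convert them to logarithms first, add them to a and b respectively, and then do the same computation.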
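The log-space sum described above can be sketched as follows. This is a minimal sketch, assuming that a and b are base-10 log probabilities and that log1p10exp(x) means log10(1 + 10^x). The helper here is written by hand on top of Python's math.log1p and is not taken from libime. The function names add_log10 and interpolate_log10 are made up for illustration.

```python
import math

LOG10_E = 1.0 / math.log(10.0)  # precomputed constant, log10(e)


def log1p10exp(x: float) -> float:
    """Return log10(1 + 10**x). Hand-written helper (assumed meaning)."""
    return math.log1p(10.0 ** x) * LOG10_E


def add_log10(a: float, b: float) -> float:
    """Return log10(10**a + 10**b) without leaving log space."""
    hi, lo = (a, b) if a >= b else (b, a)
    # lo - hi <= 0, so 10**(lo - hi) lies in (0, 1] and cannot overflow.
    return hi + log1p10exp(lo - hi)


def interpolate_log10(a: float, b: float, w: float) -> float:
    """Return log10(w * 10**a + (1 - w) * 10**b) for 0 < w < 1."""
    # The weights are moved into log space and added to a and b first.
    return add_log10(math.log10(w) + a, math.log10(1.0 - w) + b)
```

Subtracting the larger exponent from the smaller one keeps the argument of the power non-positive, which is why the order of a and b matters for precision.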
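libime takes a similar approach here, with some improvements of its own. First, the words the user types are recorded by sentence. Second, the sentences the user types are divided into three pools of different sizes (128, 8192, 65536), and each pool is given a different probability weight. (The specific numbers are purely heuristic.) This keeps a comparatively long history (65536 sentences vs. 1024 words) while also producing a decay effect based on input time.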
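The pool idea can be illustrated with a small sketch. It assumes that a new sentence always enters the smallest pool and that the oldest sentence of a full pool is pushed into the next larger one. That cascading rule, the class names, and the weights (0.6, 0.3, 0.1) are all assumptions for illustration and are not libime's actual code or values. Only the pool sizes come from the text.

```python
from collections import Counter, deque


class Pool:
    """One history pool holding at most max_sentences sentences."""

    def __init__(self, max_sentences):
        self.max_sentences = max_sentences
        self.sentences = deque()   # newest sentence on the left
        self.unigram = Counter()   # word -> occurrences inside this pool

    def add(self, sentence):
        """Add the newest sentence; return the evicted oldest one, if any."""
        self.sentences.appendleft(sentence)
        self.unigram.update(sentence)
        if len(self.sentences) <= self.max_sentences:
            return None
        evicted = self.sentences.pop()
        self.unigram.subtract(evicted)
        return evicted


class History:
    """Three pools; the weights here are hypothetical placeholders."""

    def __init__(self, sizes=(128, 8192, 65536), weights=(0.6, 0.3, 0.1)):
        self.pools = [Pool(n) for n in sizes]
        self.weights = weights

    def add(self, sentence):
        carried = tuple(sentence)
        for pool in self.pools:
            carried = pool.add(carried)
            if carried is None:
                break  # nothing overflowed, so the larger pools are untouched

    def score(self, word):
        """Weighted word count: recent pools contribute more per occurrence."""
        return sum(w * p.unigram[word] for w, p in zip(self.weights, self.pools))
```

As a sentence ages it drifts into pools with smaller weights, which gives the time-based decay without storing a timestamp per word.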
Kimpanel is a Plasma applet that uses Plasma and dbus to display the input method popup window. On X11, people who want a native Plasma-themed input method window can use it to get nice integration with Plasma.
So you might ask: we already have kimpanel in the Plasma desktop, so what’s the point of having this feature in Fcitx 5?
Well, if you use Wayland, you will notice that kimpanel does not work properly in terms of window positioning. The input window is a small popup window used by the input method. It needs to be shown at the cursor position so that the user’s eyes stay focused on the point where they are typing. This popup window is critical for CJK input method users.
And you might ask again, why can’t we just fix kimpanel? Unfortunately, it’s hard to fix.
There are quite a few technical difficulties behind this. The kimpanel applet currently runs in plasmashell. Unlike the GNOME implementation (also maintained by me, BTW :D), which runs within the compositor, Plasma’s kimpanel currently has no ability to obtain information about other windows, nor to move its own window freely. Kimpanel requires the following things to work:
If the client cursor position is absolute, move the popup to that position.
If the client cursor position is relative, move the popup to the current window’s top-left corner + offset.
For text-input clients, no position is sent to the input method, and the compositor needs to help the input method move the popup.
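The three cases above can be written down as a small decision function. This is only an illustrative sketch: CursorReport and popup_position are hypothetical names, not kimpanel or KWin APIs, and a missing report stands for a text-input client whose position the input method never receives.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CursorReport:
    """Cursor position as reported by a client (hypothetical structure)."""
    x: int
    y: int
    relative: bool  # True: offset from the window, False: absolute position


def popup_position(
    report: Optional[CursorReport],
    window_top_left: Tuple[int, int],
) -> Optional[Tuple[int, int]]:
    """Return where to place the popup, or None if only the compositor knows."""
    if report is None:
        # text-input client: no position reaches the input method
        return None
    if report.relative:
        wx, wy = window_top_left
        return (wx + report.x, wy + report.y)
    return (report.x, report.y)
```

The relative case is the one that needs knowledge of other windows, since the panel has to know the focused window’s top-left corner.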
Unfortunately, implementing this support on Wayland is really hard and would involve lots of changes in KWin. That would somewhat defeat the point of all the work we have done for zwp-input-method-v1 in KWin, because the zwp-input-method-v1 protocol already has a concept of a popup surface (it needs to be a surface from the same input method process). So I never tried to do that, for the reasons above. Only recently did I learn that a KWin script can actually show real QML items, so I made a prototype that runs kimpanel within KWin. I hit lots of KWin issues while writing the prototype, including KWin crashes, flicker, etc. Luckily, with help from the KWin developers, we are making progress on unblocking the port of kimpanel to KWin (only for the popup; the action panel will still be in plasmashell). But until all the known issues are resolved, I can’t really submit the change that moves the kimpanel implementation from plasma-desktop to a KWin script; otherwise it would be totally broken. Also, we’d like to use this only on Wayland, because on X11, expert users may choose not to use KWin at all.
Back to the original topic: Fcitx 5 is hardcoded to use the client-side input panel on KDE Wayland. The client-side input panel always uses the classic UI theme (the im module loads the classicui theme and renders it). If kimpanel were allowed to be used on Wayland right now, the popup would not show at the right position. In order to use the Plasma theme, we need some support in the classicui addon of Fcitx 5. I wrote a simple tool to automatically generate an fcitx theme from the Plasma theme. I used to have such a tool back in the fcitx 4 era. This time it’s even better, because support for this tool is integrated natively in Fcitx 5. If the Plasma theme is selected in classicui, it runs a small daemon that keeps monitoring Plasma theme changes and automatically regenerates the theme when needed. It can also be used as a standalone tool to generate the theme once.
To use this feature, just get the latest stable version of fcitx5 & fcitx5-configtool. Simply choose “KDE Plasma (Experimental)” as the Theme in fcitx5’s classic UI configuration, and that’s it.
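The input to the decoder is a Segment Graph, and the output is a Lattice. What happens here? Put simply, every path on the Segment Graph is matched against the dictionary. The exact matching method is decided by the dictionary. For instance, in the example above, the segmentation a na na requires matching several different paths: a na na, a na, a, na na, and na. The Lattice built for these paths looks roughly like this:
Of course, this covers only the single segmentation a na na; other segmentations add further edges on top of it. When decoding with the Pinyin dictionary, the pinyin labeled on each edge in the figure above is looked up in the dictionary. Each edge expands the corresponding SegmentGraphNode in the SegmentGraph into LatticeNodes for the matching words.
Let us switch to a slightly different example to explain this. Take xian (xi an), the classic example of a segmentation inside a pinyin string. At the node corresponding to the prefix xian, we have 「先,线,县,现……」, matched by xian as a whole. At the node for the prefix xi, there is only xi (西,喜,系,洗……), and these xi nodes can in turn reach the xian SegmentGraphNode via an, for example an 「俺,按,安,……」. Finally, there is also 西安, which reaches xian directly by matching xi an as a whole word.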
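The xian example can be reproduced with a toy sketch. It assumes a Segment Graph whose nodes are offsets into the string "xian" and a dictionary keyed by syllable sequences, filled only with the words listed above. The data layout and function names are hypothetical and much simpler than libime's real SegmentGraph, dictionary, and Lattice classes.

```python
from collections import defaultdict

# Toy Segment Graph for "xian": node = end offset of a prefix, edge = one syllable.
EDGES = {0: [(2, "xi"), (4, "xian")], 2: [(4, "an")], 4: []}

# Toy dictionary keyed by syllable sequence (entries taken from the text above).
DICTIONARY = {
    ("xian",): ["先", "线", "县", "现"],
    ("xi",): ["西", "喜", "系", "洗"],
    ("an",): ["俺", "按", "安"],
    ("xi", "an"): ["西安"],
}


def paths_from(node):
    """Yield (end_node, syllables) for every non-empty path starting at node."""
    stack = [(node, ())]
    while stack:
        current, syllables = stack.pop()
        for target, syllable in EDGES[current]:
            path = syllables + (syllable,)
            yield target, path
            stack.append((target, path))


def build_lattice():
    """Map each end node to a list of (start_node, syllables, word) entries."""
    lattice = defaultdict(list)
    for start in EDGES:
        for end, syllables in paths_from(start):
            for word in DICTIONARY.get(syllables, []):
                lattice[end].append((start, syllables, word))
    return lattice
```

In this sketch the entries stored under node 4 (the prefix xian) include the whole-syllable matches, the an words that start from node 2, and 西安, which spans both syllables.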
This article intends to explain the technical details behind an issue that happens when using fcitx5 with Vivaldi. I’m not a Vivaldi user and Vivaldi is not fully open source, so I can’t really comment on what change actually caused this, but I’ll describe my findings. Based on information from forum posts and social networks, the issue happens on Vivaldi 5.2 but not 5.1.
When multiple Vivaldi windows are open and one of them is then closed, the whole browser may close instead. This only happens when the fcitx5 im module is used.
What actually happened?
So first of all, let me talk about some technical details of how the fcitx5 im module works: the Fcitx 5 im module is a plugin to the Gtk 3 library used by Vivaldi. The plugin code initiates a DBus connection to dbus-daemon and uses dbus to interact with the Fcitx 5 server.
Based on some debugging done by attaching to the relevant Vivaldi process, the actual cause of the exit is the dbus connection being closed. The Fcitx 5 im module uses the gdbus API in gio to get a shared dbus connection (shared means shared within this process) and uses it to handle dbus communication. For such a shared dbus connection, the property “exit-on-close” is set to true by default, which means that if the dbus connection is broken, the program will exit. Usually, this only happens on system logout, when the dbus daemon quits.
For some reason, Vivaldi breaks the dbus connection, which then triggers the “exit-on-close” behavior defined in gio. I believe a bug in the Vivaldi browser causes this. My guess is that Vivaldi mistakenly closes a file descriptor that happens to belong to the dbus connection, or that some weird interaction with the glib mainloop within Vivaldi makes the dbus connection think it is closed. I don’t have enough evidence to prove this claim, but I believe it for the following reasons:
No other Gtk application (including chromium/chrome, which Vivaldi is based on) suffers from the same issue.
Older versions of Vivaldi don’t have this issue.
Even though the gdbus connection is shared, it is only used by fcitx5; otherwise Vivaldi would trigger the issue without fcitx5, which is not the case.
I would say the root cause is a bug in Vivaldi’s code that closes the dbus connection.
Workaround implemented on fcitx5-gtk side (version 5.0.14)
Even though we think the bug is in Vivaldi, we’d like to prevent this issue from happening. So we applied the following workarounds:
(1) Use a private dbus connection object for the fcitx dbus client object. While this uses more resources, it is generally acceptable. Also make sure “exit-on-close” is not set on those private dbus connection objects.
(2) Even after applying (1), we still noticed that the dbus connection would be closed by Vivaldi, so we always try to recreate the dbus connection object if the original connection is closed. The necessity of (2) also supports our earlier guess about the root cause: even if the dbus connection object is owned only by the fcitx5 im module, the connection somehow still gets closed when a Vivaldi window is closed.
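The logic of (1) and (2) can be modeled without any real dbus code. The sketch below is a pure-Python stand-in: FakeConnection, Client, and their methods are hypothetical names and do not use the gdbus API. In gio, the corresponding pieces would be the “exit-on-close” property and a notification that the connection was closed, which this model only imitates.

```python
class FakeConnection:
    """Stand-in for a private bus connection; not a real GDBus object."""

    def __init__(self):
        # Workaround (1): a closed private connection must not end the process.
        self.exit_on_close = False
        self.closed = False
        self._on_closed = []

    def connect_closed(self, callback):
        """Register a callback to run when this connection gets closed."""
        self._on_closed.append(callback)

    def close(self):
        """Simulate the connection being closed from outside the client."""
        self.closed = True
        for callback in list(self._on_closed):
            callback(self)


class Client:
    """Owns one private connection and replaces it whenever it is closed."""

    def __init__(self, factory=FakeConnection):
        self._factory = factory
        self.reconnects = 0
        self.connection = None
        self._open()

    def _open(self):
        self.connection = self._factory()
        self.connection.connect_closed(self._handle_closed)

    def _handle_closed(self, connection):
        if connection is not self.connection:
            return  # stale notification from an already replaced connection
        # Workaround (2): recreate the connection instead of giving up.
        self.reconnects += 1
        self._open()
```

Calling close() on the client's current connection leaves the client holding a fresh, open connection, which is the behavior (2) aims for.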
Summary
Users who are affected by the issue may upgrade to fcitx5-gtk 5.0.14.
While explaining the technical details and the debugging experience is fun, I also want to correct some misunderstandings about this bug:
The root cause of the bug is not in fcitx5-gtk; the bug is only triggered by fcitx5-gtk. The bug is in Vivaldi, it is NOT fixed, and it may have side effects on Vivaldi’s own code. fcitx5-gtk only implements a workaround for this Vivaldi bug.
While implementing a workaround is not an ideal solution, we still chose to do so because:
We want to get the actual problem solved for users.
While a private dbus connection uses slightly more resources, it may also bring some benefit by isolating fcitx5 from the main program, which also uses dbus. So workaround part (1) is not entirely a bad idea.