Approved May 8, 2021 by the Technical Committee of Japan DAISY Consortium
This document summarizes discovery requirements for accessible EPUB publications according to user preferences on space as word dividers.
It is hoped that this document will help introduce word-divider metadata into Schema.org and ONIX, and that such word-divider metadata will become usable from EPUB Accessibility (ISO/IEC and W3C).
Terms (most notably “discovery”) defined in EPUB Accessibility (ISO/IEC 23761 and W3C Member Submission) and the following terms apply.
Japanese text is written without space as word dividers. That is, there is no space between adjacent Japanese words. For example, consider “私の名前は村田です。”. An English translation is “My name is Murata”. Here “私の” means “my”; “名前” means “name”; “村田” means “Murata”; “です” means “is”. But there is no space between these words.
Space as word dividers has been used in textbooks for lower elementary school students. However, it has not been common in trade publishing.
Note: W3C JLreq does not mention the use of space as word dividers, since it is intended to capture traditional trade publishing.
However, some persons with learning disabilities have problems with the absence of space as word dividers. Such persons find it much easier to read “私の 名前は 村田です。”. Space as word dividers is particularly helpful when the text contains consecutive hiragana characters. For example, the same sentence can be written as “わたしのなまえはむらたです。” (without space) or “わたしの なまえは むらたです。” (with space). The latter is much easier to read.
Note: Exactly when should space be introduced? For example, how about “私 の 名前 は 村田 です。”? Here three more spaces are introduced. Different camps have different opinions about when space should be introduced. Osaka Medical University is conducting a series of experiments to study which style is most helpful for persons with learning disabilities.
CSS Text Module Level 4 (WD) provides mechanisms for introducing space characters as word dividers. This is done in two steps. In the first step, which is optional, virtual word boundaries are automatically inserted by linguistic analysis. In the second step, these virtual word boundaries as well as the ZERO WIDTH SPACE (U+200B) character and
If the first step is enabled, authors do not have to insert the ZERO WIDTH SPACE (U+200B) character or
Note: The phrase “as explicitly specified by authors” is extremely important. Suppose that an EPUB reading system applies an external CSS stylesheet that invokes the first and second steps mentioned above. Then any Japanese EPUB publication can be rendered with space as word dividers, although space might appear at inappropriate places.
Note: It is possible to create an EPUB publication that can be rendered with space as word dividers and can also be rendered without space as word dividers. Thus, the same EPUB publication may be found by a query for space-divided publications as well as by a query for not-space-divided publications.
Note: An EPUB publication containing space characters as word dividers cannot be rendered without space as word dividers.
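
The two notes above can be illustrated with a minimal Python sketch. It is not part of this document's requirements and does not use any real CSS or EPUB API: the function names `render` and `supported_renderings` and the labels "space-divided" and "not-space-divided" are hypothetical, chosen only to mirror the wording of the notes. The sketch assumes that authors mark word boundaries with ZERO WIDTH SPACE (U+200B) and that a reading system either expands each marked boundary into a visible space or leaves it invisible.

```python
ZWSP = "\u200b"               # ZERO WIDTH SPACE: invisible, author-specified word boundary
IDEOGRAPHIC_SPACE = "\u3000"  # full-width space often used in Japanese text


def render(text: str, expand_boundaries: bool) -> str:
    """Simulate rendering: show or hide the boundaries marked with ZWSP."""
    replacement = IDEOGRAPHIC_SPACE if expand_boundaries else ""
    return text.replace(ZWSP, replacement)


def supported_renderings(text: str) -> set:
    """Return the hypothetical discovery labels that a text could satisfy.

    Simplification: any literal space is treated as a word divider, and only
    boundaries explicitly marked by the author (ZWSP) are counted.
    """
    has_literal_space = " " in text or IDEOGRAPHIC_SPACE in text
    has_marked_boundary = ZWSP in text
    labels = set()
    if has_literal_space or has_marked_boundary:
        labels.add("space-divided")
    if not has_literal_space:
        # Literal spaces cannot be hidden, so they rule out this rendering.
        labels.add("not-space-divided")
    return labels


marked = "わたしの\u200bなまえは\u200bむらたです。"
hard_coded = "わたしの なまえは むらたです。"

print(render(marked, expand_boundaries=True))
print(render(marked, expand_boundaries=False))
print(supported_renderings(marked))
print(supported_renderings(hard_coded))
```

Under these assumptions, the text with marked boundaries satisfies both labels, while the text with hard-coded space characters satisfies only "space-divided", which is the asymmetry the two notes describe.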