Comparing text in variants of modern Chinese

"I would not start from here if I were you."

That common English quote sums up my feelings about using XeLaTeX for this.
It does have the ability to specify a range of codepoints which should use a
different font, whereas for LuaLaTeX you would need to change the font for each
missing codepoint, but in either case the result may not match the rest of the
text.

When I started to prepare my text items for possible use in CJK lipsum files, I
tried to start with the same text for each of Simplified, Traditional, and Hong
Kong Cantonese. But in the end most of the texts I used for the lipsum files
varied greatly between the versions.

So, I went back to the texts I had intended to start with and revised some of
them to use the full equivalent text. At this point I had intended to do two
longish paragraphs, as in the lipsum files. The problem was that I could not find
any combination of items where none of the variants had punctuation at the
end of a line.
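
The search for a usable combination can be partly automated. The following is
a minimal sketch, not what I actually used: it assumes every glyph occupies one
cell of a fixed-width line, and the PUNCTUATION set and the function names are
made up for illustration rather than taken from any complete line-breaking
table.

```python
# Punctuation that should not sit at the start or end of a line.
# This set is an assumption for illustration, not a complete list.
PUNCTUATION = set("，。、；：？！「」『』（）《》")


def wrap_fixed(text: str, width: int) -> list[str]:
    """Split text into lines of `width` glyphs (the last may be shorter)."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def bad_lines(text: str, width: int) -> list[tuple[int, str]]:
    """Return (line number, reason) for lines that start or end with punctuation.

    The first line may start, and the last line may end, with punctuation,
    since those positions are simply the paragraph boundaries.
    """
    lines = wrap_fixed(text, width)
    problems = []
    for number, line in enumerate(lines, start=1):
        if number > 1 and line[0] in PUNCTUATION:
            problems.append((number, "starts with " + line[0]))
        if number < len(lines) and line[-1] in PUNCTUATION:
            problems.append((number, "ends with " + line[-1]))
    return problems


def check_variants(variants: dict[str, str], width: int) -> dict[str, list]:
    """Run bad_lines on each variant and keep only the variants with problems."""
    results = {name: bad_lines(text, width) for name, text in variants.items()}
    return {name: found for name, found in results.items() if found}
```

Calling check_variants with the Simplified, Traditional and Hong Kong texts and
the intended column width returns an empty dict only when all of the variants
are clean, which is the condition I could not meet with two long paragraphs.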

Also, I noticed that in XeLaTeX the glyphs were not always aligned as if in a
grid, and sometimes the position where a line break occurred differed between
fonts. I now realise that is a (mis)feature of XeLaTeX, which appears to be
unmaintained.

In the end I moved to using LuaLaTeX, with columns around 23 glyphs wide and
a very few selected items in paragraphs mostly one or two lines long. In the left
hand column I have a few items from the Hitchhiker's Guide to the Galaxy,
sometimes with a full stop between two items. In the right hand column I have
quotes from philosophers. My understanding is that a paragraph in Chinese does
not need to end with a full stop, which helps. Unfortunately, I have now read that
it is bad form to only have one or two Chinese glyphs on a line. Too late, what is
done is done.

If I had to typeset longer paragraphs in Chinese, I assumed it would be possible
to add small spaces between glyphs as necessary to ensure punctuation was not
at the start or end of a line in LuaLaTeX, but trying this on one example
failed: I could force a new line after 22 glyphs, but could not add sufficient
space to get close to the original end position. If I added more space, the 22nd
glyph appeared on its own on the next line. LuaLaTeX gave no useful information
about overfull horizontal lines.

I have now prepared files for all fonts except the old fireflysung, which cannot
be used alongside the fonts from odosung.ttc. My impression is that almost all
codepoints match in the AR PL Kai or AR PL Sung/Ming fonts, and similarly the
codepoints in the Noto fonts mostly match each other. The one noticeable
difference is that CN fonts (Noto SC, Droid Sans Fallback, WenQuanYi Zen Hei)
use low full-stops. That is also true, unexpectedly, for HanWangHeiLight, which
is a Traditional font. All other Traditional or Cantonese fonts use mid full-stops.

The TW text from the GenYo fonts differs slightly in the TC and TW fonts (I need
to display these very large to spot the differences), but I did not notice any
difference in the CN or HK text from those fonts.

The files

The individual files are in PDF-Chinese-variants
and the templates are at templates-Chinese-variants.
The lua??.tex files contain the variant-specific text; copy each one to
Chinese-myfilename-??.tex if a new file is needed.
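
The copying step can be scripted. This is a minimal sketch under stated
assumptions: the function name copy_variant_templates is made up, and the ??
part is taken to be whatever two characters follow "lua" in each template file
name.

```python
import shutil
from pathlib import Path


def copy_variant_templates(template_dir: Path, dest_dir: Path, name: str) -> list[Path]:
    """Copy each lua??.tex to Chinese-<name>-??.tex and return the new paths."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    created = []
    # "?" matches exactly one character, so only two-character suffixes match.
    for source in sorted(template_dir.glob("lua??.tex")):
        variant = source.stem[3:]  # the two characters after "lua"
        target = dest_dir / f"Chinese-{name}-{variant}.tex"
        if target.exists():
            raise FileExistsError(target)  # never overwrite existing work
        shutil.copyfile(source, target)
        created.append(target)
    return created
```

For example, copy_variant_templates(Path("templates-Chinese-variants"),
Path("."), "myfilename") would create one Chinese-myfilename-??.tex per
template in the current directory, and stop with an error rather than overwrite
a file that is already there.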

Because CJK fonts are mostly very faint, for recent fonts I tend to use a Medium
weight. Full details of what I used, and the weights, are in the -languages files.
You can also look at the various lipsum files, but for those I have only used the
designated variant of the font (e.g. GenYo fonts only appear in the TW lipsum
files even though they also render SC and HK).

I have the following files (ordered by fontconfig name, with my name second):

Simplified Chinese:
AR PL SungtiL GB 1.009 : gbsn00lp
AR PL KaitiM GB 1.006 : gkai00mp
AR PL New Kai 2.005 : odokai
AR PL New Sung 2.006 : odosung
AR PL New Sung Mono 2.007 : odosungmono
AR PL UKai CN 1.010 : UKaiCN
AR PL UKai HK 1.011 : UKaiHK
AR PL UKai TW 1.012 : UKaiTW
AR PL UMing CN 1.013 : UMingCN
AR PL UMing HK 1.014 : UMingHK
AR PL UMing TW 1.015 : UMingTW
Chiron Hei HK 3.034 : ChironHeiHK
Chiron Sung HK 3.036 : ChironSungHK
Droid Sans Fallback 1.028 : Droid Sans Fallback
FandolFang 3.021 : FandolFang
FandolHei 1.029 : FandolHei
FandolKai 3.001 : FandolKai
FandolSong 1.030 : FandolSong
GenYoGothic2 TC 3.032 : GenYoGothic2TC
GenYoGothic2 TW 3.033 : GenYoGothic2TW
GenYoMin2 TC 3.027 : GenYoMin2TC
GenYoMin2 TW 3.028 : GenYoMin2TW
Noto Sans CJK HK 3.020 : NotoSansCJKhk
Noto Sans CJK SC 1.083 : NotoSansCJKsc
Noto Sans CJK TC 1.084 : NotoSansCJKtc
Noto Sans Mono CJK HK 3.038 : NotoSansMonoCJKhk
Noto Sans Mono CJK SC 1.098 : NotoSansMonoCJKsc
Noto Sans Mono CJK TC 1.099 : NotoSansMonoCJKtc
Noto Serif CJK HK 3.019 : NotoSerifCJKhk
Noto Serif CJK SC 3.015 : NotoSerifCJKsc
Noto Serif CJK TC 3.018 : NotoSerifCJKtc
WenQuanYi Zen Hei 1.139 : WenQuanYiZenHei
WenQuanYi Zen Hei Mono 1.140 : WenQuanYiZenHeiMono

Traditional Chinese:
AR PL KaitiM Big5 1.005 : bkai00mp
AR PL Mingti2L Big5 1.007 : bsmi00lp
Chiron Hei HK 3.034 : ChironHeiHK
Chiron Sung HK 3.036 : ChironSungHK
Droid Sans Fallback 1.028 : Droid Sans Fallback
GenYoGothic2 TC 3.032 : GenYoGothic2TC
GenYoGothic2 TW 3.033 : GenYoGothic2TW
GenYoMin2 TC 3.027 : GenYoMin2TC
GenYoMin2 TW 3.027 : GenYoMin2TW
HanWangHeiLight 3.029 : HanWangHeiLight
HanWangWCL02 3.030 : HanWangHeiWCL02
HanWangWCL06 3.031 : HanWangHeiWCL06
Noto Sans CJK HK 3.020 : NotoSansCJKhk
Noto Sans CJK SC 1.083 : NotoSansCJKsc
Noto Sans CJK TC 1.084 : NotoSansCJKtc
Noto Sans Mono CJK HK 3.038 : NotoSansMonoCJKhk
Noto Sans Mono CJK SC 1.098 : NotoSansMonoCJKsc
Noto Sans Mono CJK TC 1.099 : NotoSansMonoCJKtc
Noto Serif CJK HK 3.019 : NotoSerifCJKhk
Noto Serif CJK SC 3.015 : NotoSerifCJKsc
Noto Serif CJK TC 3.018 : NotoSerifCJKtc
AR PL UKai CN 1.010 : UKaiCN
AR PL UKai HK 1.011 : UKaiHK
AR PL UKai TW 1.012 : UKaiTW
AR PL UMing CN 1.013 : UMingCN
AR PL UMing HK 1.014 : UMingHK
AR PL UMing TW 1.015 : UMingTW
WenQuanYi Zen Hei 1.139 : WenQuanYiZenHei
WenQuanYi Zen Hei Mono 1.140 : WenQuanYiZenHeiMono

Hong Kong Cantonese:
Chiron Hei HK 3.034 : ChironHeiHK
Chiron Sung HK 3.036 : ChironSungHK
GenYoGothic2 TC 3.032 : GenYoGothic2TC
GenYoGothic2 TW 3.033 : GenYoGothic2TW
GenYoMin2 TC 3.027 : GenYoMin2TC
GenYoMin2 TW 3.027 : GenYoMin2TW
Noto Sans CJK HK 3.020 : NotoSansCJKhk
Noto Sans CJK SC 1.083 : NotoSansCJKsc
Noto Sans CJK TC 1.084 : NotoSansCJKtc
Noto Sans Mono CJK HK 3.038 : NotoSansMonoCJKhk
Noto Sans Mono CJK SC 1.098 : NotoSansMonoCJKsc
Noto Sans Mono CJK TC 1.099 : NotoSansMonoCJKtc
Noto Serif CJK HK 3.019 : NotoSerifCJKhk
Noto Serif CJK SC 3.015 : NotoSerifCJKsc
Noto Serif CJK TC 3.018 : NotoSerifCJKtc
AR PL UKai CN 1.010 : UKaiCN
AR PL UKai HK 1.011 : UKaiHK
AR PL UKai TW 1.012 : UKaiTW
AR PL UMing CN 1.013 : UMingCN
AR PL UMing HK 1.014 : UMingHK
AR PL UMing TW 1.015 : UMingTW
WenQuanYi Zen Hei 1.139 : WenQuanYiZenHei
WenQuanYi Zen Hei Mono 1.140 : WenQuanYiZenHeiMono
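
Because the same fonts appear in more than one of the lists above, it is easy
for an entry to drift out of step. The following is a minimal sketch for
cross-checking them, assuming each entry has the form "fontconfig name version :
my name" as above; the function names are made up for illustration.

```python
from collections import defaultdict


def parse_entry(line: str) -> tuple[str, str, str]:
    """Split 'fontconfig name version : my name' into its three parts."""
    left, my_name = line.split(" : ", 1)
    font_name, version = left.rsplit(" ", 1)  # the version is the last word
    return font_name.strip(), version.strip(), my_name.strip()


def conflicting_versions(lines: list[str]) -> dict[str, set[str]]:
    """Map each font name listed with more than one version to those versions."""
    seen = defaultdict(set)
    for line in lines:
        if " : " not in line:
            continue  # skip headings and blank lines
        font_name, version, _ = parse_entry(line)
        seen[font_name].add(version)
    return {name: versions for name, versions in seen.items() if len(versions) > 1}
```

Feeding all three lists to conflicting_versions reports any font whose version
number is not the same everywhere it is listed.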