HANOI - 2010
Contents
Foreword
Introduction
Information technology - Vietnamese coded character set (UCS)
1 Scope
2 Conformance
2.1 General
2.2 Conformance of information interchange
2.3 Conformance of devices
3 Normative references
4 Terms and definitions
5 General structure of the UCS
6 Basic structure and nomenclature
6.1 Structure
6.2 Coding of characters
6.3 Types of code points
6.3.1 Classification
6.3.2 Graphic characters
6.3.3 Format characters
6.3.4 Control characters
6.3.5 Private use characters
6.3.6 Surrogate code points
6.3.7 Noncharacter code points
6.3.8 Reserved code points
6.4 Naming of characters
6.5 Short identifiers for code points
6.6 UCS sequence identifiers
6.7 Byte sequence identifiers
7 Revision and updating of the UCS
8 Subsets
8.1 Limited subset
8.2 Selected subset
9 UCS encoding forms
9.1 UTF-8
9.2 UTF-16
9.3 UTF-32 (UCS-4)
10 UCS encoding schemes
10.1 UTF-8
10.2 UTF-16BE
10.3 UTF-16LE
10.4 UTF-16
10.5 UTF-32BE
10.6 UTF-32LE
10.7 UTF-32
11 Use of control functions with the UCS
12 Declaration of identification of features
12.1 Purpose and context of identification
VNPF
Foreword
The Vietnam Standards Centre is responsible for organizing the review of national
standards and for proposing them to the Ministry of Science and Technology for
promulgation. The Technical Committee on Information Technology assists the
Vietnam Standards Centre in matters of information technology. The committee is
proposed by the Standards Centre and established by decision of the Directorate
for Standards, Metrology and Quality.
National standards are drafted in accordance with the rules established by the
Standards Centre.
Vietnamese Standards are national standards that are officially promulgated by
the State and are in force throughout the territory of Viet Nam.
Officially promulgated national standards are designated "TCVN - number : year".
Introduction
TCVN ...:2010 specifies the universal coded character sets for Vietnamese scripts
used in Viet Nam and throughout the world. It is applicable to the
representation, transmission, interchange, processing, input and presentation of
the written form of the scripts used in Viet Nam, as well as of additional
symbols.
By consistently specifying a multilingual encoding that conforms to the
international standard ISO 10646, this standard makes international interchange
of Vietnamese text data possible. The coded character sets specified in
TCVN ...:2010 are already widely accepted in international protocols and are
implemented in modern operating systems and programming languages. This standard
covers more than 10 000 characters from the scripts known to be used in Viet Nam.
This standard is a part of the international standard ISO 10646:2009 and follows
all provisions of ISO 10646. In particular, the entire text of ISO 10646 has been
translated into Vietnamese. All coded scripts of Viet Nam contained in ISO 10646
are included in this standard. Only some clauses and annexes that are unrelated
to Vietnamese scripts are omitted; this standard does not reject those parts and
remains compatible with them as they appear in ISO 10646. All other scripts
specified in ISO 10646 remain fully valid for use together with the Vietnamese
scripts specified in this standard, within the framework of the international
standard ISO 10646 and Unicode.
1 Scope
2 Conformance
2.1 General
2.2 Conformance of information interchange
2.3 Conformance of devices
b) Originating device: An originating device shall allow its user to supply any
character from an adopted subset, and shall be capable of transmitting their
coded representations within a CC-data-element in accordance with the adopted
encoding form and the adopted encoding scheme. An originating device shall
therefore not emit ill-formed CC-data-elements.
c) Receiving device: A receiving device shall be capable of receiving and
interpreting any coded representation of characters within a CC-data-element in
accordance with the adopted encoding form and the adopted encoding scheme, and
shall make any corresponding characters from the adopted subset available to the
user in such a way that the user can identify them. A receiving device shall
treat ill-formed CC-data-elements as an error condition and shall not interpret
such data as character sequences.
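The receiving-device rule in item c) can be sketched as follows. This is a minimal illustration, assuming UTF-8 is the adopted encoding form and scheme; the function name `receive_utf8` is hypothetical and not part of the standard.

```python
def receive_utf8(data: bytes) -> str:
    """Interpret received bytes as UTF-8 text.

    Ill-formed input is reported as an error condition and is never
    interpreted as a character sequence.
    """
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"ill-formed CC-data-element at byte offset {exc.start}"
        ) from exc


if __name__ == "__main__":
    print(receive_utf8("Việt Nam".encode("utf-8")))
```

A device that silently replaced or skipped the offending bytes would be interpreting ill-formed data as characters, which is what item c) rules out.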
3 Normative references
The following referenced documents contain provisions which, through reference
in this text, constitute provisions of TCVN ...:2010. For dated references,
subsequent amendments to, or revisions of, any of these publications do not
apply. However, parties to agreements based on ISO/IEC 10646 are encouraged to
investigate the possibility of applying the most recent editions of the normative
documents indicated below. For undated references, the latest edition of the
referenced document applies.
ISO/IEC 2022:1994, Information technology - Character code structure and
extension techniques.
ISO/IEC 6429:1992, Information technology - Control functions for coded character
sets.
Unicode Standard Annex, UAX#9, The Unicode Bidirectional Algorithm, Version 6.0.0.
Unicode Standard Annex, UAX#15, Unicode Normalization Forms, Version 6.0.0.
Unicode Standard Annex, UAX#34, Unicode Named Character Sequences, Version 6.0.
Unicode Technical Standard, UTS#37, Ideographic Variation Database, Version 1.0,
January 2006.
Unicode Standard Annex, UAX#44, Unicode Character Database, Version 6.2.
4 Terms and definitions
4.1
Base character
A graphic character that is not a combining character.
NOTE - Most graphic characters are base characters. This sense of graphic
combination does not preclude the presentation of base characters from adopting
different contextual forms or from participating in ligatures.
4.2
Basic Multilingual Plane, BMP
Plane 00 of Group 00.
4.3
Block
A contiguous range of code points to which a set of characters that share common
characteristics, such as a script, is allocated; a block does not overlap
another block; one or more of the code points within a block may have no
character allocated to them.
4.4
Canonical form
The form in which characters of this coded character set are specified using a
single code point within the UCS codespace.
4.5
CC-data-element, coded-character-data-element, code unit sequence
An element of interchanged information that is specified to consist of a sequence
of code units, in accordance with one or more identified standards for coded
character sets; such a sequence may contain code units associated with any type
of code point.
4.6
Character
A member of a set of elements used for the organization, control, or
representation of textual data; a character may be represented by a sequence of
one or more coded characters.
4.7
Character boundary
Within a CC-data-element, the demarcation between the last code unit of a coded
character and the first code unit of the next coded character.
4.8
Code chart, code table
A rectangular array showing the representation of the coded characters allocated
within a range of the UCS codespace.
4.9
Coded character
The association between a character and a code point.
4.10
Coded character set
A set of coded characters.
4.11
Code point, code position
Any value in the UCS codespace; the term "code point" is preferred.
4.12
Code unit
The minimal bit combination that can represent a unit of encoded text for
processing or interchange.
NOTE - Examples of code units are bytes (8-bit code units) used in the UTF-8
encoding form, 16-bit code units in the UTF-16 encoding form, and 32-bit code
units in the UTF-32 encoding form.
4.13
Collection
A numbered and named set of entities; for a non-extended collection, these
entities consist only of those coded characters whose code points lie within one
or more identified ranges (see 4.24 for extended collections).
NOTE - If any of the identified ranges includes code points to which no character
is allocated, the repertoire of the collection will change if an additional
character is assigned to any of those code points in a future amendment of this
National Standard. It is intended, however, that the collection number and name
will remain unchanged in future editions of this National Standard.
4.14
Combining character
A character with a General Category value of Spacing Combining Mark (Mc),
Nonspacing Mark (Mn), or Enclosing Mark (Me) according to the Unicode Character
Database (see clause 3).
NOTE - These characters are intended for combination with the preceding
non-combining graphic character, or with a sequence of combining characters
preceded by a non-combining character (see 4.17).
4.15
Combining class
A value associated with each combining character that determines its typographic
interaction and its canonical order within a sequence of combining characters.
4.16
Compatibility character
A graphic character included in the coded character set of TCVN ...:2010
primarily for compatibility with existing coded character sets.
4.17
Composite sequence
A sequence of graphic characters consisting of a base character followed by one
or more combining characters, ZERO WIDTH JOINER, or ZERO WIDTH NON-JOINER (see
also 4.14).
NOTE 1 - The graphic symbol for a composite sequence generally consists of the
combination of the graphic symbols of each character in the sequence.
NOTE 2 - A composite sequence may be used to represent characters that are not
encoded in the repertoire of TCVN ...:2010.
4.18
Control character
A control function whose coded representation consists of a single code point.
NOTE - Although control characters are often "named" using terms such as DELETE,
FORM FEED, ESC, these qualifiers do not correspond to official character names.
See clause 11 for a list of the long names used by ISO/IEC 6429 in association
with the control characters.
4.19
Control function
An action that affects the recording, processing, transmission, or interpretation
of data, and that is represented by a CC-data-element.
4.20
Default state
The state that is assumed when no state has been explicitly specified.
4.21
Device
A component of information processing equipment that can transmit and/or receive
coded information within CC-data-elements. (It may be an input/output device in
the conventional sense, or a process such as an application program or a gateway
function.)
4.22
Encoding form
4.23
Encoding scheme
An encoding scheme specifies the serialization of the code units from an encoding
form into bytes.
NOTE - Some UCS encoding schemes have the same labels as UCS encoding forms.
However, they are used in different contexts. UCS encoding forms refer to the
in-memory and application-interface representation of textual data. UCS encoding
schemes refer to byte-serialized textual data.
4.24
Extended collection
A collection whose entities can also consist of sequences of code points in the
NFC normalization form (see 21); the sequences of code points are referenced by
Named UCS Sequence Identifiers (NUSI) (see 25).
NOTE - Some collections such as 3 LATIN EXTENDED-A, 4 LATIN EXTENDED-B, 15 ARABIC
EXTENDED and many others have the term "extended" in their names. This does not
make them extended collections.
4.25
Fixed collection
A collection in which every code point within the identified ranges has a
character allocated to it, and which is intended to remain unchanged in future
editions of this standard.
4.26
Format character
A character whose primary function is to affect the layout or processing of the
characters around it; it generally has no visible representation of its own.
4.27
General Category, GC
A value assigned to each UCS code point that determines its major class, such as
letter, punctuation, or symbol; each value is defined as a two-letter
abbreviation in the Unicode Character Database (see clause 3).
NOTE - When referring to the group of all GC values that share the same first
letter, the group may be described using only that first letter. For example,
'L' stands for all the letter values 'Lu', 'Ll', 'Lt', 'Lm', and 'Lo'.
4.28
Graphic character
A character, other than a control function or a format character, that has a
visual representation.
4.29
Graphic symbol
The visual representation of a graphic character or of a composite sequence.
4.30
High-surrogate code point
A code point in the range D800 to DBFF, reserved for the use of UTF-16.
4.31
High-surrogate code unit
A 16-bit code unit in the range D800 to DBFF, used in UTF-16 as the leading code
unit of a surrogate pair (see 9.2).
4.32
Ill-formed CC-data-element
A UCS CC-data-element that purports to be in a UCS encoding form but does not
conform to the specification of that encoding form (for example, an unpaired
surrogate code unit is an ill-formed CC-data-element).
4.33
Ill-formed CC-data-element subset
A non-empty subset of a CC-data-element X that does not contain any code unit
belonging to a minimal well-formed CC-data-element subset of X.
NOTE - An ill-formed CC-data-element subset cannot overlap a well-formed
CC-data-element.
4.34
Interchange
The transfer of character-coded data from one user to another, using
telecommunication means or interchangeable media; interchange implies
serialization of the data and the use of a UCS encoding scheme.
4.35
Interworking
The process of permitting two or more systems, each employing different coded
character sets, to interchange character-coded data meaningfully; this may
involve conversion between the two codes.
4.36
ISO/IEC 10646-1
Referenced as Part 1 of ISO/IEC 10646; contains the specification of the overall
architecture
4.37
ISO/IEC 10646-2
Referenced as Part 2 of ISO/IEC 10646; contains the specification of the
Supplementary Multilingual Plane (SMP), the Supplementary Ideographic Plane (SIP)
and the Supplementary Special-purpose Plane (SSP). Only a first edition of
ISO/IEC 10646-2 exists.
4.38
Low-surrogate code point
A code point in the range DC00 to DFFF, reserved for the use of UTF-16.
4.39
Low-surrogate code unit
A 16-bit code unit in the range DC00 to DFFF, used in UTF-16 as the trailing code
unit of a surrogate pair (see 9.2).
4.40
Minimal well-formed CC-data-element
A well-formed CC-data-element that maps to a single UCS scalar value.
4.41
Mirrored character
A character whose image is mirrored horizontally in text that is laid out from
right to left.
4.42
Byte
An 8-bit code unit; its value is expressed in hexadecimal notation from 00 to FF
in the UCS (see Annex K).
4.43
Plane
A subdivision of the UCS codespace consisting of 65536 code points. The UCS
codespace contains 17 planes.
4.44
Presentation; to present
The process of writing, printing, or displaying a graphic symbol.
4.45
Presentation form
4.46
Private use plane
A plane within this coded character set whose contents are not specified in
TCVN ...:2010. Planes 0F and 10 are private use planes.
4.47
Repertoire
A specified set of characters that are represented in a coded character set.
4.48
Row
A subdivision of a plane, consisting of 256 code points.
4.49
Script
A set of graphic characters used for the written form of one or more languages.
4.50
Supplementary plane
A plane other than Plane 00 of the UCS codespace; a plane that accommodates
characters that have not been allocated to the Basic Multilingual Plane.
4.51
Supplementary Multilingual Plane for scripts and symbols, SMP
Plane 01 of the UCS codespace.
4.52
Supplementary Ideographic Plane, SIP
Plane 02 of the UCS codespace.
4.53
Supplementary Special-purpose Plane, SSP
Plane 0E of the UCS codespace.
4.54
Surrogate pair
A representation for a single character that consists of a sequence of two
16-bit code units, where the first value of the pair is a high-surrogate code
unit and the second value is a low-surrogate code unit.
4.55
Tertiary Ideographic Plane, TIP
Plane 03 of the UCS codespace.
4.56
UCS codespace
The UCS codespace consists of the integers from 0 to 10FFFF (hexadecimal)
available for assigning the repertoire of UCS characters.
4.57
UCS scalar value
Any UCS code point except the high-surrogate and low-surrogate code points.
4.58
Unpaired surrogate code unit
A surrogate code unit in a CC-data-element that is either a high-surrogate code
unit not immediately followed by a low-surrogate code unit, or a low-surrogate
code unit not immediately preceded by a high-surrogate code unit.
4.59
User
A person or other entity that invokes the service provided by a device. (This
entity may be a process such as an application program if the "device" is a code
converter or a gateway function, for example.)
4.60
Well-formed CC-data-element
A UCS CC-data-element that purports to be in a UCS encoding form, conforms to the
specification of that encoding form, and contains no ill-formed CC-data-element
subset.
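Several of the definitions above (plane, row, UCS scalar value) reduce to simple integer arithmetic on a code point. The following is a minimal sketch under that reading; the helper names are illustrative and do not come from the standard.

```python
UCS_MAX = 0x10FFFF  # last integer of the UCS codespace (4.56)


def plane_of(cp: int) -> int:
    """Plane number (4.43): each plane holds 65536 code points."""
    if not 0 <= cp <= UCS_MAX:
        raise ValueError("outside the UCS codespace")
    return cp >> 16


def row_of(cp: int) -> int:
    """Row number within the plane (4.48): each row holds 256 code points."""
    if not 0 <= cp <= UCS_MAX:
        raise ValueError("outside the UCS codespace")
    return (cp >> 8) & 0xFF


def is_scalar_value(cp: int) -> bool:
    """UCS scalar value (4.57): any code point except surrogate code points."""
    return 0 <= cp <= UCS_MAX and not 0xD800 <= cp <= 0xDFFF


if __name__ == "__main__":
    cp = 0x1EA1  # LATIN SMALL LETTER A WITH DOT BELOW
    print(f"plane {plane_of(cp):02X}, row {row_of(cp):02X}")
```

The 17 planes mentioned in 4.43 correspond to plane numbers 00 to 10 (hexadecimal).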
The Tertiary Ideographic Plane (TIP, Plane 03) is reserved and is currently
empty. Planes 04 to 0D are reserved for future standardization. Planes 0F and 10
are reserved for private use.
Subsets of the codespace may be used to define a sub-repertoire of graphic
characters.
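6 Basic structure and nomenclature
6.1 Structure
[Figure: the UCS codespace shown as a stack of planes (Plane 0, Plane 1, Plane 2,
Plane 3 and the two private use planes), with Plane 0 divided into 0000..D7FF
(marks at 0080 and 00FF), the surrogate zone D800..DFFF, the private use zone
E000..F8FF, and F900..FFFF]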
6.2 Coding of characters
Each character within the UCS codespace is represented by an integer between 0
and 10FFFF, identified as a code point.
When a single character is identified in terms of its code point, it is
represented by a six-digit form of the integer, such as:
000030
000041
010000
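The six-digit form can be produced with ordinary zero-padded hexadecimal formatting. A minimal sketch, with `six_digit_form` as an illustrative name:

```python
def six_digit_form(ch: str) -> str:
    """Return the six-digit hexadecimal form of a character's code point."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"{ord(ch):06X}"


if __name__ == "__main__":
    for ch in ("0", "A", "\U00010000"):
        print(six_digit_form(ch))
```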
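6.3 Types of code points
Basic type | Brief description | General Category | Character status | Code point status
Graphic | Letter, mark, number, punctuation, symbol, and space | L, M, N, P, S, Zs | Assigned to a character | Assigned code point
Format | | Cf, Zl, Zp | Assigned to a character | Assigned code point
Control | | Cc | Assigned to a character | Assigned code point
Private use | | Co | Assigned to a character | Assigned code point
Surrogate | Permanently reserved for UTF-16 | Cs | Not assigned to a character | Assigned code point
Noncharacter | | Cn | Not assigned to a character | Assigned code point
Reserved | Reserved for future assignment | Cn | Not assigned to a character | Unassigned code point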
6.3.2 Graphic characters
The same graphic character shall not be allocated to more than one code point.
There are graphic characters with similar shapes in the coded character set;
they are used for different purposes and have different character names.
6.3.3 Format characters
Code points 2060 to 206F, FFF0 to FFFC, and E0000 to E0FFF are reserved for
format characters (see 16.3 and Annex F).
NOTE - Unassigned code points in these ranges may be ignored in normal processing
and display.
6.3.4 Control characters
Code points 0000 to 001F and 007F to 009F in the BMP are reserved for control
characters (see clause 11).
6.3.5 Private use characters
Code points E000 to F8FF in the BMP are reserved for private use. All code
points of Plane 0F and Plane 10, except FFFFE, FFFFF, 10FFFE, and 10FFFF, are
reserved for private use.
Private use characters are not constrained in any way by TCVN ...:2010. Private
use characters can be used to provide user-defined characters. This is, for
example, a common requirement for users of ideographic scripts.
NOTE - For meaningful interchange of private use characters, an agreement,
independent of TCVN ...:2010, is necessary between sender and recipient.
6.3.6 Surrogate code points
Code points D800 to DFFF are reserved for the use of the UTF-16 encoding form
(see 9.2). The first half (D800 to DBFF) contains the high-surrogate code points
and the second half (DC00 to DFFF) contains the low-surrogate code points.
6.3.7 Noncharacter code points
The status of a noncharacter code point cannot be changed by future amendments.
The noncharacters consist of FDD0-FDEF and any code point ending in the value
FFFE or FFFF.
NOTE - Code point FFFE is reserved for the signature. Code points FDD0 to FDEF
and FFFF may be used for internal processing that requires a numeric value
guaranteed not to be a coded character, such as terminating a table or signalling
end-of-text. Furthermore, since FFFF is the largest BMP value, it may also be
used as the final value in a binary or sequential search index in the context of
UTF-16.
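The code point types of 6.3 can be approximated from the fixed ranges given above plus the General Category. The sketch below is illustrative only: `code_point_type` is a hypothetical helper, and the answer for unassigned (reserved) code points depends on the Unicode version shipped with Python's `unicodedata` module.

```python
import unicodedata


def code_point_type(cp: int) -> str:
    """Classify a code point into one of the basic types of 6.3."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("outside the UCS codespace")
    # 6.3.7: FDD0-FDEF and every code point ending in FFFE or FFFF.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    # 6.3.6: D800-DFFF.
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"
    gc = unicodedata.category(chr(cp))
    if gc == "Cc":
        return "control"
    if gc == "Co":
        return "private use"
    if gc in ("Cf", "Zl", "Zp"):
        return "format"
    if gc == "Cn":
        return "reserved"
    return "graphic"  # L, M, N, P, S, Zs


if __name__ == "__main__":
    for cp in (0x0041, 0x000A, 0x200E, 0xE000, 0xD800, 0xFFFE, 0x40000):
        print(f"{cp:06X} {code_point_type(cp)}")
```

The noncharacter test is done first because those code points also carry General Category Cn and would otherwise be reported as reserved.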
6.4 Naming of characters
6.5 Short identifiers for code points
+017F
U017F
U+017F
6.6 UCS sequence identifiers
6.7 Byte sequence identifiers
7 Revision and updating of the UCS
8 Subsets
8.1 Limited subset
8.2 Selected subset
9 UCS encoding forms
9.1 UTF-8
Scalar value | First byte | Second byte | Third byte | Fourth byte
00000000 0xxxxxxx | 0xxxxxxx | | |
00000yyy yyxxxxxx | 110yyyyy | 10xxxxxx | |
zzzzyyyy yyxxxxxx | 1110zzzz | 10yyyyyy | 10xxxxxx |
000uuuuu zzzzyyyy yyxxxxxx | 11110uuu | 10uuzzzz | 10yyyyyy | 10xxxxxx
Code points | First byte | Second byte | Third byte | Fourth byte
0000-007F | 00-7F | | |
0080-07FF | C2-DF | 80-BF | |
0800-0FFF | E0 | A0-BF | 80-BF |
1000-CFFF | E1-EC | 80-BF | 80-BF |
D000-D7FF | ED | 80-9F | 80-BF |
E000-FFFF | EE-EF | 80-BF | 80-BF |
10000-3FFFF | F0 | 90-BF | 80-BF | 80-BF
40000-FFFFF | F1-F3 | 80-BF | 80-BF | 80-BF
100000-10FFFF | F4 | 80-8F | 80-BF | 80-BF
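The UTF-8 bit distribution above can be written out directly as shifts and masks. The following sketch is an illustration, not a normative implementation; `encode_utf8` is a hypothetical name, and Python's built-in encoder is used only to cross-check the result.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one UCS scalar value using the UTF-8 bit distribution."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a UCS scalar value")
    if cp <= 0x7F:        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 110yyyyy 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # 1110zzzz 10yyyyyy 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110uuu 10uuzzzz 10yyyyyy 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    # U+1EA1 LATIN SMALL LETTER A WITH DOT BELOW falls in the 1000-CFFF row.
    print(encode_utf8(0x1EA1).hex(" ").upper())
    assert encode_utf8(0x1EA1) == "\u1ea1".encode("utf-8")
```

Because surrogate code points are rejected up front, the sequences ED A0..BF that the second table excludes can never be produced.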
9.2 UTF-16
Table: UTF-16 bit distribution (a scalar value xxxxxxxxxxxxxxxx maps to the
single 16-bit code unit xxxxxxxxxxxxxxxx)
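For scalar values above FFFF, UTF-16 uses a surrogate pair as defined in 4.31, 4.39 and 4.54. The sketch below shows the usual arithmetic; the function names are illustrative and not taken from the standard.

```python
def to_utf16_units(cp: int) -> list[int]:
    """Return the UTF-16 code unit(s) for one UCS scalar value."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a UCS scalar value")
    if cp <= 0xFFFF:
        return [cp]                   # one 16-bit code unit
    offset = cp - 0x10000             # 20 significant bits remain
    high = 0xD800 + (offset >> 10)    # leading unit, D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)   # trailing unit, DC00..DFFF
    return [high, low]


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a high/low surrogate code unit pair into a scalar value."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    units = to_utf16_units(0x1D11E)
    print([f"{u:04X}" for u in units])
```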
9.3 UTF-32 (UCS-4)
10 UCS encoding schemes
10.1 UTF-8
The UTF-8 encoding scheme serializes a UTF-8 code unit sequence in exactly the
same order as the code unit sequence itself.
When represented in UTF-8, the signature becomes the byte sequence <EF BB BF>.
Its use at the beginning of a UTF-8 data stream is neither required nor
recommended, but it does not affect conformance.
10.2 UTF-16BE
The UTF-16BE encoding scheme serializes a UTF-16 CC-data-element by ordering the
bytes so that the more significant byte precedes the less significant byte (also
known as big-endian ordering).
In UTF-16BE, an initial byte sequence <FE FF> is interpreted as FEFF ZERO WIDTH
NO-BREAK SPACE and does not convey a signature meaning.
10.3 UTF-16LE
The UTF-16LE encoding scheme serializes a UTF-16 CC-data-element by ordering the
bytes so that the less significant byte precedes the more significant byte (also
known as little-endian ordering).
In UTF-16LE, an initial byte sequence <FF FE> is interpreted as FEFF ZERO WIDTH
NO-BREAK SPACE and does not convey a signature meaning.
10.4 UTF-16
The UTF-16 encoding scheme serializes a UTF-16 CC-data-element by ordering the
bytes in either big-endian or little-endian order.
In the UTF-16 encoding scheme, an initial signature read as <FE FF> indicates
that the more significant byte precedes the less significant byte, and <FF FE>
indicates the reverse. The signature is not part of the textual data.
In the absence of a signature, the byte order of the UTF-16 encoding scheme is
that the more significant byte precedes the less significant byte.
10.5 UTF-32BE
The UTF-32BE encoding scheme serializes a UTF-32 CC-data-element by ordering the
bytes so that the more significant bytes precede the less significant bytes
(also known as big-endian ordering).
In UTF-32BE, an initial byte sequence <00 00 FE FF> is interpreted as FEFF ZERO
WIDTH NO-BREAK SPACE and does not convey a signature meaning.
10.6 UTF-32LE
The UTF-32LE encoding scheme serializes a UTF-32 CC-data-element by ordering the
bytes so that the less significant bytes precede the more significant bytes
(also known as little-endian ordering).
In UTF-32LE, an initial byte sequence <FF FE 00 00> is interpreted as FEFF ZERO
WIDTH NO-BREAK SPACE and does not convey a signature meaning.
10.7 UTF-32
The UTF-32 encoding scheme serializes a UTF-32 code unit sequence by ordering the
bytes in either big-endian or little-endian order.
In the absence of a signature, the byte order of the UTF-32 encoding scheme is
that the more significant bytes precede the less significant bytes.
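The difference between the unmarked UTF-16 scheme and the explicitly ordered UTF-16BE and UTF-16LE schemes can be sketched as below. The function name `decode_utf16_scheme` is illustrative. Python's own "utf-16" codec falls back to the platform byte order when no signature is present, so the sketch applies the big-endian default explicitly.

```python
def decode_utf16_scheme(data: bytes) -> str:
    """Decode bytes labelled with the UTF-16 encoding scheme (10.4)."""
    if data[:2] == b"\xfe\xff":
        # Signature: big-endian. It is not part of the text, so drop it.
        return data[2:].decode("utf-16-be")
    if data[:2] == b"\xff\xfe":
        # Signature: little-endian.
        return data[2:].decode("utf-16-le")
    # No signature: more significant byte first.
    return data.decode("utf-16-be")


if __name__ == "__main__":
    print(repr(decode_utf16_scheme(b"\xff\xfeA\x00")))
    # Under UTF-16BE the same initial bytes are text, not a signature (10.2).
    print(repr(b"\xfe\xff\x00A".decode("utf-16-be")))
```

The same pattern applies to UTF-32, with <00 00 FE FF> and <FF FE 00 00> as the two signatures; a combined detector would need to test the four-byte signatures before the two-byte ones, since <FF FE 00 00> begins with <FF FE>.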
11 Use of control functions with the UCS
0000 NULL
0005 ENQUIRY
0007 BELL
0008 BACKSPACE
0084 INDEX
12 Declaration of identification of features
Identical to the C0 set
ESC 02/02 F
Identical to the C1 set
13
The graphic symbols are to be regarded as typical visual representations of the
characters. TCVN ...:2010 does not attempt to prescribe the exact shape of each
character. The shape is affected by the design of the font used, which is
outside the scope of TCVN ...:2010.
Characters specified in TCVN ...:2010 are uniquely identified by their names.
This does not imply that the graphic symbols by which they are commonly imaged
are always different. Examples of graphic characters with similar graphic
symbols are LATIN CAPITAL LETTER A, GREEK CAPITAL LETTER ALPHA and CYRILLIC
CAPITAL LETTER A.
The meaning attributed to any character is not specified by TCVN ...:2010; it may
differ between scripts or between applications.
For alphabetic scripts, the general principle is to arrange the characters within
a row in approximate alphabetic sequence; where a script has capital and small
letters, these are arranged in pairs. However, this general principle is
overridden in some cases. For example, for scripts covered by a relevant existing
standard, the characters are allocated according to that standard. This
arrangement within the code charts aids conversion between the existing standards
and this coded character set. In general, however, it is anticipated that
conversion between this coded character set and any other coded character set
will use a table lookup technique.
It is not intended, nor will it often be the case, that the characters needed by
any one user are all found grouped together in one part of the code charts.
Furthermore, the user of any script will find that needed characters may have
been coded elsewhere in this coded character set. This applies especially to the
digits, to the symbols, and to the use of Latin letters in dual-script
applications.
Therefore, when using this coded character set, the reader is advised to refer to
the list of block names in the overview of the planes in figures 3 to 7, and then
to turn to the specific code chart for the relevant script and for the symbols
and digits.
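The point that characters are identified by name rather than by shape can be checked with the three look-alike capitals mentioned above. A small sketch using Python's standard `unicodedata` module:

```python
import unicodedata

# Three code points whose graphic symbols usually look the same.
lookalikes = ["\u0041", "\u0391", "\u0410"]
names = [unicodedata.name(ch) for ch in lookalikes]

if __name__ == "__main__":
    for ch, name in zip(lookalikes, names):
        print(f"{ord(ch):04X} {name}")
```

The three strings compare unequal even though they may render identically, which is why text comparison has to work on code points and not on appearance.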
14 Block and collection names
14.1 Block names
Named blocks of contiguous code points are specified within a plane for the
purpose of allocating characters that share some common characteristic, such as
a script. The blocks specified within the BMP, SMP, SIP and SSP are illustrated
in figures 2 to 6.
The rules used for constructing block names are given in 22.4.1.
14.2 Collection names
The rules used for constructing collection names are given in 22.4.2.
15
16 Special characters
16.1 Space characters
The following characters are space characters. They represent all characters
with a General Category value of Zs.
Code point | Name
0020 | SPACE
00A0 | NO-BREAK SPACE
1680 | OGHAM SPACE MARK
180E | MONGOLIAN VOWEL SEPARATOR
2000 | EN QUAD
2001 | EM QUAD
2002 | EN SPACE
2003 | EM SPACE
2004 | THREE-PER-EM SPACE
2005 | FOUR-PER-EM SPACE
2006 | SIX-PER-EM SPACE
2007 | FIGURE SPACE
2008 | PUNCTUATION SPACE
2009 | THIN SPACE
200A | HAIR SPACE
202F | NARROW NO-BREAK SPACE
205F | MEDIUM MATHEMATICAL SPACE
3000 | IDEOGRAPHIC SPACE
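The list of space characters can be regenerated from the General Category. This sketch uses Python's `unicodedata` module, whose data follows whatever Unicode version the interpreter ships with, so the result may differ slightly from the table above (for example, later Unicode versions no longer classify 180E as Zs).

```python
import sys
import unicodedata

# Every code point whose General Category is Zs (space separator).
space_characters = [
    cp for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Zs"
]

if __name__ == "__main__":
    for cp in space_characters:
        print(f"{cp:04X} {unicodedata.name(chr(cp))}")
```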
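16.3 Format characters
The following characters are format characters (see 6.3.3). They represent all
characters with a General Category value of Cf, Zl, or Zp. See Annex F.
Code point | Name
00AD | SOFT HYPHEN
0600 | ARABIC NUMBER SIGN
0601 | ARABIC SIGN SANAH
0602 | ARABIC FOOTNOTE MARKER
0603 | ARABIC SIGN SAFHA
06DD | ARABIC END OF AYAH
070F | SYRIAC ABBREVIATION MARK
17B4 | KHMER VOWEL INHERENT AQ
17B5 | KHMER VOWEL INHERENT AA
1A60 |
1CBF |
200B | ZERO WIDTH SPACE
200C | ZERO WIDTH NON-JOINER
200D | ZERO WIDTH JOINER
200E | LEFT-TO-RIGHT MARK
200F | RIGHT-TO-LEFT MARK
2028 | LINE SEPARATOR
2029 | PARAGRAPH SEPARATOR
202A | LEFT-TO-RIGHT EMBEDDING
202B | RIGHT-TO-LEFT EMBEDDING
202C | POP DIRECTIONAL FORMATTING
202D | LEFT-TO-RIGHT OVERRIDE
202E | RIGHT-TO-LEFT OVERRIDE
2060 | WORD JOINER
2061 | FUNCTION APPLICATION
2062 | INVISIBLE TIMES
2063 | INVISIBLE SEPARATOR
2064 | INVISIBLE PLUS
206A | INHIBIT SYMMETRIC SWAPPING
206B | ACTIVATE SYMMETRIC SWAPPING
206C | INHIBIT ARABIC FORM SHAPING
206D | ACTIVATE ARABIC FORM SHAPING
206E | NATIONAL DIGIT SHAPES
206F | NOMINAL DIGIT SHAPES
2D7F | TIFINAGH CONSONANT JOINER
FEFF | ZERO WIDTH NO-BREAK SPACE
FFF9 | INTERLINEAR ANNOTATION ANCHOR
FFFA | INTERLINEAR ANNOTATION SEPARATOR
FFFB | INTERLINEAR ANNOTATION TERMINATOR
110BD | KAITHI NUMBER SIGN
1D173 | MUSICAL SYMBOL BEGIN BEAM
1D174 | MUSICAL SYMBOL END BEAM
1D175 | MUSICAL SYMBOL BEGIN TIE
1D176 | MUSICAL SYMBOL END TIE
1D177 | MUSICAL SYMBOL BEGIN SLUR
1D178 | MUSICAL SYMBOL END SLUR
1D179 | MUSICAL SYMBOL BEGIN PHRASE
1D17A | MUSICAL SYMBOL END PHRASE
E0001 | LANGUAGE TAG
E0020-E007F |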
16.4 Ideographic description characters
An Ideographic Description Character (IDC) is a graphic character that is used
with a sequence of other graphic characters to form an Ideographic Description
Sequence (IDS). Such a sequence may be used to describe an ideographic character
that is not specified within this standard. Annex I describes them in more
detail. The list of IDCs is as follows:
Code point
Name
2FF0
2FF1
2FF2
2FF3
2FF4
2FF5
2FF6
2FF7
2FF8
2FF9
2FFA
2FFB
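The names for the twelve code points listed above can be looked up programmatically. This is a small sketch using Python's `unicodedata` module; the names come from the Unicode Character Database bundled with the interpreter, not from this document.

```python
import unicodedata

# The IDC range listed above: 2FF0 to 2FFB.
idc_names = {cp: unicodedata.name(chr(cp)) for cp in range(0x2FF0, 0x2FFC)}

if __name__ == "__main__":
    for cp, name in idc_names.items():
        print(f"{cp:04X} {name}")
```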
The following list describes the variant appearance that results from using the
appropriate variation selector with each allowed base mathematical symbol.
NOTE 3 - VARIATION SELECTOR-1 is the only variation selector used with
mathematical symbols.
Sequence (UID notation)
<2229, FE00>
<222A, FE00>
<2268, FE00>
<2269, FE00>
<2272, FE00>
<2273, FE00>
<228A, FE00>
<228B, FE00>
<2293, FE00>
<2294, FE00>
<2295, FE00>
<2297, FE00>
<229C, FE00>
<22DA, FE00>
<22DB, FE00>
<2A3C, FE00>
<2A3D, FE00>
<2A9D, FE00> SIMILAR OR LESS-THAN with similar following the slant of the upper leg
<2A9E, FE00> SIMILAR OR GREATER-THAN with similar following the slant of the upper leg
<2AAC, FE00>
<2AAD, FE00>
<2ACB, FE00>
<2ACC, FE00>
17 Presentation forms of characters
18 Compatibility characters
19 Order of characters
20 Combining characters
20.1 Order of combining characters
The coded representations of combining characters shall follow that of the
graphic character with which they are associated (for example, the coded
representation of LATIN SMALL LETTER A followed by COMBINING TILDE represents a
composite sequence for the Latin letter "ã").
If a combining character is to be regarded as a composite sequence in its own
right, it shall be coded as a composite sequence by association with the
character 00A0 NO-BREAK SPACE. For example, a grave accent can be produced by
00A0 NO-BREAK SPACE followed by 0300 COMBINING GRAVE ACCENT.
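The two composite sequences described in 20.1 can be written out and inspected as follows. This is an illustrative sketch; the variable names are arbitrary.

```python
import unicodedata

# Base character first, combining character after it.
a_tilde = "a\u0303"             # LATIN SMALL LETTER A + COMBINING TILDE
# A combining mark shown on its own is carried by NO-BREAK SPACE.
isolated_grave = "\u00a0\u0300"  # NO-BREAK SPACE + COMBINING GRAVE ACCENT

if __name__ == "__main__":
    for text in (a_tilde, isolated_grave):
        print([unicodedata.name(ch) for ch in text])
    # The composite sequence is equivalent to the precomposed letter.
    print(unicodedata.normalize("NFC", a_tilde) == "\u00e3")
```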
20.2 Combining class and canonical ordering
Each combining character has a combining class value defined by the Unicode
Character Database (see clause 3). The combining class is used to determine the
canonical order, which is part of the normalization process (see 21). Canonical
ordering consists of sorting combining characters in ascending order of their
combining class. Combining characters whose combining class is zero are not
reordered relative to other characters.
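Canonical ordering as described in 20.2 can be sketched as a stable sort over each run of characters with a non-zero combining class. The function name `canonical_order` is illustrative; the combining class values come from Python's `unicodedata.combining`.

```python
import unicodedata


def canonical_order(text: str) -> str:
    """Sort each run of non-zero combining classes in ascending order."""
    chars = list(text)
    i = 0
    while i < len(chars):
        if unicodedata.combining(chars[i]) == 0:
            i += 1              # class zero: never reordered
            continue
        j = i
        while j < len(chars) and unicodedata.combining(chars[j]) != 0:
            j += 1
        # sorted() is stable, so marks of equal class keep their order.
        chars[i:j] = sorted(chars[i:j], key=unicodedata.combining)
        i = j
    return "".join(chars)


if __name__ == "__main__":
    # COMBINING ACUTE ACCENT (class 230) typed before COMBINING DOT BELOW (220).
    typed = "a\u0301\u0323"
    print([f"{ord(c):04X}" for c in canonical_order(typed)])
```

Stability matters here: marks with the same combining class interact typographically, so their relative order carries meaning and has to be preserved.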
20.5 Multiple combining characters
There are cases where more than one combining character is applied to a single
graphic character. TCVN ...:2010 does not restrict the number of combining
characters that may follow a base character. The following rules apply to the
presentation of these characters:
a) If the combining characters can interact in presentation (for example,
COMBINING MACRON and COMBINING DIAERESIS), the position of the combining
characters in the resulting graphic display is determined by the order of the
coded representations of the combining characters. The presentations of
combining characters are positioned from the base character outward. For
example, combining characters placed above a base character are stacked
vertically, starting with the first one encountered in the sequence of coded
representations and continuing for as many marks above as are required by the
coded combining characters following the coded base character. For combining
characters placed below a base character, the situation is inverted, with the
combining characters starting from the base character and stacking downward.
An example of multiple combining characters above a base character is found in
Thai, where a consonant letter can have above it one of the vowels 0E34 to 0E37
and, above that, one of the four tone marks 0E48 to 0E4B. The order of the coded
representation is: base consonant, followed by a vowel, followed by a tone mark.
b) Some specific combining characters override the default stacking behaviour by
being positioned horizontally rather than stacked, or by forming a ligature with
an adjacent combining character. When positioned horizontally, the order of the
coded representations is reflected by positioning in the dominant order of the
script in which they are used. For example, horizontal accents in a
left-to-right script are coded left to right.
Prominent characters that show such override behaviour are associated with
specific scripts or alphabets. For example, COMBINING GREEK KORONIS (0343)
requires that, together with a following acute or grave accent, the marks be
rendered side by side above the letter, rather than with the accent stacked
above the COMBINING GREEK KORONIS. The order of the coded representations is:
the letter itself, followed by the breathing mark, followed by the accent. Two
Vietnamese tone marks, which have the same graphic appearance as the Latin acute
and grave accents, do not stack above the three Vietnamese vowel letters that
already carry the circumflex diacritic (â, ê, ô). Instead, they form ligatures
with the circumflex component of the vowel letter.
c) If the combining characters do not interact in presentation (for example,
when one combining character is above a graphic character and another is below),
the graphic symbols that result from the base character and the combining
characters in different orders may appear the same. For example, the coded
representation of LATIN SMALL LETTER A, followed by COMBINING CARON, followed by
COMBINING OGONEK may produce the same graphic symbol as the coded representation
of LATIN SMALL LETTER A, followed by COMBINING OGONEK, followed by COMBINING
CARON.
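Rules a) and c) can be observed through normalization: marks that do not interact have different combining classes and normalize to one order, while marks that do interact share a class and keep the order in which they were coded. A short sketch, using Python's `unicodedata` module:

```python
import unicodedata


def nfd(text: str) -> str:
    """Canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Rule c): caron (above) and ogonek (below) do not interact.
caron_then_ogonek = "a\u030c\u0328"
ogonek_then_caron = "a\u0328\u030c"

# Rule a): macron and diaeresis both sit above and do interact.
macron_then_diaeresis = "a\u0304\u0308"
diaeresis_then_macron = "a\u0308\u0304"

if __name__ == "__main__":
    print(nfd(caron_then_ogonek) == nfd(ogonek_then_caron))
    print(nfd(macron_then_diaeresis) == nfd(diaeresis_then_macron))
```

The first comparison holds and the second does not, so the two orders in rule c) are canonically equivalent while the two orders in rule a) remain distinct texts.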
20.7 Combining grapheme joiner
The character 034F COMBINING GRAPHEME JOINER is used to indicate that adjacent
characters are to be treated as a unit for the purposes of language-sensitive
collation and searching. In language-sensitive collation and searching, the
combining grapheme joiner should be ignored unless it occurs specifically within
a tailored collation element mapping. For rendering, the combining grapheme
joiner is invisible.
NOTE 1 - The combining grapheme joiner may be used to distinguish two uses of a
combining character by applying it in one of the two cases. For example, where a
German umlaut must be distinguished from a tréma, COMBINING GRAPHEME JOINER
(034F) followed by COMBINING DIAERESIS (0308) should be used to represent the
tréma, while COMBINING DIAERESIS (0308) alone should be used to represent the
German umlaut.
21 Normalization forms
Normalization forms are mechanisms that allow the selection of a unique coded representation from among several alternative coded representations that are all equivalent representations of the same text. The normalization forms for use with TCVN...:2010 are specified in the Unicode Standard UAX#15 (see clause 3).
NOTE 1 By definition, the result of applying any of these normalization forms is stable over time. This means that a normalized representation of text remains normalized when this standard is amended.
NOTE 2 Some normalization forms favour composite sequences over shorter representations of text, while others favour the shorter representations. The backward compatibility requirement is met by establishing TCVN...:2010 as the reference version for the definition of the shorter representations of text. Its repertoire is identical to the fixed collection UNICODE 3.2.
NOTE 3 The purpose of normalization is to provide a unique normalized result for any given text sequence in order to facilitate, among other things, identity matching. A normalization form does not necessarily represent the optimal sequence from a linguistic point of view.
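As an illustration of the forms defined in UAX#15, the following sketch normalizes two coded representations of the same Vietnamese letter. It assumes Python's standard `unicodedata` module; the helper name `identical_when_normalized` is hypothetical.

```python
import unicodedata

# Two coded representations of the same Vietnamese letter.
precomposed = "\u1EC7"        # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
decomposed = "e\u0302\u0323"  # e + COMBINING CIRCUMFLEX ACCENT + COMBINING DOT BELOW


def identical_when_normalized(left: str, right: str, form: str) -> bool:
    """Identity matching on the normalized representations."""
    return unicodedata.normalize(form, left) == unicodedata.normalize(form, right)


# NFC favours the shorter, precomposed representation;
# NFD favours the fully decomposed sequence in canonical order.
nfc_result = unicodedata.normalize("NFC", decomposed)
nfd_result = unicodedata.normalize("NFD", precomposed)
```

Comparing text only after normalizing both sides to the same form is the identity matching that NOTE 3 refers to.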
22 Character names and annotations
22.1 Entity names
This standard specifies names for the following types of entity:
characters
named UCS sequence identifiers (see 25)
blocks (see 14)
collections
SPACE, HYPHEN-MINUS, and FULL STOP may appear only between two alphanumeric characters (LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z, DIGIT ZERO to DIGIT NINE) in collection names.
EXAMPLE 2 The following collection name contains FULL STOP between two digits, DIGIT FOUR and DIGIT ONE: UNICODE 4.1
EXAMPLE 3 The following collection name contains FULL STOP between a Latin letter, LATIN CAPITAL LETTER D, and a digit, DIGIT SEVEN: BMP-AMD.7
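The separator rule for collection names can be sketched as a regular expression. This is only an illustration of the one rule quoted above, not a complete validator for entity names. The function name `separators_well_placed` is hypothetical, and treating each separator as a single character is an assumption.

```python
import re

# Runs of A-Z or 0-9, joined by exactly one SPACE, HYPHEN-MINUS or FULL STOP.
# Requiring an alphanumeric on both sides of each separator is the rule under test.
_COLLECTION_NAME = re.compile(r"[A-Z0-9]+(?:[ .\-][A-Z0-9]+)*")


def separators_well_placed(name: str) -> bool:
    """True if every separator sits between two alphanumeric characters."""
    return _COLLECTION_NAME.fullmatch(name) is not None
```

With this sketch, `UNICODE 4.1` and `BMP-AMD.7` are accepted, while a name that starts or ends with a separator is rejected.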
22.3 Single name
Each entity named in this standard shall be given exactly one name.
NOTE This does not preclude the informative use of name aliases or acronyms for the sake of clarity. However, the normative entity name will be unique.
Block names constitute a name space. Each block name shall be unique and distinct from the other block names specified in this standard.
22.4.2 Collection names
Collection names constitute a name space. Each collection name shall be unique and distinct from all other collection names specified in this standard.
22.4.3 Character names and named UCS sequence identifiers
23 Structure of the Basic Multilingual Plane
[Figure: overview of the Basic Multilingual Plane, showing the allocation of blocks such as Basic Latin, Latin-1 Supplement, Latin Extended Additional, Khmer, CJK Radicals Supplement and Kangxi Radicals]
24
Because another supplementary plane is reserved for additional CJK ideographs, the SMP (plane 1) has not been used to date for encoding CJK ideographs. Instead, the SMP is used for encoding graphic characters used in other scripts of the world that are not yet encoded in the BMP. Most, but not all, of the scripts encoded in the SMP are not living scripts in use by modern user communities.
An overview of the Supplementary Multilingual Plane for scripts and symbols is shown in Figure 7.
25
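The SIP (plane 2) is used for CJK unified ideographs that are not encoded in the BMP. The unification procedures and the rules for their arrangement are described in Annex S.
The SIP is also used for CJK compatibility ideographs. These ideographs are compatibility characters as specified in 18.
The following figure shows an overview of the Supplementary Ideographic Plane.
[Figure: overview of the Supplementary Ideographic Plane by row, 00 to FF, with boundaries marked around rows A6/A7, B7/B8, F8, FA and FB; shaded areas are reserved for future standardization]
NOTE Vertical boundaries within rows indicate approximate positions only.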
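The plane and row of a code point follow directly from its value, as the following minimal sketch shows. The function names `plane_of` and `row_of` are hypothetical; the plane numbers for the BMP, SMP and SIP are those given in the text.

```python
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP"}


def plane_of(code_point: int) -> int:
    """Plane number: the bits above the low 16 bits of the code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the UCS codespace")
    return code_point >> 16


def row_of(code_point: int) -> int:
    """Row within the plane: the high octet of the low 16 bits."""
    return (code_point >> 8) & 0xFF


def describe(code_point: int) -> str:
    """Short label such as 'SIP row A7' (planes above 2 are shown by number)."""
    plane = plane_of(code_point)
    label = PLANE_NAMES.get(plane, f"plane {plane}")
    return f"{label} row {row_of(code_point):02X}"
```

For example, `describe(0x2A700)` places that code point in row A7 of the SIP, one of the row boundaries marked in the overview figure.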
26
27 Code tables for Vietnamese script characters
28 International names of Vietnamese script characters
Code  Name
0000  NULL
0001
0002
0003
0004
0005  ENQUIRY
0006  ACKNOWLEDGE
0007  BELL
0008  BACKSPACE
0009
000A
000B
000C
000D
000E
000F  SHIFT IN
0010
0011
0012
0013
0014
0015
0016
0017
0018  CANCEL
0019
001A  SUBSTITUTE
001B  ESCAPE
001C
001D
001E
001F
0020  SPACE
0021  EXCLAMATION MARK
0022  QUOTATION MARK
0023  NUMBER SIGN
0024  DOLLAR SIGN
0025  PERCENT SIGN
0026  AMPERSAND
0027  APOSTROPHE
0028  LEFT PARENTHESIS
0029  RIGHT PARENTHESIS
002A  ASTERISK
002B  PLUS SIGN
002C  COMMA
002D  HYPHEN-MINUS
002E  FULL STOP
002F  SOLIDUS
0030  DIGIT ZERO
0031  DIGIT ONE
004E
0032
DIGIT TWO
004F
0033
DIGIT THREE
0050
0034
DIGIT FOUR
0051
0035
DIGIT FIVE
0052
0036
DIGIT SIX
0053
0037
DIGIT SEVEN
0054
0038
DIGIT EIGHT
0055
0039
DIGIT NINE
0056
003A
COLON
0057
003B
SEMICOLON
0058
003C
<
LESS-THAN SIGN
0059
003D
EQUALS SIGN
005A
003E
>
GREATER-THAN SIGN
005B
003F
QUESTION MARK
005C
REVERSE SOLIDUS
0040
COMMERCIAL AT
005D
0041
005E
CIRCUMFLEX ACCENT
0042
005F
LOW LINE
0043
0060
GRAVE ACCENT
0044
0061
0045
0062
0046
0063
0047
0064
0048
0065
0049
0066
004A
0067
004B
0068
004C
0069
004D
006A
0323
006C
006D
0102
006E
00C2
006F
00CA
0070
00D4
0071
01A0
0072
01AF
0073
0110
0074
00C0
0075
1EA2
0076
00C3
0077
00C1
0078
1EA0
0079
1EB0
007A
1EB2
007B
1EB4
007C
VERTICAL LINE
1EAE
007D
1EB6
007E
TILDE
1EA6
007F
DELETE
1EA8
NO-BREAK SPACE
1EAA
00A0
0300
1EA4
0301
1EAC
0302
COMBINING CIRCUMFLEX
ACCENT
00C8
0303
COMBINING TILDE
1EBA
COMBINING BREVE
1EBC
00C9
COMBINING HORN
1EB8
0306
0309
031B
1EC2
1EC4
1EBE
1EC6
00CC
1EC8
0128
00CD
1ECA
00D2
1ECE
00D5
00D3
1ECC
1ED2
1ED4
1ED6
1ED0
1ED8
1EDC
1EDE
1EE0
1EDA
1EE2
00D9
1EE6
0168
00DA
1EE4
1EEA
1EEC
1EEE
1EE8
1EF0
1EF2
1EF6
1EF8
00DD
1EF4
0103
00E2
00EA
00F4
01A1
01B0
0111
00E0
1EA3
00E3
00E1
1EA1
1EB1
1EB3
1EB5
1EAF
1EB7
1EA9
1EAB
1EA5
1EAD
00E8
1EBB
1EBD
00E9
1EB9
1EC1
1EC3
1EC5
1EBF
1EC7
00EC
1EC9
0129
00ED
1ECB
00F2
1ECF
00F5
00F3
1ECD
1ED3
1ED5
1ED7
1ED1
1ED9
1EDD
1EDF
1EE1
1EDB
1EE3
00F9
1EE7
0169
00FA
1EE5
1EEB
1EED
1EEF
1EE9
1EF1
1EF3
1EF7
1EF9
00FD
1EF5
201C
201D
20AB
DONG SIGN
25CC
DOTTED CIRCLE
1799
KHMER LETTER YO
KHMER LETTER KA
179A
KHMER LETTER RO
1781
179B
KHMER LETTER LO
1782
KHMER LETTER KO
179C
KHMER LETTER VO
1783
179D
1784
179E
1785
KHMER LETTER CA
179F
KHMER LETTER SA
1786
17A0
KHMER LETTER HA
1787
KHMER LETTER CO
17A1
KHMER LETTER LA
1788
17A2
KHMER LETTER QA
1789
17A3
178A
KHMER LETTER DA
17A4
178B
17A5
178C
KHMER LETTER DO
17A6
178D
17A7
178E
17A8
178F
KHMER LETTER TA
17A9
1790
17AA
1791
KHMER LETTER TO
17AB
1792
17AC
1793
KHMER LETTER NO
17AD
1794
KHMER LETTER BA
17AE
1795
17AF
1796
KHMER LETTER PO
17B0
1797
17B1
1798
KHMER LETTER MO
17B2
Ch
1780
KHMER INDEPENDENT VOWEL QUK
KHMER INDEPENDENT VOWEL QUU
KHMER INDEPENDENT VOWEL QUUV
17D0
17B4
17D1
17B5
17D2
17B6
17D3
17B7
17C4
17B8
17D5
17B9
17D6
17BA
17D7
17BB
17D8
17BC
17D9
17BD
17DA
17BE
17DB
17BF
17DC
17C0
17DD
17C1
17E0
17C2
17E1
17C3
17E2
17C4
17E3
17C5
17E4
17C6
17E5
17C7
17E6
17C8
17E7
17C9
17E8
17CA
17E9
17CB
17F0
17CC
17F1
17CD
17F2
17CE
17F3
17CF
17F4
17F6
17F7
19F0
17F8
19F1
17F9
19F2
19E0
19F3
19E1
19F4
19E2
19F5
19E3
19F6
19E4
19F7
19E5
19F8
19E6
19F9
19E7
19FA
19E8
19FB
19E9
19FC
19EA
19FD
19EB
19FE
19EC
19FF
19ED
19EE
19EF
CHAM LETTER PA
CHAM LETTER A
AA1B
CHAM LETTER I
AA1C
AA02
CHAM LETTER U
AA1D
CHAM LETTER BA
AA03
CHAM LETTER E
AA1E
AA04
CHAM LETTER AI
AA1F
AA05
CHAM LETTER O
AA20
CHAM LETTER MA
AA06
CHAM LETTER KA
AA21
AA07
AA22
CHAM LETTER YA
AA08
CHAM LETTER GA
AA23
CHAM LETTER RA
AA09
AA24
CHAM LETTER LA
AA0A
AA25
CHAM LETTER VA
AA0B
AA26
AA0C
AA27
CHAM LETTER SA
AA0D
AA28
CHAM LETTER HA
AA0E
CHAM LETTER JA
AA29
AA0F
AA2A
AA10
AA2B
AA11
AA2C
AA12
AA2D
AA13
CHAM LETTER TA
AA2E
AA14
AA2F
AA15
CHAM LETTER DA
AA30
AA16
AA31
AA17
AA32
AA18
CHAM LETTER NA
AA33
AA19
AA34
Ch
AA00
AA01
Tn
AA4D
AA36
AA50
AA40
AA51
AA41
AA52
AA42
AA53
AA43
AA54
AA44
AA55
AA45
AA56
AA46
AA57
AA47
AA58
AA48
AA59
AA49
AA5C
AA4A
AA5D
AA4B
AA5E
AA4C
AA5F
AA9A
AA9B
AA82
AA9C
AA83
AA9D
AA84
AA9E
AA85
AA9F
AA86
AAA0
AA87
AAA1
AA88
AAA2
AA89
AAA3
AA8A
AAA4
AA8B
AAA5
AA8C
AAA6
AA8D
AAA7
AA8E
AAA8
AA8F
AAA9
AA90
AAAA
AA91
AAAB
AA92
AAAC
AA93
AAAD
AA94
AAAE
AA95
AAAF
AA96
AAB0
AA97
AAB1
AA98
AAB2
Ch
AA80
AA81
Tn
AABE
AAB4
AABF
AAB5
AAC0
AAB6
AAC1
AAB7
AAC2
AAB8
AAC3
TAI VIET
AAB9
AADB
AABA
AADC
AABB
AADD
AABC
AADE
AABD
AADF
28.5 International names of Han Nom characters
28.5.1 Radicals
Code  Name
2E80  CJK RADICAL REPEAT
2E81  CJK RADICAL CLIFF
2E82  CJK RADICAL SECOND ONE
2E83  CJK RADICAL SECOND TWO
2E84  CJK RADICAL SECOND THREE
2E85  CJK RADICAL PERSON
2E86  CJK RADICAL BOX
2E87  CJK RADICAL TABLE
2E88  CJK RADICAL KNIFE ONE
2E89  CJK RADICAL KNIFE TWO
2E8A  CJK RADICAL DIVINATION
2E8B  CJK RADICAL SEAL
2E8C  CJK RADICAL SMALL ONE
2E8D  CJK RADICAL SMALL TWO
2E8E  CJK RADICAL LAME ONE
2E8F  CJK RADICAL LAME TWO
2E90  CJK RADICAL LAME THREE
2E91  CJK RADICAL LAME FOUR
2E92  CJK RADICAL SNAKE
2E93  CJK RADICAL THREAD
2E94  CJK RADICAL SNOUT ONE
2E95  CJK RADICAL SNOUT TWO
2E96  CJK RADICAL HEART ONE
2E97  CJK RADICAL HEART TWO
2E98  CJK RADICAL HAND
2E99  CJK RADICAL RAP
2E9A
2E9B
2E9C
2E9D
2E9E
2E9F
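The international names listed in these tables can be cross-checked against a Unicode character database. The sketch below assumes Python's standard `unicodedata` module, whose data follows the Unicode version bundled with the interpreter rather than this standard. The function `international_name` is a hypothetical helper.

```python
import unicodedata


def international_name(code_point: int) -> str:
    """Return the character name, or an empty string when none is available
    (for example, for control characters or unassigned code points)."""
    return unicodedata.name(chr(code_point), "")


def name_table(first: int, last: int) -> list:
    """Build (code, name) rows in the 'Code  Name' layout used by the tables."""
    return [(f"{cp:04X}", international_name(cp)) for cp in range(first, last + 1)]


radical_rows = name_table(0x2E80, 0x2E99)
```

Rows that come back with an empty name are the ones to review by hand against the printed code tables.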
2EA0
2EA1
2EA2
2EA3
2EA4
2EA5
2EA6
2EA7
2EA8
2EA9
2EAA
2EAB
2EAC
2EAD
2EAE
2EAF
2EB0
2EB1
2EB2
2EB3
2EB4
2EB5
2EB6
2EB7
2EB8
2EB9
2EBA
2EBB
2EBC
2EBD
2EBE
2EBF
2EE5
2EE6
2EE7
2EE8
2EE9
2EEA
2EEB
2EEC
2EEF
2EF0
2EF1
2EF2
2EF3
2F00
2F01
2F02
2F03
2F04
2F05
2F06
2F07
2F08
2F09
2F0A
2F0C
2F0D
2F0E
2F0F
2F10
2F11
2F12
2F13
2F14
2F15
2EED
2EEE
2F0B
2F3C
2F3D
2F3E
2F3F
2F40
2F41
2F42
2F43
2F44
2F45
KANGXI RADICAL GO
2F46
2F47
2F48
2F49
2F4A
2F4B
2F4C
2F4D
2F4E
2F4F
2F50
2F51
2F52
2F53
2F54
2F55
2F56
2F57
2F58
2F59
2F5A
2F5B
2F5C
2F5D
2F5E
2F5F
2F3B
2F86
2F87
2F88
2F89
2F8A
2F8B
2F8C
2F8D
2F8E
2F8F
2F90
2F91
2F92
2F93
2F94
2F95
2F96
2F97
2F98
2F99
2F9A
2F9B
2F9C
2F9D
2F9E
2F9F
2FA0
2FA1
2FA2
2FA3
2FA4
2FA5
2FA6
2FA7
2FA8
2FA9
2FAA
2FCD
2FCF
2FD0
2FD1
2FD2
2FF0
2FF1
2FF2
2FF3
2FF4
2FF5
2FCE
2FD3
2FD4
2FD5
2FF6
2FF7
2FF8
2FF9
2FFA
2FFB
...
4E00
4E01
4E03
4E07
4E08
4E09
4E0A
4E0B
4E0D
4E0E
4E10
4E11
...
9F90
9F95
9F9C
FA24
...
2A69A
2A6A4
2A6C5
2A6C7
EXTENSION C
2A700
2A964
2B52C
2A709
2A712
2A715
2A718
2A71A
...
2B6CE
2B6D0
2B6D5
2B708
2B70D
2B717
2B727
Annex A
Ideographic Description Characters
(informative)
An Ideographic Description Character (IDC) is a graphic character that is used together with a sequence of other graphic characters to form an Ideographic Description Sequence (IDS). Such a sequence may be used to describe an ideographic character that is not yet specified in this standard or in International Standards.
An IDS describes an ideograph in abstract form. It is not interpreted as a composed character and does not imply any particular form of rendering.
NOTE An IDS is not a character and is therefore not a member of the repertoire of ISO/IEC 10646.
A Description Component (DC) may be any one of the following:
a coded ideograph
a coded radical
another IDS
NOTE 1 The above description implies that any IDS may be nested within another IDS.
the definition of its acronym,
Number of DCs  IDS acronym and syntax  Name
2              IDC-LTR D1 D2           LEFT TO RIGHT
2              IDC-ATB D1 D2           ABOVE TO BELOW
3              IDC-LMR D1 D2 D3        LEFT TO MIDDLE AND RIGHT
3              IDC-AMB D1 D2 D3        ABOVE TO MIDDLE AND BELOW
2              IDC-FSD D1 D2           FULL SURROUND
2              IDC-SAV D1 D2           SURROUND FROM ABOVE
2              IDC-SBL D1 D2           SURROUND FROM BELOW
2              IDC-SLT D1 D2           SURROUND FROM LEFT
2              IDC-SUL D1 D2           SURROUND FROM UPPER LEFT
2              IDC-SUR D1 D2           SURROUND FROM UPPER RIGHT
2              IDC-SLL D1 D2           SURROUND FROM LOWER LEFT
2              IDC-OVL D1 D2           OVERLAID
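The syntax above lends itself to a small recursive check. The following is a minimal sketch, assuming the twelve IDCs occupy code points 2FF0 to 2FFB in Unicode block order, so that 2FF2 and 2FF3 are the two three-component IDCs (IDC-LMR and IDC-AMB). Any character that is not an IDC is accepted as a coded ideograph or radical, which is a simplification. The function names are hypothetical.

```python
TERNARY_IDCS = {0x2FF2, 0x2FF3}                           # IDC-LMR, IDC-AMB
BINARY_IDCS = set(range(0x2FF0, 0x2FFC)) - TERNARY_IDCS   # the other ten IDCs


def _arity(char: str) -> int:
    """Number of DCs required after this character (0 if it is not an IDC)."""
    cp = ord(char)
    if cp in TERNARY_IDCS:
        return 3
    if cp in BINARY_IDCS:
        return 2
    return 0


def _consume(seq: str, pos: int) -> int:
    """Return the index just past one DC starting at pos, or -1 if it is incomplete."""
    if pos >= len(seq):
        return -1
    needed = _arity(seq[pos])
    pos += 1
    for _ in range(needed):   # a nested IDS is consumed recursively
        pos = _consume(seq, pos)
        if pos < 0:
            return -1
    return pos


def is_well_formed_ids(seq: str) -> bool:
    """True if seq is exactly one IDS: an IDC followed by the DCs it requires."""
    return bool(seq) and _arity(seq[0]) > 0 and _consume(seq, 0) == len(seq)
```

Because a DC may itself be an IDS, the recursion accepts nested sequences such as an ABOVE TO BELOW description whose second component is a LEFT TO RIGHT description.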
Annex B
Character naming guidelines
(informative)
Clause 22 of this standard specifies the rules for name formation and name uniqueness. These rules are used in other information technology coded character set standards such as ISO/IEC 646, ISO/IEC 6937, ISO/IEC 8859, and ISO/IEC 10367. This annex provides additional guidelines for creating these unique entity names.
These guidelines do not apply to the names of CJK ideographs and Hangul syllables, which are formed by using the rules specified in 22.5 and 22.6 respectively.
Guideline 1
Name                     Acronym
LOCKING-SHIFT TWO RIGHT  LS2R
SOFT HYPHEN              SHY
                         IPA
NOTE In ISO/IEC 6429, the names of the modes are also presented in the same way as the control functions.
Guideline 3
Guideline 4
1 Script
2 Case
3 Type
4 Language
5 Attribute
6 Designation
7 Mark(s)
8 Qualifier
EXAMPLES OF SUCH TERMS
Script: Latin, Cyrillic, Arabic
Case: capital, small
Type: letter, ligature, digit
Language: Ukrainian
Attribute: final, sharp, subscript, vulgar
Designation: customary name, name of letter
Mark: acute, ogonek, ring above, diaeresis
Qualifier: sign, symbol
EXAMPLES OF NAMES
LATIN CAPITAL LETTER A WITH ACUTE: 1 2 3 6 7
DIGIT FIVE: 3 6
LEFT CURLY BRACKET: 5 5 6
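The ordering of terms in the name examples can be sketched as a small composition function. This is an illustration only: the function `compose_name` is hypothetical, the term order is inferred from the numbered examples, and joining a mark with the word WITH is an assumption taken from the single example LATIN CAPITAL LETTER A WITH ACUTE. The check against `unicodedata.lookup` relies on Python's bundled Unicode data.

```python
import unicodedata

# Term categories in the order in which they appear in a name.
TERM_ORDER = ("script", "case", "type", "language",
              "attribute", "designation", "mark", "qualifier")


def compose_name(**terms: str) -> str:
    """Join the supplied terms in the fixed category order."""
    unknown = set(terms) - set(TERM_ORDER)
    if unknown:
        raise ValueError(f"unknown term categories: {sorted(unknown)}")
    parts = []
    for category in TERM_ORDER:
        value = terms.get(category)
        if not value:
            continue
        # Assumption: a mark is attached to the name with the word WITH.
        parts.append(f"WITH {value}" if category == "mark" else value)
    return " ".join(parts)


a_acute = compose_name(script="LATIN", case="CAPITAL", type="LETTER",
                       designation="A", mark="ACUTE")
digit_five = compose_name(type="DIGIT", designation="FIVE")
bracket = compose_name(attribute="LEFT CURLY", designation="BRACKET")
```

Looking the composed strings up with `unicodedata.lookup` is one way to confirm that a composed name matches an existing character name.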
NOTE 1 A ligature is a graphic symbol in which two or more other graphic symbols are imaged as a single graphic symbol.
Guideline 5
Guideline 6
Guideline 7
Guideline 8
MICRO SIGN
Guideline 10
APOSTROPHE
COLON
COMMERCIAL AT
LOW LINE
TILDE
Guideline 11
UNDERTIE (Enotikon)
Annex C
Procedures for the unification and arrangement of CJK ideographs
(informative)
The graphic character collections of unified ideographs in ISO/IEC 10646 are specified in 30. They are derived from the many ideographs found in various national and regional coded character set standards (the "sources").
This annex describes how the ideographs in this standard are derived from the sources by applying a set of unification procedures. It also describes how the ideographs in this standard are arranged in the sequence of consecutive code points to which they are assigned.
The source references for the CJK unified ideographs are specified in 22.1.
Within the context of ISO/IEC 10646, a unification process is applied to the ideographic characters taken from the codes in the source groups. In this process, single ideographs from two or more source groups are associated with one another, and a single code point is assigned to them in this standard. The associations are made according to a set of procedures described below. Ideographs that are associated in this way are described here as "unified".
NOTE The unification process does not apply to the following collections of ideographs:
CJK RADICALS SUPPLEMENT (2E80 - 2EFF)
KANGXI RADICALS (2F00 - 2FDF)
CJK COMPATIBILITY IDEOGRAPHS (F900 - FAFF, with the exception of FA0E, FA0F, FA11, FA13, FA14, FA1F, FA21, FA23, FA24, FA27, FA28 and FA29)
CJK COMPATIBILITY IDEOGRAPHS SUPPLEMENT (2F800 - 2FA1F).
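The note above amounts to a range test with a short exception list, sketched below. The function name `outside_unification` is hypothetical, and the lower bound 2E80 for CJK RADICALS SUPPLEMENT is taken from the Unicode block range.

```python
# Code points in the compatibility block that are excepted in the note.
UNIFIED_EXCEPTIONS = frozenset({
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
})

# Collections to which the unification process does not apply.
EXCLUDED_RANGES = (
    (0x2E80, 0x2EFF),    # CJK RADICALS SUPPLEMENT (lower bound assumed, see above)
    (0x2F00, 0x2FDF),    # KANGXI RADICALS
    (0xF900, 0xFAFF),    # CJK COMPATIBILITY IDEOGRAPHS
    (0x2F800, 0x2FA1F),  # CJK COMPATIBILITY IDEOGRAPHS SUPPLEMENT
)


def outside_unification(code_point: int) -> bool:
    """True if the code point lies in a collection excluded from unification."""
    if code_point in UNIFIED_EXCEPTIONS:
        return False
    return any(low <= code_point <= high for low, high in EXCLUDED_RANGES)
```

A code point such as FA0E falls inside the compatibility block but is treated as excepted, so the function returns False for it.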
士, 土
NOTE The difference in shape between the two ideographs in the above example is the length of the lower horizontal stroke. This is considered an actual difference of shape. Furthermore, these ideographs have different meanings. The meaning of the first is "sĩ" (scholar, soldier) and the meaning of the second is "thổ" (soil, earth).
C.1.3 Procedure
The unification procedures are used to determine whether two ideographs have the same abstract shape or are distinct characters. The unification procedures have two stages, applied in the following order:
a) Analysis of component structure;
b) Analysis of component features;
C.1.3.1 Analysis of component structure
If one or more of the features a) to c) above differ between the ideographs being compared, the ideographs are considered to have different abstract shapes and are therefore not unified.
If all of the features a) to c) above are the same between the ideographs, the ideographs are considered to have the same abstract shape and are therefore unified.
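The decision rule can be sketched as a comparison of three features. The list of features a) to c) did not survive in this text, so the three fields below are assumptions: relative position and structure of corresponding components are suggested by the headings C.1.4.2 and C.1.4.3, and the number of components is a guess for the remaining one. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ComponentAnalysis:
    """Assumed features a) to c) of one ideograph (illustrative only)."""
    component_count: int                    # assumed feature a)
    relative_positions: Tuple[str, ...]     # assumed feature b)
    component_structures: Tuple[str, ...]   # assumed feature c)


def same_abstract_shape(left: ComponentAnalysis, right: ComponentAnalysis) -> bool:
    """Unify only when every feature matches; any single difference blocks it."""
    return (
        left.component_count == right.component_count
        and left.relative_positions == right.relative_positions
        and left.component_structures == right.component_structures
    )


# Toy data: the same two components arranged side by side or stacked.
side_by_side = ComponentAnalysis(2, ("left", "right"), ("x", "y"))
stacked = ComponentAnalysis(2, ("above", "below"), ("x", "y"))
```

Here the two toy analyses share the same components but differ in relative position, so they would not be unified.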
C.1.4.2 Different relative positions of components
C.1.4.3 Different structure of a corresponding component
C.1.5 Differences of actual shapes
To illustrate the classification described in S.1.2, some typical examples of unified ideographs are given below. The two or three ideographs in each group below have different actual shapes, but they are considered to have the same abstract shape and are therefore unified.
b) Differences in overshoot at the start and/or end of a stroke
c) Differences in the contact of strokes
d) Differences in protrusion at the folded corner of a stroke
e) Differences in bent strokes
f) Differences in folding back at the end of a stroke
h) Differences in "rooftop" modification
i) Combinations of the above differences
These differences in the actual shapes of a unified ideograph are presented in the corresponding source columns for each code point in the code chart in clause 30 of this standard.
C.2 Arrangement procedure
C.2.1 Scope of arrangement
The arrangement of the CJK UNIFIED IDEOGRAPHS in the code chart in clause 30 of this standard is based on the ordering of the following dictionaries.
Priority  Dictionary         Edition
1         Kangxi Dictionary  Beijing, 9th edition
2         Daikanwa Jiten     9th edition
3         Hanyu Dazidian     First edition
4         Daejaweon          First edition
C.2.2 Procedure
C.2.2.1 Ideographs found in the dictionaries
a) If an ideograph is found in the Kangxi Dictionary, it is positioned in the code chart according to the Kangxi Dictionary order.
b) If an ideograph is not found in the Kangxi Dictionary but is found in the Daikanwa Jiten, it is given a position at the end of the radical-stroke group under which it is indexed, nearest to the preceding Daikanwa Jiten character that also appears in the Kangxi Dictionary.
c) If an ideograph is found in neither the Kangxi Dictionary nor the Daikanwa Jiten, the Hanyu Dazidian and Daejaweon dictionaries are referred to using a similar procedure.
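The priority order in steps a) to c) can be sketched as a first-match lookup. The function `governing_dictionary` and the toy entry sets are hypothetical. The sketch only selects which dictionary governs the position; it does not compute the position itself.

```python
from typing import Dict, Optional, Set

# Dictionaries in priority order, as listed in C.2.1.
DICTIONARY_PRIORITY = (
    "Kangxi Dictionary",
    "Daikanwa Jiten",
    "Hanyu Dazidian",
    "Daejaweon",
)


def governing_dictionary(ideograph: str,
                         entries: Dict[str, Set[str]]) -> Optional[str]:
    """Return the highest-priority dictionary that contains the ideograph,
    or None when it is found in none of them (the case covered by C.2.2.2)."""
    for name in DICTIONARY_PRIORITY:
        if ideograph in entries.get(name, set()):
            return name
    return None


# Toy data with placeholder entries "A", "B" and "C" instead of real ideographs.
toy_entries = {
    "Kangxi Dictionary": {"A"},
    "Daikanwa Jiten": {"A", "B"},
    "Hanyu Dazidian": {"C"},
    "Daejaweon": set(),
}
```

With the toy data, "A" is governed by the Kangxi Dictionary even though it also appears in the Daikanwa Jiten, because the earlier dictionary takes priority.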
C.2.2.2 Ideographs not found in the dictionaries
4E1F 4E22
4FF1 5036
T
4E48 5E7A
5024 503C
TJ
4E89 722D
5077 5078
5204
TJ
4EDE 4EED
507D 50DE
520B
T
4F75 5002
514C 5151
522A
514E 5154
522B
TJ
4FC1 4FE3
5156 5157
52B5
53C3 53C4
5759 5DE0
541E 5451
5716 5717
5415 5442
TJ
57D2 57D3
TJ
5848 588D
5239
53C1
GT
53C2
598D 59F8
5BDC 5BE7
GT
GTJ
5BDD 5BE2
59EB 59EC
5377
5DFB
59CD 59D7
TJ
5238
524E
5373
537D
518A 518C
TK
5225
5355
5220
TJ
4FA3 4FB6
5358
4FDE 516A
52FB
5300
520A
TJK
5292
5294
5203
TJ
525D
5265
51E2
51E3
GTJ
524F
5259
51C0 51C8
GT
5C02 5C08
GTJ
5C06 5C07
5436 5450
5861 586B
543F 544A
5897 589E
55A9 55BB
5618 5653
5910 657B
568F 5694
5932 672C
56EF 56FD
5965 5967
TJ
5708 570F
TJ
570E 5713
5986 599D
T
5E76 5E77
TJ
60E0
60A6
60AE
609E
5BAB 5BAE
5E21 5E32
5F39 5F3E
614D
6120
TJ
5E2F 5E36
TJ 633F
66FD
66FE
TJ
634F
67B4
67FA
TJ
635C 641C
67FB
67E5
63B2
67F5
6805
GT
63ED
5DD3 5DD4
60B3
60EA
5F37 5F3A
5B73 5B76
63D1
5F11 5F12
6085
5C4F 5C5B
63D2 63F7
5CE5 5D22
5C36 5C37
5B24 5B37
5EC4 5ECF
5BDB 5BEC
6075
5C2A 5C2B
GT
GT
5B0E 5B14
GTJ
5C19 5C1A
TK
GTJ
5AAF 5B00
58FD 5900
5C13 5C14
5AAA 5ABC
5A55 5AAB
GTJ
58EE 58EF
5A7E 5AAE
5527 559E
TJ
6416 6447
TJ 63FA
68B2
68C1
TJ
5F50 5F51
614E
613C
63FE 6435
6986
GT
5F54 5F55
622C
5F59 5F5A
6231
622F
5F5B 5F5C
6237 6238
T 6236
5F5D 5F5E
623E
62CB
663B
5FB3 5FB7
62D4
6B72 6B73
6A23
6329 635D
66C1
6DF8 6E05
T
665A
6669
69D8
629C
TJ
6602
TJ
69C7
69D9
T
629B
65E2
65E3
699D
6A27
T
623B
5FB4 5FB5
6553
655A
6985
69B2
T
5F65 5F66
654E
6559
6982
69EA
6483
64CA
6961
TJ
6229
6A2A
6A6B
66A8
6B65
TJ
6B7F 6B81
6E07 6E34
74F6 7501
7BE1 7C12
6BBB 6BBC
7522 7523
T
6E29 6EAB
T
7CA4 7CB5
J
6BC0 6BC1
6E88 6F59
75E9 7626
7D55 7D76
6BCE 6BCF
6E89 6F11
76A1 76A5
7DA0 7DD1
TJ
6C32 6C33
6EDA 6EFE
771E 771F
7DD2 7DD6
7BB3 7C08
T
GTJ
6B69
7464 7476
T
TJK
GTJK
6C5A 6C61
6F5B 6FF3
773E 8846
TJ
7DE3 7E01
T
6C92 6CA1
7028 702C
7814 784F
7DFC 7E15
TJ
GTJ
TJ
6D44 6DE8
70BA 7232
797F 7984
7E48 7E66
GTJK
TJ
6D89 6E09
712D 7162
79BF 79C3
7FAE 7FB9
6D97 6D9A
7155 7199
7A05 7A0E
7FF6 7FFA
TJ
6D99 6DDA
7174 7185
7A42 7A57
80FC 8141
T
86FB
GJ
885B
885E
812B 8131
T
8FBE 8FD6
TJK
GT
8203
7B5D 7B8F
8715
817D 8183
GT
72B6 72C0
6DE5 6E0C
T
95B1
95B2
TJ
8FF8 902C
9667 9689
8204
TJ
TK
886E
889E
820D 820E
J
9059 9065
9752
T
GJK
GTJ
8216 8217
88C5 88DD
90A2 90C9
975C
TJ
8358 838A
8A2E 8A7D
90CE 90DE
9771
976D
83D1 8458
8AAA 8AAC
9109 9115
TJ
T 90F7
9839
983D
T
TJ
8480 8495
8ACC 8AEB
9196 919E
9854
9759
TJ
9751
984F
GJ
848B
8B20
8B21
91A4 91AC
985B
985A
8523
T
848D 853F
8C5C 8C63
9292
9203
8570 8580
8D71
92B3 92ED
T
85AB 85B0
8EFF 8F27
9332
TJK
9304
TK
TK
85F4 860A
8F1C 8F3A
932C 934A
9A08
865A 865B
8F3C 8F40
TJ
93AD 93AE
99B1
99C4
9905
9920
TJ
8D70
98EE
98F2
TJ
99E2
9AA9
9AAB
TJJ T
9AD8 9AD9
TJ
JT
5191 80C4
non cognate
S.1.4.3
non cognate
S.1.4.3
S.1.4.3
S.1.4.1
non cognate
S.1.4.3
S.1.4.1
non cognate
S.1.4.3
S.1.4.3
8008 8009
S.1.4.3
S.1.4.3S.1.4.1
S.1.4.3
S.1.4.2
5B7C 5B7D
non cognate
non cognate
S.1.4.3
S.1.4.3
S.1.4.2
S.1.4.3