
HUFFMAN COMPRESSION

OBJECTIVES
After completing this lab, students will be able to: understand the steps of the static Huffman compression algorithm.

LAB DURATION
From 120 to 400 minutes

SUMMARY

Huffman compression is a variable-length encoding method: each character is represented by only a few bits, and the code length differs from character to character (characters that occur many times get short codes, and vice versa). The Huffman compression algorithm consists of 5 steps:
Step 1: Build a table of the occurrence frequency of each character.
Step 2: Build the Huffman tree from the frequency table.
Step 3: Generate the bit-code table, one code per character.
Step 4: Scan the file and replace each character with its bit code.
Step 5: Save the Huffman tree information for decompression.

LAB CONTENT
Implement the Huffman compression algorithm. The input data is read from the file input.txt. Print to the screen the bit string corresponding to the input content.

Example: for an input.txt file containing ABBCCCDDDD, the output bit string is 10010110 11111110 000, where A (100), B (101), C (11) and D (0).

ANALYSIS
To represent a node of the Huffman tree, we use:
struct NODE {
    unsigned char c;  // character of the node
    int freq;         // number of occurrences
    bool used;        // marks whether the node is still in the frequency table
                      // used = true: no longer in the frequency table
    int nLeft;        // index of the left child
    int nRight;       // index of the right child
};

The Huffman tree is stored as an array:


NODE huffTree[MAX_NODE];

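Here MAX_NODE is an integer greater than the maximum number of nodes in the Huffman tree. (A tree with at most 256 leaves has at most 2 * 256 - 1 = 511 nodes, so the maximum is certainly < 2^8 * 9.)

To represent a bit code, we use:
struct MABIT {
    char* bits;  // array holding the bits
    int soBit;   // number of bits in the code
};

Bit-code table: the bit codes of the 256 characters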


MABIT bangMaBit[256];

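STEP 1: BUILDING THE FREQUENCY TABLE
We create 256 nodes corresponding to the 256 ASCII characters. Then we read the input file character by character and increment the frequency of the node corresponding to each character read.

Node initialization:
for (int i = 0; i < 256; i++) {
    huffTree[i].c = i;
    huffTree[i].freq = 0;
    huffTree[i].used = false;  // every node starts in the frequency table
    huffTree[i].nLeft = -1;    // leaves have no children
    huffTree[i].nRight = -1;
}

Counting frequencies:
unsigned char c;  // unsigned, so that huffTree[c] is always a valid index
while (fscanf(fi, "%c", &c) == 1) {
    huffTree[c].freq++;  // increase the frequency of character c
}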
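A minimal sketch of Step 1 as a self-contained unit, assuming the NODE layout from the analysis section. The names initLeaves, countBytes and countFile are hypothetical (the reference program may organise this differently), and MAX_NODE = 512 is only one value that satisfies the bound stated earlier. Reading in blocks with fread is a choice made here for large files; it is not taken from the handout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

const int MAX_NODE = 512;  // assumption: any integer above 511 would do

struct NODE {
    unsigned char c;  // character of the node
    int freq;         // number of occurrences
    bool used;        // true: no longer in the frequency table
    int nLeft;        // index of the left child, -1 if none
    int nRight;       // index of the right child, -1 if none
};

NODE huffTree[MAX_NODE];

// Reset the 256 leaf nodes so that huffTree[c] describes character c.
void initLeaves() {
    for (int i = 0; i < 256; i++) {
        huffTree[i].c = (unsigned char)i;
        huffTree[i].freq = 0;
        huffTree[i].used = false;
        huffTree[i].nLeft = -1;
        huffTree[i].nRight = -1;
    }
}

// Add the bytes of a buffer to the frequency table.
void countBytes(const unsigned char* data, size_t n) {
    for (size_t k = 0; k < n; k++) {
        huffTree[data[k]].freq++;
    }
}

// Count every byte of an open file; returns the number of bytes read.
long countFile(FILE* fi) {
    unsigned char buf[4096];
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, fi)) > 0) {
        countBytes(buf, n);
        total += (long)n;
    }
    return total;
}
```

A caller would run initLeaves() once, open input.txt in binary mode, and pass the FILE* to countFile.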

Remark: Creating 256 nodes may be more than necessary, because a text rarely contains all 256 ASCII characters, but it lets us reach the node for character c quickly (by the convention that node huffTree[c] corresponds to character c). This matters a great deal when working with large files.

STEP 2: BUILDING THE HUFFMAN TREE
Steps for generating the tree:
Step 1: Select the 2 entries x, y with the lowest frequency in the frequency table.
Step 2: Create 2 tree nodes for x and y, together with a parent node z whose weight is the sum of the weights of its 2 children.
Step 3: Remove the 2 entries x, y from the frequency table.
Step 4: Add the entry z to the frequency table.
Step 5: Repeat steps 1 to 4 until only 1 entry remains in the frequency table.

With the array storage of the Huffman tree described above, these operations are carried out as follows.

Selecting the 2 entries with the lowest frequency
Find the 2 elements of the huffTree array with the lowest frequency. Consider only elements that belong to the frequency table (huffTree[i].used == false) and that occur in the data file (huffTree[i].freq > 0).

Creating a new Huffman tree node with 2 children i, j
Append the new node to the end of the array, using nLeft and nRight to store the positions of its 2 children.

huffTree[nNode].freq = huffTree[i].freq + huffTree[j].freq;
huffTree[nNode].nLeft = i;
huffTree[nNode].nRight = j;

Removing the entries x, y from the frequency table
To remove x, y from the frequency table, assign

huffTree[i].used = true;
huffTree[j].used = true;

Adding the entry nNode to the frequency table

huffTree[nNode].used = false;
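The operations above can be combined into the following sketch. The names findTwoMin and buildTree are hypothetical; the reference program has its own Tim2PhanTuMin() and TaoCayHuffman(), whose bodies are not reproduced in this handout. Ties between equal frequencies are broken with <=, so the later entry wins. That rule is an assumption, chosen only because it reproduces the child order of the ABBCCCDDDD example in this handout.

```cpp
#include <cassert>

const int MAX_NODE = 512;  // assumption: any integer above 511 would do

struct NODE {
    unsigned char c;
    int freq;
    bool used;    // true: no longer in the frequency table
    int nLeft;
    int nRight;
};

NODE huffTree[MAX_NODE];
int nNode = 256;  // slots in use; the leaves occupy 0..255

void initLeaves() {
    for (int i = 0; i < 256; i++) {
        huffTree[i].c = (unsigned char)i;
        huffTree[i].freq = 0;
        huffTree[i].used = false;
        huffTree[i].nLeft = -1;
        huffTree[i].nRight = -1;
    }
    nNode = 256;
}

// Find the two table entries with the lowest frequency (i is the lowest).
// Returns false when fewer than two entries remain; i then holds the
// single remaining entry, or -1 if the table is empty.
bool findTwoMin(int& i, int& j) {
    i = -1;
    j = -1;
    for (int k = 0; k < nNode; k++) {
        if (huffTree[k].used || huffTree[k].freq == 0) continue;
        if (i == -1 || huffTree[k].freq <= huffTree[i].freq) {
            j = i;  // the old minimum becomes the second minimum
            i = k;
        } else if (j == -1 || huffTree[k].freq <= huffTree[j].freq) {
            j = k;
        }
    }
    return i != -1 && j != -1;
}

// Merge entries until one remains; returns the root index (-1 if empty).
int buildTree() {
    int i, j;
    while (findTwoMin(i, j)) {
        huffTree[nNode].c = 0;  // not meaningful for internal nodes
        huffTree[nNode].freq = huffTree[i].freq + huffTree[j].freq;
        huffTree[nNode].nLeft = i;
        huffTree[nNode].nRight = j;
        huffTree[nNode].used = false;  // the new node joins the table
        huffTree[i].used = true;       // its children leave the table
        huffTree[j].used = true;
        nNode++;
    }
    return i;
}
```

When at least one merge has happened, the returned index equals nNode - 1. For a file containing a single distinct character, no merge happens and the root is that character's leaf.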

Example: with the input data ABBCCCDDDD, the initial frequency table is (only the relevant entries are shown; for internal nodes, index 256 and above, the c column carries no meaning):

Index  c  freq  used   nLeft  nRight
65     A  1     FALSE  -1     -1
66     B  2     FALSE  -1     -1
67     C  3     FALSE  -1     -1
68     D  4     FALSE  -1     -1
255    ?  0     FALSE  -1     -1

Find the 2 entries with the lowest frequency: A (1) and B (2). Create a new node (node 256). Remove A and B from the frequency table and add node 256 to it.

Index  c  freq  used   nLeft  nRight
65     A  1     TRUE   -1     -1
66     B  2     TRUE   -1     -1
67     C  3     FALSE  -1     -1
68     D  4     FALSE  -1     -1
255    ?  0     FALSE  -1     -1
256    A  3     FALSE  65     66

Find the 2 entries with the lowest frequency: C (3) and 256 (3) (note that A and B have used = TRUE and are not considered). Create a new node (node 257). Remove C and 256 from the frequency table and add node 257 to it.

Index  c  freq  used   nLeft  nRight
65     A  1     TRUE   -1     -1
66     B  2     TRUE   -1     -1
67     C  3     TRUE   -1     -1
68     D  4     FALSE  -1     -1
255    ?  0     FALSE  -1     -1
256    A  3     TRUE   65     66
257    A  6     FALSE  256    67

Similarly, after node 258 is created from D (4) and 257 (6), only 1 entry (258) remains in the frequency table, and tree construction stops.

Index  c  freq  used   nLeft  nRight
65     A  1     TRUE   -1     -1
66     B  2     TRUE   -1     -1
67     C  3     TRUE   -1     -1
68     D  4     TRUE   -1     -1
255    ?  0     FALSE  -1     -1
256    A  3     TRUE   65     66
257    A  6     TRUE   256    67
258    A  10    FALSE  68     257

After the tree has been built, node huffTree[nNode - 1] is the root of the Huffman tree.

STEP 3: GENERATING THE BIT-CODE TABLE
Traverse the Huffman tree from the root to the leaf node of each character.
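A minimal sketch of this traversal, under the handout's conventions: a leaf has nLeft == -1 and nRight == -1, bit 0 is appended for a left branch and bit 1 for a right branch. Codes are kept as std::string in a hypothetical codeTable array instead of the MABIT structure, purely for brevity. generateCodes, isLeaf and setNode are illustrative names, not taken from the reference code.

```cpp
#include <cassert>
#include <string>

const int MAX_NODE = 512;  // assumption: any integer above 511 would do

struct NODE {
    unsigned char c;
    int freq;
    bool used;
    int nLeft;
    int nRight;
};

NODE huffTree[MAX_NODE];
std::string codeTable[256];  // stand-in for bangMaBit, one code per character

// Hypothetical helper used here only to write a small tree by hand.
void setNode(int idx, unsigned char c, int freq, int left, int right) {
    huffTree[idx].c = c;
    huffTree[idx].freq = freq;
    huffTree[idx].used = false;
    huffTree[idx].nLeft = left;
    huffTree[idx].nRight = right;
}

bool isLeaf(int i) {
    return huffTree[i].nLeft == -1 && huffTree[i].nRight == -1;
}

// Depth-first walk from node; prefix holds the bits on the path so far.
// Note: a tree that consists of a single leaf gets an empty code here.
void generateCodes(int node, std::string& prefix) {
    if (node == -1) return;
    if (isLeaf(node)) {
        codeTable[huffTree[node].c] = prefix;
        return;
    }
    prefix.push_back('0');                         // left branch
    generateCodes(huffTree[node].nLeft, prefix);
    prefix.back() = '1';                           // right branch
    generateCodes(huffTree[node].nRight, prefix);
    prefix.pop_back();                             // restore the caller's path
}
```

Calling generateCodes(root, prefix) with an empty prefix on the tree of the ABBCCCDDDD example yields A = 100, B = 101, C = 11 and D = 0.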

Root of the Huffman tree: huffTree[nNode - 1]
Testing whether a node is a leaf: huffTree[i].nLeft == -1 && huffTree[i].nRight == -1
Traverse recursively from the root of the Huffman tree, appending bit 0 when following the left branch and bit 1 when following the right branch.

STEP 4: REPLACING CHARACTERS WITH BIT CODES
Use an unsigned char variable out to hold the output bit string. Go through the characters and set (assign = 1) the corresponding bits of out (according to each character's bit code). Write out the variable out whenever the bit string reaches length 8 (1 full byte).

REVIEW: BIT MANIPULATION

See the basic bit-manipulation operations (http://forums.congdongcviet.com/showthread.php?t=316). 2 operations are needed:

Setting bit i of the integer out
out = out | (1 << i);

Example: setting bit 2 of the variable out

Bit              7  6  5  4  3  2  1  0
out              0  0  1  0  0  0  0  0
1 << 2           0  0  0  0  0  1  0  0
out | (1 << 2)   0  0  1  0  0  1  0  0

Getting bit i of the integer out

(out >> i) & 1

Example: getting bit 2 of the variable out

Bit              7  6  5  4  3  2  1  0
out              0  0  1  0  0  1  0  0
out >> 2         0  0  0  0  1  0  0  1
1                0  0  0  0  0  0  0  1
(out >> 2) & 1   0  0  0  0  0  0  0  1
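The two bit operations and the packing described in Step 4 can be sketched together as follows. setBit, getBit, encode and packBits are hypothetical names, and the code table is again a std::string array rather than MABIT. The sketch assumes that the first bit of the stream goes into bit 7 of each byte, which matches the way the example bit string 10010110 11111110 000 is written, and that a final incomplete byte is padded with 0 bits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return out with bit i (0 = least significant) set to 1.
unsigned char setBit(unsigned char out, int i) {
    return (unsigned char)(out | (1u << i));
}

// Return bit i of out as 0 or 1.
int getBit(unsigned char out, int i) {
    return (out >> i) & 1;
}

// Concatenate the bit codes of all characters of text.
std::string encode(const std::string& text, const std::string codes[256]) {
    std::string bits;
    for (unsigned char ch : text) {
        bits += codes[ch];
    }
    return bits;
}

// Pack a string of '0'/'1' characters into bytes, 8 bits per byte.
std::vector<unsigned char> packBits(const std::string& bits) {
    std::vector<unsigned char> bytes;
    unsigned char out = 0;
    int count = 0;  // number of bits already placed in out
    for (char b : bits) {
        if (b == '1') out = setBit(out, 7 - count);  // fill from bit 7 down
        count++;
        if (count == 8) {  // a full byte: emit it and start a new one
            bytes.push_back(out);
            out = 0;
            count = 0;
        }
    }
    if (count > 0) bytes.push_back(out);  // last byte, padded with 0 bits
    return bytes;
}
```

For ABBCCCDDDD with the codes A = 100, B = 101, C = 11, D = 0, encode produces 19 bits and packBits returns the 3 bytes 10010110, 11111110 and 00000000. A decoder would also need the number of valid bits, since the padding bits are indistinguishable from real 0 bits.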

REFERENCE CODE (StaticHuffman.rar)

See the attached code.

REQUIREMENTS
1. Compile the reference program.
2. Create input.txt with the content ACBCDBDCDD. Run the Huffman compression algorithm by hand and report the resulting frequency table, Huffman tree, bit-code table and output bit string.
3. In the function NenHuffman(), remove the comment marks (//) from the lines
// XuatBangThongKe();
// XuatCayHuffman(nRoot, 0);
// XuatBangMaBit();
Run the program and compare the output on the screen with the result of question 2.
4. Answer the questions on the annotated lines of the 2 functions ThongKeTanSoXuatHien() and XuatBangThongKe().
5. Answer the questions on the annotated lines of the 2 functions TaoCayHuffman() and Tim2PhanTuMin().
6. Answer the questions on the annotated lines of the 2 functions PhatSinhMaBit() and DuyetCayHuffmanx().
7. Answer the questions on the annotated lines of the functions NenHuffman(), MaHoa1KyTu() and XuatByte().

Application - Advanced
8. Modify the program to write the compressed content to the file output.txt instead of printing it to the screen. Example: for the example above, instead of printing the bit string 10010110 11111110 000 to the screen, write 3 bytes to the file, namely (10010110) (11111110) (the 3rd byte is 0 and so cannot be displayed).
9. Modify the program to write the compressed content and the necessary information to the file output.huf, so that output.huf can be used for decompression.
10. Write a decompression function: read the content of the output.huf file from question 9 and write the decompressed content to the file decode.txt. Compare the contents of decode.txt and input.txt.
