词条 | 哈夫曼树 |
释义 | 给定n个权值作为n个叶子结点,构造一棵二叉树,若带权路径长度达到最小,称这样的二叉树为最优二叉树,也称为哈夫曼树(Huffman tree)。 基本术语哈夫曼树又称为最优树. 1、路径和路径长度 在一棵树中,从一个结点往下可以达到的孩子或子孙结点之间的通路,称为路径。通路中分支的数目称为路径长度。若规定根结点的层数为1,则从根结点到第L层结点的路径长度为L-1。 2、结点的权及带权路径长度 若将树中结点赋给一个有着某种含义的数值,则这个数值称为该结点的权。结点的带权路径长度为:从根结点到该结点之间的路径长度与该结点的权的乘积。 3、树的带权路径长度 树的带权路径长度规定为所有叶子结点的带权路径长度之和,记为WPL。 哈夫曼树的构造假设有n个权值,则构造出的哈夫曼树有n个叶子结点。 n个权值分别设为 w1、w2、…、wn,则哈夫曼树的构造规则为: (1) 将w1、w2、…,wn看成是有n 棵树的森林(每棵树仅有一个结点); (2) 在森林中选出两个根结点的权值最小的树合并,作为一棵新树的左、右子树,且新树的根结点权值为其左、右子树根结点权值之和; (3)从森林中删除选取的两棵树,并将新树加入森林; (4)重复(2)、(3)步,直到森林中只剩一棵树为止,该树即为所求得的哈夫曼树。 哈夫曼树的应用1、哈夫曼编码 在数据通信中,需要将传送的文字转换成二进制的字符串,用0,1码的不同排列来表示字符。例如,需传送的报文为“AFTER DATA EAR ARE ART AREA”,这里用到的字符集为“A,E,R,T,F,D”,各字母出现的次数为{8,4,5,3,1,1}。现要求为这些字母设计编码。要区别6个字母,最简单的二进制编码方式是等长编码,固定采用3位二进制,可分别用000、001、010、011、100、101对“A,E,R,T,F,D”进行编码发送,当对方接收报文时再按照三位一分进行译码。显然编码的长度取决报文中不同字符的个数。若报文中可能出现26个不同字符,则固定编码长度为5。然而,传送报文时总是希望总长度尽可能短。在实际应用中,各个字符的出现频度或使用次数是不相同的,如A、B、C的使用频率远远高于X、Y、Z,自然会想到设计编码时,让使用频率高的用短码,使用频率低的用长码,以优化整个报文编码。 为使不等长编码为前缀编码(即要求一个字符的编码不能是另一个字符编码的前缀),可用字符集中的每个字符作为叶子结点生成一棵编码二叉树,为了获得传送报文的最短长度,可将每个字符的出现频率作为字符结点的权值赋予该结点上,求出此树的最小带权路径长度就等于求出了传送报文的最短长度。因此,求传送报文的最短长度问题转化为求由字符集中的所有字符作为叶子结点,由字符出现频率作为其权值所产生的哈夫曼树的问题。利用哈夫曼树来设计二进制的前缀编码,既满足前缀编码的条件,又保证报文编码总长最短。 哈夫曼静态编码:它对需要编码的数据进行两遍扫描:第一遍统计原数据中各字符出现的频率,利用得到的频率值创建哈夫曼树,并必须把树的信息保存起来,即把字符0-255(2^8=256)的频率值以2-4BYTES的长度顺序存储起来,(用4Bytes的长度存储频率值,频率值的表示范围为0--2^32-1,这已足够表示大文件中字符出现的频率了)以便解压时创建同样的哈夫曼树进行解压;第二遍则根据第一遍扫描得到的哈夫曼树进行编码,并把编码后得到的码字存储起来。 哈夫曼动态编码:动态哈夫曼编码使用一棵动态变化的哈夫曼树,对第t+1个字符的编码是根据原始数据中前t个字符得到的哈夫曼树来进行的,编码和解码使用相同的初始哈夫曼树,每处理完一个字符,编码和解码使用相同的方法修改哈夫曼树,所以没有必要为解码而保存哈夫曼树的信息。编码和解码一个字符所需的时间与该字符的编码长度成正比,所以动态哈夫曼编码可实时进行。 2、哈夫曼译码 在通信中,若将字符用哈夫曼编码形式发送出去,对方接收到编码后,将编码还原成字符的过程,称为哈夫曼译码。 程序实现Basic VERSION 1.0 CLASS BEGIN MultiUse = -1 'True Persistable = 0 'NotPersistable DataBindingBehavior = 0 'vbNone DataSourceBehavior = 0 'vbNone MTSTransactionMode = 0 'NotAnMTSObject END Attribute VB_Name = "clsBinTree" Attribute VB_GlobalNameSpace = False Attribute VB_Creatable = True Attribute VB_PredeclaredId = False Attribute VB_Exposed = False Private Type BinTreeType LChild As Integer RChild As Integer Parents As Integer Power As Long End Type Private BinTree() As BinTreeType Public UB As Integer Public Sub InitData(Power As Long, LChild As Integer, RChild As Integer) If UB = 0 Then UB = 1 ReDim Preserve BinTree(1 To UB) BinTree(UB).Power = Power BinTree(UB).LChild = LChild BinTree(UB).RChild = RChild UB = UB + 1 End Sub Private Sub Min(Min1 As Integer, Min2 As Integer, ErrTag As Integer) Dim P As Integer, Q As Integer P = 1 While BinTree(P).Parents > 0 P = P + 1 Wend For I = P + 1 To UB - 1 If BinTree(I).Power < BinTree(P).Power And BinTree(I).Parents = 0 Then P = I Next Q = 1 While BinTree(Q).Parents > 0 Or Q = P Q = Q + 1 If Q = UB Then ErrTag = 1 Exit Sub End If Wend For I = Q + 1 To UB - 1 If BinTree(I).Power < BinTree(Q).Power And Q <> P And BinTree(I).Parents = 0 Then Q = I Next Min1 = P: Min2 = Q: ErrTag = 0 End Sub Public Sub Huffman() Dim ErrorHappen As Integer Dim Min1 As Integer, Min2 As Integer Dim I As Integer Min Min1, Min2, ErrorHappen Do Until ErrorHappen InitData BinTree(Min1).Power + BinTree(Min2).Power, Min1, Min2 BinTree(Min1).Parents = UB - 1 BinTree(Min2).Parents = UB - 1 Min Min1, Min2, ErrorHappen Loop End Sub Public Function HuffmanCode(I As Integer) As String HuffmanCode = I & ", LChild:" & BinTree(I).LChild & " RChild:" & BinTree(I).RChild & " Parents:" & BinTree(I).Parents & " Power:" & BinTree(I).Power End Function Attribute VB_Name = "Module1" Public HT As New clsBinTree Private Word As String, Ping() As Integer, PUB As Integer Public Sub CreatHT(Words As String) Dim Temp As String, I As Integer Word = "" For I = 1 To Len(Words) Temp = Mid(Words, I, 1) P = InStr(1, Word, Temp) If P > 0 Then Ping(P - 1) = Ping(P - 1) + 1 Else Word = Word & Temp ReDim Preserve Ping(PUB) Ping(PUB) = 1 PUB = PUB + 1 End If Next Form1.Text2.SelText = Word & vbCrLf For I = 0 To PUB - 1 Form1.Text2.SelText = Ping(I) & "," HT.InitData CLng(Ping(I)), 0, 0 Next Form1.Text2.SelText = vbCrLf & vbCrLf HT.Huffman For I = 1 To HT.UB - 1 Form1.Text2.SelText = HT.HuffmanCode(I) & vbCrLf Next End Sub PascalProgram huffman_tree(input,output); const max=32767;n=20;m=2*n-1 Type tnode=RECORD data:integer; Lc,Rc:integer; END; Var tree:ARRAY[0..m] of tnode; weight:ARRAY[0..n] of integer; im,num:integer; procedure initial; var i:integer; begin write('First input nun(<',n:2,')'); readln(num); writeln('Please input weight:'); for i:=0 to num-1 do read(weight[i]) end; function minimum:integer; var i:integer; begin min:=max; for i:=0 to num-1 do if (min>weight[i]) then begin min:=weight[i]; im:=i; end; weight[im]:=max; minimum:=min; end; procedure huffman; var i,k:integer; begin for i:=num to 2*num-1 do begin tree[i].Lc:=minimum; tree[i].Rc:=minimum; tree[i].data:=tree[i].Lc:+tree[i].Rc; weight[im]:=tree[i].data end; writeln; writeln('The result of huffman tree:'); k:=1; for i:=2*num-2 downto num do begin write(tree[i].data:6,':',tree[i].Lc:3,tree[i].Rc:3); if (k mod 3=0) then writeln; k:=k+1; end writeln end; procedure printd; var i:integer; begin write('The weight of tree:'); for i:=0 to num-1 do write{weight[i]:3} end; begin {main} initial; printd; huffman end. C++#include <iostream> #include <stdlib.h> using namespace std; const int MaxValue = 10000; //初始设定的权值最大值 const int MaxBit = 4; //初始设定的最大编码位数 const int MaxN = 10; //初始设定的最大结点个数 struct HaffNode //哈夫曼树的结点结构 { int weight; //权值 int flag; //标记 int parent; //双亲结点下标 int leftChild; //左孩子下标 int rightChild; //右孩子下标 }; struct Code //存放哈夫曼编码的数据元素结构 { int bit[MaxN]; //数组 int start; //编码的起始下标 int weight; //字符的权值 }; void Haffman(int weight[], int n, HaffNode haffTree[]) //建立叶结点个数为n权值为weight的哈夫曼树haffTree { int j, m1, m2, x1, x2; //哈夫曼树haffTree初始化。n个叶结点的哈夫曼树共有2n-1个结点 for(int i = 0; i < 2 * n - 1 ; i++) { if(i < n) haffTree[i].weight = weight[i]; else haffTree[i].weight = 0; haffTree[i].parent = 0; haffTree[i].flag = 0; haffTree[i].leftChild = -1; haffTree[i].rightChild = -1; } //构造哈夫曼树haffTree的n-1个非叶结点 for(int i = 0;i < n-1;i++) { m1 = m2 = MaxValue; x1 = x2 = 0; for(j = 0; j < n+i;j++) { if (haffTree[j].weight < m1 && haffTree[j].flag == 0) { m2 = m1; x2 = x1; m1 = haffTree[j].weight; x1 = j; } else if(haffTree[j].weight < m2 && haffTree[j].flag == 0) { m2 = haffTree[j].weight; x2 = j; } } //将找出的两棵权值最小的子树合并为一棵子树 haffTree[x1].parent = n+i; haffTree[x2].parent = n+i; haffTree[x1].flag = 1; haffTree[x2].flag = 1; haffTree[n+i].weight = haffTree[x1].weight+haffTree[x2].weight; haffTree[n+i].leftChild = x1; haffTree[n+i].rightChild = x2; } } void HaffmanCode(HaffNode haffTree[], int n, Code haffCode[]) //由n个结点的哈夫曼树haffTree构造哈夫曼编码haffCode { Code *cd = new Code; int child, parent; //求n个叶结点的哈夫曼编码 for(int i = 0; i < n; i++) { cd->start = n-1; //不等长编码的最后一位为n-1 cd->weight = haffTree[i].weight; //取得编码对应权值的字符 child = i; parent = haffTree[child].parent; //由叶结点向上直到根结点 while(parent != 0) { if(haffTree[parent].leftChild == child) cd->bit[cd->start] = 0; //左孩子结点编码0 else cd->bit[cd->start] = 1;//右孩子结点编码1 cd->start--; child = parent; parent = haffTree[child].parent; } //保存叶结点的编码和不等长编码的起始位 for(int j = cd->start+1; j < n; j++) haffCode[i].bit[j] = cd->bit[j]; haffCode[i].start = cd->start; haffCode[i].weight = cd->weight; //保存编码对应的权值 } } int main() { int i, j, n = 4; int weight[] = {1,3,5,7}; HaffNode *myHaffTree = new HaffNode[2*n+1]; Code *myHaffCode = new Code[n]; if(n > MaxN) { cout << "定义的n越界,修改MaxN! " << endl; exit(0); } Haffman(weight, n, myHaffTree); HaffmanCode(myHaffTree, n, myHaffCode); //输出每个叶结点的哈夫曼编码 for(i = 0; i < n; i++) { cout << "Weight = " << myHaffCode[i].weight << " Code = "; for(j = myHaffCode[i].start+1; j < n; j++) cout << myHaffCode[i].bit[j]; cout << endl; } return 0; } C#include<stdio.h> #include<malloc.h> #define N 8 typedef struct node{ int item; struct node* l; struct node* r; }* link; link *h; int Nq = N; link NODE(int item, link l, link r) { link t = malloc(sizeof *t); t->l = l; t->r = r; t->item = item; return t; } void shif_up(int i) { int j; while(i > 1) { j = i/2; if() } } void insert(link t) { h[++Nq] = t; shif_up(Nq); } link delmin() { swap(1, Nq--); shif_down(1,Nq); return h[Nq+1]; } link creat_heap(int freq, int len) { int i; for(i = 0; i < len; i++) h[i] = NODE(freq[i], NULL, NULL); for(i = N/2; i >= 0; i--) shif_down(i, N);} void huffman(int freq[], int len) { h = malloc(len * sizeof(link)); creat_heap(h, freq, len); while(Nq > 1) { link t1 = delmin(); link t2 = delmin(); insert(NODE(t1->item + t2->item, t1, t2)); } } int main(void) { int freq[N] = {5, 29, 7, 8, 14, 23, 3, 11}; huffman(freq, N); return 0; } |
随便看 |
百科全书收录4421916条中文百科知识,基本涵盖了大多数领域的百科知识,是一部内容开放、自由的电子版百科全书。