Description
An entropy encoder is a data encoding method that achieves lossless data compression by encoding a message with "wasted" or "extra" information removed. In other words, entropy encoding removes information that was not necessary in the first place to accurately encode the message. A high degree of entropy implies a message with a great deal of wasted information; english text encoded in ASCII is an example of a message type that has very high entropy. Already compressed messages, such as JPEG graphics or ZIP archives, have very little entropy and do not benefit from further attempts at entropy encoding. …………………………
题意是输入一字符串,然后进行Huffman编码,输出原字符串所占的长度,编码后所占的长度,以及两个长度之间的比值。
Huffman编码的思想就是贪心,我们这里使用stl里的优先队列,priority_queue使用堆进行优化,虽然自己也可以写一个堆,但我感觉对于这道题有点主次不分了,再次感觉到stl确实是一个很强大的东西。
解题代码
//ZOJ1117 POJ1521 HDU1053 Huffman编码 #include <stdio.h> #include <queue> #include <string.h> #include <vector> using namespace std; char str[500]; int cnt[260]; struct cmp { bool operator()(int a, int b) { return a > b; } }; /*我们可以直接priority_queue<int> 这样它默认的是使用vector<int> 并且是大的优先 这里我们刚好相反,用小的优先,所以要自定义优先级cmp, 这个也可以使用类来实现 class cmp { public: bool operator() (int a, int b) { return a > b; } }; */ priority_queue <int, vector<int>, cmp> q; int main() { while (scanf("%s",str) && strcmp(str,"END")) { int l = strlen(str); memset(cnt, 0, sizeof(cnt)); //我们不需要清空队列,因为每次操作完后它都被清空了 for (int i = 0; i < l; i++) { cnt[str[i]]++; } for (int i = 0; i < 260; i++) { if (cnt[i]) { q.push(cnt[i]); } } int nl = 0; if (q.size() == 1) nl = l; while (q.size() > 1) { int a = q.top(); q.pop(); int b = q.top(); q.pop(); q.push(a+b); nl += (a+b); //我想看到这也许会有些疑惑,为什么要加(a+b),稍微思考一下就明白了 } q.pop(); printf("%d %d %.1lf\n",8*l, nl, (double)(8*l)/(double)nl); } return 0; }