
Image Compression

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Discrete Decoders (talk | contribs) at 14:49, 9 July 2022. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


INTRODUCTION

Our project aims to decrease the amount of data consumed by images. With the growing use of social media and smartphones, the volume of data being stored and transmitted keeps increasing. In today's world, where people are fond of taking pictures, we need a way to reduce image size in order to save storage space. So, here we use Huffman's coding algorithm to compress images and save space.

David Albert Huffman, an American computer science pioneer best known for his Huffman coding, lived from August 9, 1925, until October 7, 1999. He was a pioneer in the area of mathematical origami as well. Huffman graduated with a bachelor's degree in electrical engineering from Ohio State University in 1944. After that, he served in the US Navy for two years as an officer. He returned to Ohio State to earn his master's degree in electrical engineering in 1949. In 1953, he earned his Doctor of Science in electrical engineering at the Massachusetts Institute of Technology (MIT), with the thesis The Synthesis of Sequential Switching Circuits, advised by Samuel H. Caldwell.

PROBLEM STATEMENT: To perform image compression.

Theory

There are numerous algorithms to compress files to smaller sizes; some involve loss of information, resulting in poor quality, whereas some do not. We chose Huffman coding because it is a lossless form of compression, in which none of the original information is lost; it is merely compressed.


David Albert Huffman invented Huffman coding in 1952.

In his article 'A Method for the Construction of Minimum-Redundancy Codes,' he describes this approach. Each symbol is given a code of varying length. Shorter codes are assigned to symbols that appear more frequently, whereas longer codes are assigned to symbols that appear less frequently.


The variable-length codes assigned to input symbols are prefix codes, which means that no symbol's code is a prefix of the code assigned to any other symbol. This ensures that there is no ambiguity between two symbols when decoding. A binary tree can be used to create these prefix codes, for example:


When we move to the left child of a node in this tree, we count it as 1, and when we move to the right child, we count it as 0. We can then assign a node to a symbol to obtain its codeword. The children of that node can no longer be used, and every symbol must have a node assigned to it. Since nodes can be assigned to symbols in a variety of ways, we need to figure out which assignment is the most efficient. The rate R, which describes the expected number of bits needed to represent a symbol, can be used to compare the efficiency of the different configurations of the binary tree. It is given by:

    R = Σ_i p_i * n_i


The probability of symbol i is denoted by p_i, while the number of bits in the codeword assigned to that symbol is denoted by n_i. Entropy H(X), defined as the average amount of information carried by each symbol, can also be used to assess efficiency. It is given by:

    H(X) = -Σ_i p_i * log2(p_i)


We may now compress the file using these codes after giving each symbol a code. However, the table containing the original symbols and the codes assigned to them must accompany the compressed file: the decoder first reads the table and can then decode the compressed text.

Because we want to compress photos, we need to know how they are stored. A picture is made up of many tiny pixels, each of which is represented by a binary value. The term "bit plane" refers to this type of colour representation. Each additional bit doubles the number of colours available. A single-bit colour, for example, can only have two colours: black and white. White gets a 0 while black gets a 1. Images are stored as scan lines: each line is encoded from left to right, and the lines run from top to bottom.

We'll look at and explain the algorithm using a 5x5 image with 3-bit pixels and the following pixel representation:

Algorithm

Step :- 1

First of all, create a table that includes three pieces of information: the symbol, its probability, and the frequency with which it appears in the given image. The formula for calculating pixel probability is as follows:

    probability = num_pixel / total_pixel

Here, num_pixel is the number of times that pixel value appears in the image and total_pixel is the total number of pixels in the image. The table is as shown below:


Step :- 2

Sort the table's elements in descending order based on probability of occurrence. Throughout the process, keep in mind that the table should remain in decreasing order. The following is the table in descending order:


Step :- 3

Now add the probabilities of the last two symbols, i.e. the two smallest probabilities, to form a new pseudo-node that takes the two original nodes as its left and right children. At every iteration, the probabilities in the table still sum to 1. The first iteration is displayed below:


Step :- 4

Repeat steps 2 and 3 until only one node remains, the root, i.e. until we reach a probability of 1. Keep in mind that if a newly formed pseudo-node is not connected to the present tree, a separate subtree is created; the beauty of the algorithm is that the subtrees eventually meet at a common ancestor.

Step :- 5

After the tree has been generated, the next step is to assign the codewords. According to Huffman coding, the symbol with the highest probability should have the shortest code. We assign 0 to either the left or the right branch of the root, and 1 to the other.

Step :- 6

Further, we label all the edges of the tree and its subtrees, but remember not to change the order of 0 and 1: if the left branch is assigned 0 at the root, then the left branch of every node should be assigned 0, and likewise for 1.

Step :- 7

Now, starting from the leaf nodes, backtrace to the root to obtain the codeword for each symbol; these codewords are used for both encoding and decoding. Create a key table that includes all symbols as well as their codes, which must be shared for decoding.


The compressed data size comprises the encoded string of the image together with the key table needed for decoding.

Using the following formula, we can compute the compression percentage:

    compression percentage = (original size - compressed size) / original size * 100

And we are done; we have reached our desired result.

Solution

We implement the Huffman coding approach described above in C++.

Code

#include <iostream>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

using namespace std;

#define EMPTY_STRING ""

// A node of the Huffman tree
struct Node
{
    char ch;
    int frequency;
    Node *left, *right;
};

Node* getNode(char ch, int frequency, Node* left, Node* right)
{
    Node* node = new Node();
    node->ch = ch;
    node->frequency = frequency;
    node->left = left;
    node->right = right;
    return node;
}

// Comparison object for the priority queue
struct comp
{
    bool operator()(const Node* l, const Node* r) const
    {
        // the highest-priority item has the lowest frequency
        return l->frequency > r->frequency;
    }
};

// Checks whether the given node is a leaf
bool validLeaf(Node* root) {
    return root->left == nullptr && root->right == nullptr;
}

// Traverses the Huffman tree and stores the codes in a map
void encode(Node* root, string str, unordered_map<char, string> &huffmanCode)
{
    if (root == nullptr) {
        return;
    }
    if (validLeaf(root)) {
        // Special case: a single-node tree still needs a nonempty code
        huffmanCode[root->ch] = (str != EMPTY_STRING) ? str : "1";
    }
    encode(root->left, str + "0", huffmanCode);
    encode(root->right, str + "1", huffmanCode);
}

// Decodes the next symbol from the encoded string
void decode(Node* root, int &index, string str)
{
    if (root == nullptr) {
        return;
    }
    if (validLeaf(root))
    {
        cout << root->ch;
        return;
    }
    index++;
    if (str[index] == '0') {
        decode(root->left, index, str);
    }
    else {
        decode(root->right, index, str);
    }
}

// Builds the Huffman tree and demonstrates encoding and decoding
void HuffmanTree(string symbols)
{
    if (symbols == EMPTY_STRING) {
        return;
    }

    // Step 1: count the frequency of each symbol
    unordered_map<char, int> frequency;
    for (char ch: symbols) {
        frequency[ch]++;
    }

    // Create a leaf node for each symbol and push it into the priority queue
    priority_queue<Node*, vector<Node*>, comp> pq;
    for (auto pair: frequency) {
        pq.push(getNode(pair.first, pair.second, nullptr, nullptr));
    }

    // Steps 2-4: repeatedly merge the two least frequent nodes
    while (pq.size() != 1)
    {
        Node* left = pq.top(); pq.pop();
        Node* right = pq.top(); pq.pop();
        int sum = left->frequency + right->frequency;
        pq.push(getNode('\0', sum, left, right));
    }
    Node* root = pq.top();

    // Steps 5-7: assign codewords by traversing the tree
    unordered_map<char, string> huffmanCode;
    encode(root, EMPTY_STRING, huffmanCode);

    cout << "Huffman Codes are:\n" << endl;
    cout << "==========" << endl;
    for (auto pair: huffmanCode) {
        cout << pair.first << " --> " << pair.second << endl;
    }
    cout << "==========" << endl;
    cout << "\nThe original string is:\n" << symbols << endl;

    // Build the encoded string
    string str;
    for (char ch: symbols) {
        str += huffmanCode[ch];
    }

    cout << "\nThe encoded string is:\n" << str << endl;
    cout << "\nThe decoded string is:\n";

    if (validLeaf(root))
    {
        // Special case: input consisting of a single repeated symbol (a, aa, aaa, ...)
        while (root->frequency--) {
            cout << root->ch;
        }
    }
    else {
        int index = -1;
        while (index < (int)str.size() - 1) {
            decode(root, index, str);
        }
    }
}

int main()
{
    string symbols = "Test runs are mandatory to ensure that the code works.";

    HuffmanTree(symbols);

    return 0;
}

Link to our project: Click here

Link to our website: Click here

Link to our YouTube video: Click here