Data Compression and Decompression Algorithms


INTRODUCTION

Data compression is a common requirement for most computerized applications. There are many data compression algorithms, each dedicated to compressing particular data formats. Even for a single data type, there are several compression algorithms that take different approaches. This paper examines lossless data compression algorithms.

1. DATA COMPRESSION: In computer science, data compression involves encoding information using fewer bits than the original representation. Compression is useful because it helps reduce the consumption of resources such as storage space or transmission capacity. Because compressed data must be decompressed before it can be used, this extra processing imposes computational or other costs.


1.1 Classification of Compression:

a) Static/non-adaptive compression: A static method is one in which the mapping from the set of messages to the set of codewords is fixed before transmission begins, so that a given message is represented by the same codeword every time it appears in the message ensemble. The classic static defined-word scheme is Huffman coding.
b) Dynamic/adaptive compression: A code is dynamic if the mapping from the set of messages to the set of codewords changes over time.

1.2 Data Compression Methods:

a) Lossless Compression: Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. It is possible because most real-world data has statistical redundancy. For example, an image may have areas of color that do not change over several pixels; instead of coding “red pixel, red pixel, …”, the data may be encoded as “279 red pixels”. Lossless compression is used in cases where it is important that the original and the decompressed data be identical, or where deviations from the original data could be deleterious. Typical examples are executable programs, text documents, and source code. Some image file formats, like PNG or GIF, use only lossless compression.
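The round-trip property described above can be checked with any lossless codec. The following is a minimal sketch that uses Python's standard `zlib` module purely as a stand-in for such a codec; the repeated "red pixel" input is an illustrative assumption, not data from this paper.

```python
import zlib

# Highly redundant input: the same short phrase repeated many times
# (an assumed example chosen only to make the redundancy obvious).
original = b"red pixel, " * 279

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

Because the input is almost pure repetition, the compressed form is much smaller, and decompression returns exactly the original bytes.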
b) Lossy Compression: In information technology, lossy compression is a data encoding method that compresses data by discarding (losing) some of it. The procedure aims to minimize the amount of data that needs to be held, handled, and/or transmitted by a computer. Lossy compression is most commonly used to compress multimedia data (audio, video, and still images), especially in applications such as streaming media and internet telephony. In a photo of a sunset over the sea, for example, there are going to be groups of pixels with the same color value, which can be reduced. Lossy algorithms tend to be more complex; as a result, they achieve better results for bitmaps and can accommodate the loss of data. The compressed file is an approximation of the original data. One disadvantage of lossy compression is that if a file is compressed repeatedly, its quality degrades drastically.

3. Lossless Compression Algorithms:

Run-Length Encoding (RLE): RLE is a lossless algorithm that offers decent compression ratios only for specific types of data.

How RLE works: RLE is probably the easiest compression algorithm. It replaces sequences of the same data value within a file by a count and a single value. For example, the string AAAABBBCC can be stored as 4A3B2C. There are many different run-length encoding schemes; this example only demonstrates the basic principle. Sometimes the implementation of RLE is adapted to the type of data being compressed.
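The idea of replacing each run by a count and a single value can be sketched in a few lines. This is one possible scheme among the many mentioned above, not a standard format; the function names `rle_encode` and `rle_decode` and the sample string are assumptions made for illustration.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of identical characters by a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (count, value) pairs back into the original string."""
    return "".join(value * count for count, value in pairs)

pairs = rle_encode("AAAABBBCC")
print(pairs)              # [(4, 'A'), (3, 'B'), (2, 'C')]
print(rle_decode(pairs))  # AAAABBBCC
```

Decoding simply repeats each value by its count, so the round trip is lossless.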

4. Complexity and Data Compression: We are used to measuring the complexity of an algorithm in terms of time, and we usually try to find the fastest implementation, as with search algorithms. Here, compressing quickly matters less than compressing as much as possible, so that the output is as small as possible without losing data. A great feature of run-length encoding is that it is easy to implement.

5. Advantages and disadvantages: This algorithm is very easy to implement and does not require much CPU power. RLE compression is only efficient with files that contain lots of repetitive data. These can be text files if they contain many spaces for indenting, but line-art images that contain large white or black areas are far more suitable. Computer-generated color images (e.g. architectural drawings) can also give fair compression ratios.

Where is RLE compression used? RLE compression can be used in the following file formats: PDF files
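The dependence on repetitive data is easy to see by comparing output sizes. The sketch below assumes a simple byte-oriented scheme in which every run is stored as a (count, value) byte pair with runs capped at 255; both the scheme and the two sample inputs are illustrative assumptions.

```python
def rle_bytes(data):
    """Encode as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

# Large uniform areas, as in a black-and-white line-art scan line.
line_art = b"\x00" * 200 + b"\xff" * 50 + b"\x00" * 200
# Ordinary text with almost no repeated neighbours.
text = b"compression"

print(len(line_art), "->", len(rle_bytes(line_art)))  # 450 -> 6
print(len(text), "->", len(rle_bytes(text)))          # 11 -> 20
```

With this scheme the line-art input shrinks dramatically, while the text input nearly doubles in size because almost every character forms a run of length one.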

6. HUFFMAN CODING: Huffman coding is a popular method for compressing data with variable-length codes. Given a set of data symbols (an alphabet) and their frequencies of occurrence (or, equivalently, their probabilities), the method constructs a set of variable-length codewords with the shortest average length and assigns them to the symbols. Huffman coding serves as the basis for several applications implemented on popular platforms. Some programs use just the Huffman method, while others use it as one step in a multistep compression process.

7. Huffman Encoding: The Huffman encoding algorithm starts by constructing a list of all the alphabet symbols in descending order of their probabilities. It then constructs, from the bottom up, a binary tree with a symbol at every leaf. This is done in steps, where at each step the two symbols with the smallest probabilities are selected, added to the top of the partial tree, deleted from the list, and replaced with an auxiliary symbol representing the two original symbols. When the list is reduced to just one auxiliary symbol (representing the entire alphabet), the tree is complete. The tree is then traversed to determine the codewords of the symbols.
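The bottom-up tree construction can be sketched as follows. This is a minimal illustration, not a reference implementation: it uses a priority queue (`heapq`) instead of a sorted list to pick the two smallest weights, uses symbol counts in place of probabilities, and the function name `huffman_codes` and the sample message are assumptions.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table (symbol -> bit string) for `text`."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (weight, tie_breaker, tree). A tree is either a
    # symbol (leaf) or a (left, right) pair (auxiliary symbol).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate one-symbol alphabet
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # the two smallest weights
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}

    def walk(tree, prefix):
        # Traverse the finished tree: left edge = "0", right edge = "1".
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes

message = "ABBCBCABABCAABCAAB"
codes = huffman_codes(message)
encoded_bits = sum(len(codes[ch]) for ch in message)
print(codes)
print(encoded_bits, "bits instead of", len(message) * 8)  # 29 bits instead of 144
```

Which of the two equally frequent symbols receives the 1-bit codeword depends on tie-breaking, but the total encoded length is the same either way. The count excludes the cost of transmitting the code table itself.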

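8. LZ78 Compression: Example: Encode the string ABBCBCABABCAABCAAB using the LZ78 algorithm.

1. A is not in the Dictionary; insert it.
2. B is not in the Dictionary; insert it.
3. B is in the Dictionary. BC is not in the Dictionary; insert it.
4. B is in the Dictionary. BC is in the Dictionary. BCA is not in the Dictionary; insert it.
5. B is in the Dictionary. BA is not in the Dictionary; insert it.
6. B is in the Dictionary. BC is in the Dictionary. BCA is in the Dictionary. BCAA is not in the Dictionary; insert it.
7. B is in the Dictionary. BC is in the Dictionary. BCA is in the Dictionary. BCAA is in the Dictionary. BCAAB is not in the Dictionary; insert it.

LZ78 Compression: Number of bits transmitted

Uncompressed string: ABBCBCABABCAABCAAB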

Number of bits = Total number of characters * 8 = 18 * 8 = 144 bits

Suppose the codewords are indexed starting from 1:

Codeword index:  1       2       3       4       5       6       7
Codeword:        (0, A)  (0, B)  (2, C)  (3, A)  (2, A)  (4, A)  (6, B)

Each codeword consists of an integer and a character. The character is represented by 8 bits. The number of bits n required to represent the integer part of the codeword with index i is given by:

n = 1 if i = 1
n = ⌈log2 i⌉ if i > 1

Bits: (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8) + (3 + 8) = 71 bits

The actual compressed message is: 0A0B10C11A010A100A110B
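The codewords and the bit count above can be reproduced with a short sketch. The function names are assumptions, as is the handling of input that ends in the middle of a known phrase (an empty character is emitted). The bit-width rule in `index_bits` is inferred from the per-codeword bit counts in the calculation above: 1 bit for the first codeword, otherwise the ceiling of log2 of the codeword index.

```python
from math import ceil, log2

def lz78_encode(text):
    """Return (index, char) codewords; index 0 means 'empty prefix'."""
    dictionary = {}            # phrase -> index, indices start at 1
    codewords = []
    prefix = ""
    for ch in text:
        if prefix + ch in dictionary:
            prefix += ch       # keep extending the current match
        else:
            codewords.append((dictionary.get(prefix, 0), ch))
            dictionary[prefix + ch] = len(dictionary) + 1
            prefix = ""
    if prefix:                 # assumed convention for a trailing known phrase
        codewords.append((dictionary[prefix], ""))
    return codewords

def lz78_decode(codewords):
    """Rebuild the text by re-creating the dictionary from the codewords."""
    phrases = [""]             # phrases[0] is the empty prefix
    out = []
    for index, ch in codewords:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

def index_bits(i):
    """Bits for the integer part of the codeword with index i."""
    return 1 if i == 1 else ceil(log2(i))

message = "ABBCBCABABCAABCAAB"
codes = lz78_encode(message)
total = sum(index_bits(i) + 8 for i in range(1, len(codes) + 1))
print(codes)
print(total, "bits instead of", len(message) * 8)   # 71 bits instead of 144
print(lz78_decode(codes) == message)                # True
```

The decoder needs no transmitted dictionary: it starts from an empty one and inserts each decoded phrase, mirroring the encoder.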

9. Decompression Algorithm: Dictionary empty

LZW: Published by Terry Welch in 1984, LZW basically applies the LZSS principle of not explicitly transmitting the next nonmatching symbol to the LZ78 algorithm. The only remaining output of this improved algorithm is fixed-length references to the dictionary (indexes). If the message to be encoded consists of only one character, LZW outputs the code for this character; otherwise, it inserts two- or multi-character, overlapping, distinct patterns of the message to be encoded in a Dictionary. Overlapping: The last character of a pattern is the first character of the next pattern.

10. Algorithm:

Initialize Dictionary with 256 single character strings and their corresponding ASCII codes;
Prefix ← first input character;
CodeWord ← 256;
while(not end of character stream){
    Char ← next input character;
    if(Prefix + Char exists in the Dictionary)
        Prefix ← Prefix + Char;
    else{
        Output: the code for Prefix;
        insertInDictionary( (CodeWord , Prefix + Char) );
        CodeWord++;
        Prefix ← Char;
    }
}
Output: the code for Prefix;

Example: Compression using LZW

Encode the string BABAABAAA by the LZW encoding algorithm.

1. BA is not in the Dictionary; insert BA, output the code for its prefix: code(B)
2. AB is not in the Dictionary; insert AB, output the code for its prefix: code(A)
3. BA is in the Dictionary. BAA is not in the Dictionary; insert BAA, output the code for its prefix: code(BA)
4. AB is in the Dictionary. ABA is not in the Dictionary; insert ABA, output the code for its prefix: code(AB)
5. AA is not in the Dictionary; insert AA, output the code for its prefix: code(A)
6. AA is in the Dictionary and it is the last pattern; output its code: code(AA)

The compressed message is: <66><65><256><257><65><260>

LZW: Number of bits transmitted
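The encoding procedure above translates almost line for line into runnable code. The sketch below assumes a non-empty input made of 8-bit characters and returns the codes as a list of integers rather than a packed bit stream; the function name `lzw_encode` is an assumption.

```python
def lzw_encode(message):
    """Encode a non-empty string of 8-bit characters as a list of LZW codes."""
    # Codes 0..255 stand for the single characters themselves.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    prefix = message[0]
    output = []
    for ch in message[1:]:
        if prefix + ch in dictionary:
            prefix += ch                      # keep extending the match
        else:
            output.append(dictionary[prefix])
            dictionary[prefix + ch] = next_code
            next_code += 1
            prefix = ch                       # patterns overlap by one character
    output.append(dictionary[prefix])         # flush the last pattern
    return output

print(lzw_encode("BABAABAAA"))   # [66, 65, 256, 257, 65, 260]
```

The result matches the compressed message derived step by step in the example.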

11. Decoding algorithm:

Initialize Dictionary with 256 ASCII codes and corresponding single character strings as their translations;
PreviousCodeWord ← first input code;
Output: string(PreviousCodeWord);
Char ← character(first input code);
CodeWord ← 256;
while(not end of code stream){
    CurrentCodeWord ← next input code;
    if(CurrentCodeWord exists in the Dictionary)
        String ← string(CurrentCodeWord);
    else
        String ← string(PreviousCodeWord) + Char;
    Output: String;
    Char ← first character of String;
    insertInDictionary( (CodeWord , string(PreviousCodeWord) + Char) );
    PreviousCodeWord ← CurrentCodeWord;
    CodeWord++;
}

Summary of LZW decoding algorithm:

output: string(first CodeWord);
while(there are more CodeWords){
    if(CurrentCodeWord is in the Dictionary)
        output: string(CurrentCodeWord);
    else
        output: PreviousOutput + PreviousOutput first character;
    insert in the Dictionary: PreviousOutput + CurrentOutput first character;
}

Example: LZW Decompression

Use LZW to decompress the output sequence <66> <65> <256> <257> <65> <260>

1. 66 is in the Dictionary; output string(66), i.e. B
2. 65 is in the Dictionary; output string(65), i.e. A, insert BA
3. 256 is in the Dictionary; output string(256), i.e. BA, insert AB
4. 257 is in the Dictionary; output string(257), i.e. AB, insert BAA
5. 65 is in the Dictionary; output string(65), i.e. A, insert ABA
6. 260 is not in the Dictionary; output previous output + previous output first character: AA, insert AA
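The summary form of the decoding algorithm can be sketched as follows. The function name `lzw_decode` is an assumption, and the sketch expects a non-empty, well-formed code sequence; it does not validate that an unknown code is exactly the next code to be assigned.

```python
def lzw_decode(codes):
    """Decode a non-empty list of LZW codes back into a string."""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            current = dictionary[code]
        else:
            # The code is not defined yet: it must stand for the previous
            # output followed by the previous output's first character.
            current = previous + previous[0]
        output.append(current)
        # Mirror the encoder: previous output + first character of current.
        dictionary[next_code] = previous + current[0]
        next_code += 1
        previous = current
    return "".join(output)

print(lzw_decode([66, 65, 256, 257, 65, 260]))   # BABAABAAA
```

The last code, 260, exercises the special case: it is used before the decoder has inserted it, so the string is rebuilt from the previous output.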

