IIT Lecture Note 1

Introduction to Information Theory, Compression, and Communicating over a Noisy Channel

What is Information Theory?

There are three central questions:

  1. How can we measure information?
    • Measured in bits, typically via entropy; closely connected to ideas in machine learning.
    • Information is often quantified using entropy, which measures the unpredictability or randomness in a data source. The higher the entropy, the more “surprise” or uncertainty there is in the source, which corresponds to more information (a small entropy calculation is sketched just after this list).
    • Related: How to ask the most informative questions.
  2. How can we compress a data source?
    • Lossless: ZIP, GIF, FLAC.
    • Lossy: JPG, MP3, MP4.
    • Compression leverages patterns and redundancies in data to reduce its size. Lossless methods like Huffman coding and arithmetic coding optimise symbol representation, while lossy methods (e.g. JPEG) reduce data fidelity for more significant compression. Shannon’s source coding theorem provides a fundamental limit: the average code length per symbol cannot be less than the source’s entropy.
  3. How can we reliably send information over noisy channels?
    • Reliable communication over noisy channels is addressed through error-correcting codes. Shannon’s noisy-channel coding theorem shows it’s possible to transmit data with an arbitrarily low error rate, provided the transmission rate stays below the channel’s capacity. Codes like Reed-Solomon or Hamming codes detect and correct errors, ensuring data integrity (a toy repetition-code simulation is also sketched after this list).
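
To make the first question concrete, here is a minimal Python sketch of the entropy calculation described above; the function name and the example distributions are illustrative choices, not part of the notes.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit, maximally unpredictable
print(entropy([0.9, 0.1]))   # biased coin: ~0.47 bits, far less "surprise"
```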
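For the third question, the following is a rough simulation sketch (not Reed-Solomon or Hamming): a (3,1) repetition code sent over a binary symmetric channel that flips each bit with probability p. The function names, the flip probability, and the message length are assumptions made for this example.

```python
import random

def encode(bits, n=3):
    """(n,1) repetition code: repeat each bit n times."""
    return [b for bit in bits for b in [bit] * n]

def bsc(bits, p):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def decode(received, n=3):
    """Majority-vote decoding of each block of n received bits."""
    return [int(sum(received[i:i + n]) > n // 2) for i in range(0, len(received), n)]

random.seed(0)
message = [random.randint(0, 1) for _ in range(10_000)]
decoded = decode(bsc(encode(message), p=0.1))
error_rate = sum(m != d for m, d in zip(message, decoded)) / len(message)
print(error_rate)  # around 3p^2 ≈ 0.028, well below the raw flip rate of 0.1
```

The price is rate: three channel bits are spent per message bit. Shannon’s theorem says far better trade-offs are achievable as long as the rate stays below the channel capacity.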

Compression

Compression is the process of encoding information using fewer bits than the original representation. The goal is to reduce redundancy in data to save storage space or transmission time, without losing essential information.

Lossless Compression: This type of compression allows data to be reconstructed perfectly from the compressed form. Techniques include Huffman coding, Run-Length Encoding, and Lempel-Ziv (LZ) algorithms, as seen in file formats like PNG and ZIP.
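
As a rough sketch of how a lossless coder exploits symbol statistics, the following builds a Huffman code from character frequencies; the function name and the sample string are illustrative, and it assumes the input has at least two distinct symbols.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code mapping each symbol to a bitstring."""
    # Each heap entry: (frequency, tie-breaker, partial code table).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prefix the two least-frequent subtrees with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

code = huffman_code("abracadabra")
print(code)                                       # frequent symbols get shorter codes
print("".join(code[s] for s in "abracadabra"))    # the compressed bitstring
```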

Lossy Compression: In contrast, lossy compression reduces file size by permanently discarding some information, which results in an approximation of the original data. This is common in multimedia formats like JPEG and MP3, where a perfect reconstruction is unnecessary for human perception.

Let:

  • $\mathcal{A} = \{t, u, v, w\}$ be a set of possible messages; and
  • a code word $c \in \{0,1\}^{l}$ be a binary string of length $l$.
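
To tie these definitions to the source coding theorem mentioned earlier, here is a small sketch; the probability assignment and the specific code words are assumptions chosen for the example (the notes only define the alphabet and the code word length).

```python
import math

# Illustrative only: the alphabet {t, u, v, w} comes from the notes, but this
# probability assignment and this prefix-free code are assumed for the example.
probs = {"t": 0.5, "u": 0.25, "v": 0.125, "w": 0.125}
code  = {"t": "0", "u": "10", "v": "110", "w": "111"}

entropy = -sum(p * math.log2(p) for p in probs.values())
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(entropy, avg_len)  # both 1.75 bits: this code meets the source-coding bound exactly
```

Because every probability here is a power of 1/2, each code word length can equal $-\log_2 p$ exactly, so the average code length matches the entropy.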