Skip to content

komm.VariableToFixedCode

General variable-to-fixed length code. A variable-to-fixed length code with target alphabet $\mathcal{Y}$, source alphabet $\mathcal{X}$, and target block size $n$ is defined by a (possibly partial) decoding mapping $\mathrm{Dec}: \mathcal{Y}^n \rightharpoonup \mathcal{X}^+$, where the domain is the set of all $n$-tuples with entries in $\mathcal{Y}$, and the co-domain is the set of all finite-length, non-empty tuples with entries in $\mathcal{X}$. The elements in the image of $\mathrm{Dec}$ are called sourcewords.

Note

Here, for simplicity, we assume that the source alphabet is $\mathcal{X} = [0 : |\mathcal{X}|)$ and the target alphabet is $\mathcal{Y} = [0 : |\mathcal{Y}|)$, where $|\mathcal{X}| \geq 2$ and $|\mathcal{Y}| \geq 2$ are called the source cardinality and target cardinality, respectively.

from_dec_mapping() classmethod

Constructs a variable-to-fixed length code from the decoding mapping $\Dec$.

Parameters:

  • dec_mapping (dict[Word, Word])

    The decoding mapping $\Dec$. Must be a dictionary of length at most $|\mathcal{Y}|^n$ whose keys are $n$-tuples of integers in $\mathcal{Y}$ and whose values are non-empty tuples of integers in $\mathcal{X}$.

Notes

The target block size $n$ is inferred from the domain of the decoding mapping, and the target and source cardinalities $|\mathcal{Y}|$ and $|\mathcal{X}|$ are inferred from the maximum values in the domain and co-domain, respectively.

Examples:

>>> code = komm.VariableToFixedCode.from_dec_mapping({
...     (0, 0): (0, 0, 0),
...     (0, 1): (0, 0, 1),
...     (1, 0): (0, 1),
...     (1, 1): (1,),
... })
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 2, 2)
>>> code.sourcewords
[(0, 0, 0), (0, 0, 1), (0, 1), (1,)]
>>> code = komm.VariableToFixedCode.from_dec_mapping({
...     (0, 0, 0): (1, ),
...     (0, 0, 1): (2, ),
...     (0, 1, 0): (0, 1),
...     (0, 1, 1): (0, 2),
...     (1, 0, 0): (0, 0, 0),
...     (1, 0, 1): (0, 0, 1),
...     (1, 1, 0): (0, 0, 2),
... })  # Incomplete mapping
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 3, 3)
>>> code.sourcewords
[(1,), (2,), (0, 1), (0, 2), (0, 0, 0), (0, 0, 1), (0, 0, 2)]

from_sourcewords() classmethod

Constructs a variable-to-fixed length code from the target cardinality $|\mathcal{Y}|$ and a list of sourcewords.

Parameters:

  • target_cardinality (int)

    The target cardinality $|\mathcal{Y}|$. Must be an integer greater than or equal to $2$.

  • sourcewords (list[Word])

    The sourcewords of the code. Must be a list of length at most $|\mathcal{Y}|^n$ containing tuples of integers in $\mathcal{X}$. The tuple in position $i$ must be equal to $\mathrm{Dec}(\mathbf{y})$, where $\mathbf{y}$ is the $i$-th element in the lexicographic ordering of $\mathcal{Y}^n$.

Note

The target block size $n$ is inferred from the length of the sourcewords, and the source cardinality $|\mathcal{X}|$ is inferred from the maximum value in the sourcewords.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(
...     target_cardinality=2,
...     sourcewords=[(0, 0, 0), (0, 0, 1), (0, 1), (1,)],
... )
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 2, 2)
>>> code.dec_mapping
{(0, 0): (0, 0, 0),
 (0, 1): (0, 0, 1),
 (1, 0): (0, 1),
 (1, 1): (1,)}
>>> code = komm.VariableToFixedCode.from_sourcewords(
...     target_cardinality=2,
...     sourcewords=[
...         (1,), (2,), (0, 1), (0, 2), (0, 0, 0), (0, 0, 1), (0, 0, 2)
...     ],
... )
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 3, 3)
>>> code.dec_mapping
{(0, 0, 0): (1,),
 (0, 0, 1): (2,),
 (0, 1, 0): (0, 1),
 (0, 1, 1): (0, 2),
 (1, 0, 0): (0, 0, 0),
 (1, 0, 1): (0, 0, 1),
 (1, 1, 0): (0, 0, 2)}

target_cardinality int cached property

The target cardinality $|\mathcal{Y}|$ of the code. It is the number of symbols in the target alphabet.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.target_cardinality
2

source_cardinality int cached property

The source cardinality $|\mathcal{X}|$ of the code. It is the number of symbols in the source alphabet.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.source_cardinality
2

target_block_size int cached property

The target block size $n$ of the code. It is the number of symbols in each target block.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.target_block_size
2

size int cached property

The number of sourcewords in the code. It is less than or equal to $|\mathcal{Y}|^n$.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.size
3

dec_mapping dict[Word, Word] cached property

The decoding mapping $\mathrm{Dec}$ of the code. It is a dictionary of length at most $|\mathcal{Y}|^n$ whose keys are $n$-tuples of integers in $\mathcal{Y}$ and whose values are the corresponding sourcewords.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.dec_mapping
{(0, 0): (0,), (0, 1): (1,), (1, 0): (0, 1)}

sourcewords list[Word] cached property

The sourcewords of the code. They correspond to the image of the decoding mapping $\mathrm{Dec}$.

Examples:

>>> code = komm.VariableToFixedCode.from_dec_mapping({
...     (0, 0): (0,),
...     (0, 1): (1,),
...     (1, 0): (0, 1),
... })
>>> code.sourcewords
[(0,), (1,), (0, 1)]

is_fully_covering() cached

Returns whether the code is fully covering. A code is fully covering if every possible source sequence has a prefix that is a sourceword.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1, 0), (1, 1)])
>>> code.is_fully_covering()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.is_fully_covering()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0, 0), (0, 1)])
>>> code.is_fully_covering()  # (1,) is not covered
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0, 1)])
>>> code.is_fully_covering()  # (1,) is not covered
False

is_uniquely_encodable() cached

Returns whether the code is uniquely encodable. A code is uniquely encodable if there is a unique way to parse any concatenation of sourcewords.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1, 0), (1, 1)])
>>> code.is_uniquely_encodable()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.is_uniquely_encodable()  # 01 can be parsed as 0|1 or 01
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0, 0), (0, 1)])
>>> code.is_uniquely_encodable()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0, 1)])
>>> code.is_uniquely_encodable()
True

is_prefix_free() cached

Returns whether the code is prefix-free. A code is prefix-free if no sourceword is a prefix of any other sourceword.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1, 0), (1, 1)])
>>> code.is_prefix_free()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.is_prefix_free()
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0, 0), (0, 1)])
>>> code.is_prefix_free()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0, 1)])
>>> code.is_prefix_free()
False

rate()

Computes the expected rate $R$ of the code, considering a given (first-order) pmf $p$ over $\mathcal{X}$. This quantity is given by $$ R = \frac{n}{\bar{k}}, $$ and $\bar{k}$ is the expected sourceword length, assuming iid source symbols drawn from $p$. It is measured in $|\mathcal{Y}|$-ary digits per source symbol.

Parameters:

  • pmf (ArrayLike)

    The pmf $p$ to be considered. It must be a one-dimensional array of floats of size $|\mathcal{X}|$. The elements must be non-negative and sum to $1$.

Returns:

  • rate (floating)

    The expected rate $R$ of the code.

Examples:

>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.rate([2/3, 1/3])
np.float64(1.3846153846153846)

encode()

Encodes a sequence of source symbols using the code, which must be fully covering and uniquely encodable. When the input sequence ends with symbols that form only a partial match with any sourceword, the encoder will complete this last block using any valid sourceword that starts with these remaining symbols.

Warning

Encoding for non-prefix-free codes is not implemented yet.

Parameters:

  • input (ArrayLike)

    The sequence of symbols to be encoded. Must be a 1D-array with elements in $\mathcal{X}$.

Returns:

  • output (Array1D[integer])

    The sequence of encoded symbols. It is a 1D-array with elements in $\mathcal{Y}$ with a length that is a multiple of $n$.

Examples:

>>> code = komm.VariableToFixedCode.from_dec_mapping({
...     (0, 0): (0, 0, 0),
...     (0, 1): (0, 0, 1),
...     (1, 0): (0, 1),
...     (1, 1): (1,),
... })
>>> code.encode([1, 0, 0, 0])  # Parsed as 1|000
array([1, 1, 0, 0])
>>> code.encode([1, 0, 0])  # Incomplete input, completed as 1|000
array([1, 1, 0, 0])
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0, 0), (0, 1)])
>>> code.encode([1, 0, 0, 0])  # Code is not fully covering
Traceback (most recent call last):
...
ValueError: code is not fully covering
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.encode([1, 0, 0, 0])  # Code is not uniquely encodable
Traceback (most recent call last):
...
ValueError: code is not uniquely encodable

decode()

Decodes a sequence of target symbols using the code, which must be fully covering and uniquely encodable.

Parameters:

  • input (ArrayLike)

    The sequence of symbols to be decoded. Must be a 1D-array with elements in $\mathcal{Y}$ and have a length that is a multiple of $n$. Also, the sequence must be a concatenation of target words (i.e., the output of the encode method).

Returns:

  • Array1D[integer]

    The sequence of decoded symbols. It is a 1D-array with elements in $\mathcal{X}$.

Examples:

>>> code = komm.VariableToFixedCode.from_dec_mapping({
...     (0, 0): (0,),
...     (0, 1): (1,),
...     (1, 0): (2,),
... })
>>> code.decode([0, 0, 1, 0])
array([0, 2])
>>> code.decode([1, 1, 0, 0, 1])  # Not a multiple of target block size
Traceback (most recent call last):
...
ValueError: length of input must be a multiple of block size 2 (got 5)
>>> code.decode([0, 0, 1, 1])  # 11 is not a valid target word
Traceback (most recent call last):
...
ValueError: input contains invalid word
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0, 0), (0, 1)])
>>> code.decode([0, 0, 0, 1])  # Code is not fully covering
Traceback (most recent call last):
...
ValueError: code is not fully covering
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0, 1)])
>>> code.decode([0, 0, 0, 1])  # Code is not uniquely encodable
Traceback (most recent call last):
...
ValueError: code is not uniquely encodable