komm.VariableToFixedCode
Variable-to-fixed length code. A variable-to-fixed length code with target alphabet $\mathcal{T}$, source alphabet $\mathcal{S}$, and target block size $n$ is defined by a (possibly partial) injective decoding mapping $\mathrm{Dec} : \mathcal{T}^n \to \mathcal{S}^+$, where the domain is the set of all $n$-tuples with entries in $\mathcal{T}$, and the co-domain is the set of all finite-length, non-empty tuples with entries in $\mathcal{S}$. Here, we assume that $\mathcal{T} = [0:T)$ and $\mathcal{S} = [0:S)$, for integers $T \geq 2$ and $S \geq 2$. The elements in the image of $\mathrm{Dec}$ are called sourcewords.
from_dec_mapping
classmethod
Constructs a variable-to-fixed length code from the decoding mapping $\Dec$.
Parameters:
-
dec_mapping
(dict[Word, Word]
) –The decoding mapping $\Dec$. Must be a dictionary of length at most $S^n$ whose keys are $n$-tuples of integers in $[0:T)$ and whose values are distinct non-empty tuples of integers in $[0:S)$.
Notes
The target block size $n$ is inferred from the domain of the decoding mapping, and the target and source cardinalities $T$ and $S$ are inferred from the maximum values in the domain and co-domain, respectively.
Examples:
>>> code = komm.VariableToFixedCode.from_dec_mapping({
... (0, 0): (0, 0, 0),
... (0, 1): (0, 0, 1),
... (1, 0): (0, 1),
... (1, 1): (1,),
... })
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 2, 2)
>>> code.sourcewords
[(0, 0, 0), (0, 0, 1), (0, 1), (1,)]
>>> code = komm.VariableToFixedCode.from_dec_mapping({
... (0, 0, 0): (1, ),
... (0, 0, 1): (2, ),
... (0, 1, 0): (0, 1),
... (0, 1, 1): (0, 2),
... (1, 0, 0): (0, 0, 0),
... (1, 0, 1): (0, 0, 1),
... (1, 1, 0): (0, 0, 2),
... }) # Incomplete mapping
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 3, 3)
>>> code.sourcewords
[(1,), (2,), (0, 1), (0, 2), (0, 0, 0), (0, 0, 1), (0, 0, 2)]
from_sourcewords
classmethod
Constructs a variable-to-fixed length code from the target cardinality $T$ and a list of sourcewords.
Parameters:
-
target_cardinality
(int
) –The target cardinality $T$. Must be an integer greater than or equal to $2$.
-
sourcewords
(list[Word]
) –The sourcewords of the code. Must be a list of length at most $T^n$ containing tuples of integers in $[0:S)$, where $S$ is the source cardinality of the code. The tuple in position $i$ must be equal to $\mathrm{Dec}(v)$, where $v$ is the $i$-th element in the lexicographic ordering of $[0:T)^n$.
Note
The target block size $n$ is inferred from the length of the sourcewords, and the source cardinality $S$ is inferred from the maximum value in the sourcewords.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(
... target_cardinality=2,
... sourcewords=[(0,0,0), (0,0,1), (0,1), (1,)],
... )
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 2, 2)
>>> code.dec_mapping
{(0, 0): (0, 0, 0),
(0, 1): (0, 0, 1),
(1, 0): (0, 1),
(1, 1): (1,)}
>>> code = komm.VariableToFixedCode.from_sourcewords(
... target_cardinality=2,
... sourcewords=[(1,), (2,), (0,1), (0,2), (0,0,0), (0,0,1), (0,0,2)],
... )
>>> code.target_cardinality, code.source_cardinality, code.target_block_size
(2, 3, 3)
>>> code.dec_mapping
{(0, 0, 0): (1,),
(0, 0, 1): (2,),
(0, 1, 0): (0, 1),
(0, 1, 1): (0, 2),
(1, 0, 0): (0, 0, 0),
(1, 0, 1): (0, 0, 1),
(1, 1, 0): (0, 0, 2)}
target_cardinality: int
property
The target cardinality $T$ of the code. It is the number of symbols in the target alphabet.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.target_cardinality
2
source_cardinality: int
property
The source cardinality $S$ of the code. It is the number of symbols in the source alphabet.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.source_cardinality
2
target_block_size: int
property
The target block size $n$ of the code. It is the number of symbols in each target block.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.target_block_size
2
size: int
property
The number of sourcewords in the code. It is less than or equal to $T^n$.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.size
3
dec_mapping: dict[Word, Word]
property
The decoding mapping $\mathrm{Dec}$ of the code. It is a dictionary of length at most $T^n$ whose keys are $n$-tuples of integers in $[0:T)$ and whose values are the corresponding sourcewords.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.dec_mapping
{(0, 0): (0,), (0, 1): (1,), (1, 0): (0, 1)}
inv_dec_mapping: dict[Word, Word]
property
The inverse decoding mapping $\mathrm{Dec}^{-1}$ of the code. It is a dictionary of length at most $T^n$ whose keys are all the sourcewords of the code, and whose values are the corresponding target words.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.inv_dec_mapping
{(0,): (0, 0), (1,): (0, 1), (0, 1): (1, 0)}
sourcewords: list[Word]
property
The sourcewords of the code. They correspond to the image of the decoding mapping $\mathrm{Dec}$.
Examples:
>>> code = komm.VariableToFixedCode.from_dec_mapping({
... (0, 0): (0,),
... (0, 1): (1,),
... (1, 0): (0, 1),
... })
>>> code.sourcewords
[(0,), (1,), (0, 1)]
is_fully_covering
Returns whether the code is fully covering. A code is fully covering if every possible source sequence has a prefix that is a sourceword.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,0), (1,1)])
>>> code.is_fully_covering()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.is_fully_covering()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,0), (0,1)])
>>> code.is_fully_covering() # (1,) is not covered
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0,1)])
>>> code.is_fully_covering() # (1,) is not covered
False
is_uniquely_encodable
Returns whether the code is uniquely encodable. A code is uniquely encodable if there is a unique way to parse any concatenation of sourcewords.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,0), (1,1)])
>>> code.is_uniquely_encodable()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.is_uniquely_encodable() # 01 can be parsed as 0|1 or 01
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,0), (0,1)])
>>> code.is_uniquely_encodable()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0,1)])
>>> code.is_uniquely_encodable()
True
is_prefix_free
Returns whether the code is prefix-free. A code is prefix-free if no sourceword is a prefix of any other sourceword.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,0), (1,1)])
>>> code.is_prefix_free()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.is_prefix_free()
False
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,0), (0,1)])
>>> code.is_prefix_free()
True
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (0,1)])
>>> code.is_prefix_free()
False
rate
Computes the expected rate $R$ of the code, considering a given pmf. This quantity is given by $$ R = \frac{n}{\bar{k}}, $$ where $n$ is the target block size, and $\bar{k}$ is the expected sourceword length, assuming iid source symbols drawn from $p_X$. It is measured in $T$-ary digits per source symbol.
Parameters:
-
pmf
(Array1D[float]
) –The (first-order) probability mass function $p_X$ to be considered.
Returns:
-
rate
(float
) –The expected rate $R$ of the code.
Examples:
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.rate([2/3, 1/3])
np.float64(1.3846153846153846)
encode
Encodes a sequence of source symbols using the code, which must be fully covering and uniquely encodable. When the input sequence ends with symbols that form only a partial match with any sourceword, the encoder will complete this last block using any valid sourceword that starts with these remaining symbols.
Warning
Encoding for non-prefix-free codes is not implemented yet.
Parameters:
-
input
(ArrayLike
) –The sequence of symbols to be encoded. Must be a 1D-array with elements in $[0:S)$ (where $S$ is the source cardinality of the code).
Returns:
-
output
(NDArray[integer]
) –The sequence of encoded symbols. It is a 1D-array with elements in $[0:T)$ (where $T$ is the target cardinality of the code) with a length that is a multiple of the target block size $n$.
Examples:
>>> code = komm.VariableToFixedCode.from_dec_mapping({
... (0, 0): (0, 0, 0),
... (0, 1): (0, 0, 1),
... (1, 0): (0, 1),
... (1, 1): (1,),
... })
>>> code.encode([1, 0, 0, 0]) # Parsed as 1|000
array([1, 1, 0, 0])
>>> code.encode([1, 0, 0]) # Incomplete input, completed as 1|000
array([1, 1, 0, 0])
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,0), (0,1)])
>>> code.encode([1, 0, 0, 0]) # Code is not fully covering
Traceback (most recent call last):
...
ValueError: code is not fully covering
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.encode([1, 0, 0, 0]) # Code is not uniquely encodable
Traceback (most recent call last):
...
ValueError: code is not uniquely encodable
decode
Decodes a sequence of target symbols using the code, which must be fully covering and uniquely encodable.
Parameters:
-
input
(ArrayLike
) –The sequence of symbols to be decoded. Must be a 1D-array with elements in $[0:T)$ (where $T$ is the target cardinality of the code) and have a length that is a multiple of the target block size $n$. Also, the sequence must be a concatenation of target words (i.e., the output of the
encode
method).
Returns:
-
NDArray[integer]
–The sequence of decoded symbols. It is a 1D-array with elements in $[0:S)$ (where $S$ is the source cardinality of the code).
Examples:
>>> code = komm.VariableToFixedCode.from_dec_mapping({
... (0, 0): (0,),
... (0, 1): (1,),
... (1, 0): (2,),
... })
>>> code.decode([0, 0, 1, 0])
array([0, 2])
>>> code.decode([1, 1, 0, 0, 1]) # Not a multiple of target block size
Traceback (most recent call last):
...
ValueError: length of input must be a multiple of block size 2 (got 5)
>>> code.decode([0, 0, 1, 1]) # 11 is not a valid target word
Traceback (most recent call last):
...
ValueError: input contains invalid word
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,0), (0,1)])
>>> code.decode([0, 0, 0, 1]) # Code is not fully covering
Traceback (most recent call last):
...
ValueError: code is not fully covering
>>> code = komm.VariableToFixedCode.from_sourcewords(2, [(0,), (1,), (0,1)])
>>> code.decode([0, 0, 0, 1]) # Code is not uniquely encodable
Traceback (most recent call last):
...
ValueError: code is not uniquely encodable