6.2. Activation quantization

Blueoil can quantize an activation function by passing a callable activation quantizer and its keyword arguments activation_quantizer_kwargs to the network class.

class SampleNetworkQuantize(SampleNetwork):
    """Quantize Sample Network."""

    def __init__(
            self,
            quantize_first_convolution=True,
            quantize_last_convolution=True,
            activation_quantizer=None,
            activation_quantizer_kwargs={},
            weight_quantizer=None,
            weight_quantizer_kwargs={},
            *args,
            **kwargs
    ):
        super().__init__(*args, **kwargs)

        self.quantize_first_convolution = quantize_first_convolution
        self.quantize_last_convolution = quantize_last_convolution

        assert callable(weight_quantizer)
        assert callable(activation_quantizer)

        # Build the quantizer callables from the given factories and their
        # keyword arguments; the base network applies them to weights and
        # activations when constructing the graph.
        self.weight_quantization = weight_quantizer(**weight_quantizer_kwargs)
        self.activation = activation_quantizer(**activation_quantizer_kwargs)
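
As a concrete example, such a network could be instantiated as shown below. This is a minimal sketch: the module path blueoil.nn.quantizations and the bit/max_value settings are assumptions and may differ between Blueoil versions.

# Assumed import path; adjust to your Blueoil version.
from blueoil.nn.quantizations import (
    binary_mean_scaling_quantizer,
    linear_mid_tread_half_quantizer,
)

network = SampleNetworkQuantize(
    activation_quantizer=linear_mid_tread_half_quantizer,
    activation_quantizer_kwargs={"bit": 2, "max_value": 2.0},
    weight_quantizer=binary_mean_scaling_quantizer,
    weight_quantizer_kwargs={},
)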

6.2.1. Activation quantizer

Currently, Blueoil provides only one activation quantizer.

6.2.1.1. Linear mid tread half quantizer (LinearMidTreadHalfQuantizer)

This function creates a linear mid tread half quantizer. If backward is provided, it is used in backpropagation instead of the default backward shown below.

This quantization method is a variant of the DoReFa-Net [1] activation quantization; the difference from DoReFa-Net is that max_value can be changed.

Forward is:

\[
\begin{split}
\mathbf{X} & = \text{clip}\big(\mathbf{X}, 0, max\_value\big) \\
\mathbf{Y} & =
    \begin{cases}
    \mathbf{X}, & \text{if $bit$ is 32} \\
    \frac{\text{round}\big(\frac{\mathbf{X}}{max\_value} \cdot (2^{bit}-1)\big)}{2^{bit}-1} \cdot max\_value, & \text{otherwise}
    \end{cases}
\end{split}
\]
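
The forward pass can be sketched in NumPy as follows. This is an illustrative re-implementation, not the Blueoil code; the function name forward is hypothetical.

import numpy as np

def forward(x, bit=2, max_value=2.0):
    # Clip the input to [0, max_value].
    x = np.clip(x, 0.0, max_value)
    if bit == 32:
        return x
    # Scale to [0, 1], round to 2^bit - 1 uniform levels, and scale back.
    n = 2 ** bit - 1
    return np.round(x / max_value * n) / n * max_value

# With bit=2 and max_value=2.0 the representable outputs are 0, 2/3, 4/3 and 2.
print(forward(np.array([-0.5, 0.4, 1.1, 3.0])))  # approx. [0. 0.667 1.333 2.]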

Default backward is:

\[
\frac{\partial Loss}{\partial \mathbf{X}} =
    \begin{cases}
    \frac{\partial Loss}{\partial \mathbf{Y}}, & \text{if $0 < \mathbf{X} < max\_value$} \\
    0, & \text{otherwise}
    \end{cases}
\]
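
In other words, the incoming gradient is passed through unchanged where the input fell strictly inside (0, max_value) and zeroed where the forward clip saturated. Continuing the sketch above (illustrative only; the name backward is hypothetical):

def backward(grad_y, x, max_value=2.0):
    # Propagate the gradient only where the forward clip did not saturate.
    mask = (x > 0.0) & (x < max_value)
    return grad_y * mask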

Reference

[1] Zhou, S., Wu, Y., Ni, Z., Zhou, X., Wen, H., & Zou, Y. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. arXiv:1606.06160.