6.2. Activation quantization

Blueoil can quantize an activation function by passing a callable activation quantizer (activation_quantizer) and its keyword arguments (activation_quantizer_kwargs) to the network class. The snippet below shows the final layers of a sample network definition (last convolution, global average pooling, and output reshape):

        # Last convolution block: a 1x1 convolution projecting to `num_classes` channels.
        self.block_last = conv2d("block_last", self.block_1, filters=self.num_classes, kernel_size=1,
                                 activation=None, use_bias=True, is_debug=self.is_debug,
                                 kernel_initializer=tf.compat.v1.random_normal_initializer(mean=0.0, stddev=0.01),
                                 data_format=channel_data_format)

        # Global average pooling over the spatial dimensions (assumes NHWC layout).
        h = self.block_last.get_shape()[1].value
        w = self.block_last.get_shape()[2].value
        self.pool = tf.compat.v1.layers.average_pooling2d(name='global_average_pool', inputs=self.block_last,
                                                          pool_size=[h, w], padding='VALID', strides=1,
                                                          data_format=channel_data_format)
        # Flatten the pooled features to (batch_size, num_classes).
        self.base_output = tf.reshape(self.pool, [-1, self.num_classes], name="pool_reshape")

        return self.base_output

The quantized network's constructor then receives the quantizers and their keyword arguments:

    def __init__(
            self,
            quantize_first_convolution=True,
            quantize_last_convolution=True,
            activation_quantizer=None,
            activation_quantizer_kwargs={},
            weight_quantizer=None,
            weight_quantizer_kwargs={},
            *args,
            **kwargs
    ):
        """
        Args:
            quantize_first_convolution(bool): use quantization in first conv.
            quantize_last_convolution(bool): use quantization in last conv.
            activation_quantizer (callable): activation quantizer.
            activation_quantizer_kwargs (dict): keyword arguments for activation quantizer.
            weight_quantizer (callable): weight quantizer.
            weight_quantizer_kwargs (dict): keyword arguments for weight quantizer.
        """
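For illustration, a minimal sketch of constructing such a network with the linear mid tread half quantizer; the class name SampleNetworkQuantize and the import path are assumptions here and may differ between Blueoil versions:

    # Hedged usage sketch: `SampleNetworkQuantize` and the import path are
    # assumptions for illustration; check your Blueoil version for exact names.
    from blueoil.quantizations import linear_mid_tread_half_quantizer

    network = SampleNetworkQuantize(
        activation_quantizer=linear_mid_tread_half_quantizer,
        activation_quantizer_kwargs={"bit": 2, "max_value": 2.0},
    )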

6.2.1. Activation quantizer

Currently, Blueoil provides only one activation quantizer.

6.2.1.1. Linear mid tread half quantizer (LinearMidTreadHalfQuantizer)

This function creates a linear mid tread half quantizer. If a backward function is provided, it is used in backpropagation in place of the default.

This quantization method is a variant of the DoReFa-Net [1] activation quantization; the difference from DoReFa-Net is that max_value can be changed.

The forward pass is:

\[\begin{split}\mathbf{X} & = \text{clip}\big(\mathbf{X}, 0, max\_value\big)\\ \mathbf{Y} & = \begin{cases} \mathbf{X}, & \text{if $bit$ is 32} \\ \frac{\text{round}\big(\frac{\mathbf{X}}{max\_value} \cdot (2^{bit}-1)\big)}{2^{bit}-1} \cdot max\_value, & otherwise \end{cases}\end{split}\]
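As a concrete illustration (a sketch of the formula above, not Blueoil's implementation), the forward pass in NumPy:

    # Illustrative NumPy sketch of the forward pass; not Blueoil's code.
    import numpy as np

    def forward(x, bit=2, max_value=2.0):
        x = np.clip(x, 0.0, max_value)   # clip(X, 0, max_value)
        if bit == 32:                    # 32 bits: pass through unquantized
            return x
        n = 2 ** bit - 1                 # number of quantization steps
        return np.round(x / max_value * n) / n * max_value

With bit=2 and max_value=2.0, inputs are clipped to [0, 2] and snapped to the four levels {0, 2/3, 4/3, 2}.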

The default backward pass is:

\[\begin{split}\frac{\partial Loss}{\partial \mathbf{X}} = \begin{cases} \frac{\partial Loss}{\partial \mathbf{Y}}, & \text{if $0 < \mathbf{X} < max\_value$} \\ 0, & otherwise \end{cases}\end{split}\]
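This default backward is a straight-through estimator over the clipping range: the incoming gradient passes through unchanged where the input was inside (0, max_value) and is zeroed elsewhere. A matching NumPy sketch:

    # Illustrative sketch of the default backward; not Blueoil's code.
    import numpy as np

    def backward(grad_y, x, max_value=2.0):
        # Pass gradients through inside (0, max_value), zero them outside.
        return np.where((0 < x) & (x < max_value), grad_y, 0.0)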

Reference

[1] Zhou, Shuchang, et al. "DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients." arXiv preprint arXiv:1606.06160 (2016).