6.1. Weight quantization

Blueoil can quantize the weights in a network by passing a callable weight quantizer and its keyword arguments weight_quantizer_kwargs to the network class.

        self.block_last = conv2d("block_last", self.block_1, filters=self.num_classes, kernel_size=1,
                                 activation=None, use_bias=True, is_debug=self.is_debug,
                                 kernel_initializer=tf.compat.v1.random_normal_initializer(mean=0.0, stddev=0.01),
                                 data_format=channel_data_format)

        h = self.block_last.get_shape()[1].value
        w = self.block_last.get_shape()[2].value
        self.pool = tf.compat.v1.layers.average_pooling2d(name='global_average_pool', inputs=self.block_last,
                                                          pool_size=[h, w], padding='VALID', strides=1,
                                                          data_format=channel_data_format)
        self.base_output = tf.reshape(self.pool, [-1, self.num_classes], name="pool_reshape")

        return self.base_output
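The wiring described above can be sketched without TensorFlow. `SimpleNetwork` and `make_quantizer` below are illustrative names, not Blueoil API; the sketch only shows how a callable quantizer plus its kwargs become a single quantizer instance on the network:

```python
# Minimal sketch of passing a callable quantizer and kwargs to a network class.
# SimpleNetwork and make_quantizer are hypothetical names, not Blueoil API.

def make_quantizer(threshold=0.5):
    """Return a callable that binarizes a value against `threshold`."""
    def quantize(value):
        return 1.0 if value >= threshold else -1.0
    return quantize

class SimpleNetwork:
    def __init__(self, weight_quantizer=None, weight_quantizer_kwargs=None):
        weight_quantizer_kwargs = weight_quantizer_kwargs or {}
        assert callable(weight_quantizer)
        # The quantizer instance is built once and reused for every kernel.
        self.weight_quantization = weight_quantizer(**weight_quantizer_kwargs)

net = SimpleNetwork(weight_quantizer=make_quantizer,
                    weight_quantizer_kwargs={"threshold": 0.0})
print(net.weight_quantization(0.3))   # 0.3 >= 0.0, so 1.0
```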

    def __init__(
            self,
            quantize_first_convolution=True,
            quantize_last_convolution=True,
            weight_quantizer=None,
            weight_quantizer_kwargs={},
            *args,
            **kwargs
    ):
        """
        Args:
            quantize_first_convolution(bool): use quantization in first conv.
            quantize_last_convolution(bool): use quantization in last conv.
            weight_quantizer (callable): weight quantizer.
            weight_quantizer_kwargs(dict): Initialize kwargs for weight quantizer.
            activation_quantizer_kwargs(dict): Initialize kwargs for activation quantizer.
        """

        super().__init__(*args, **kwargs)

        self.quantize_first_convolution = quantize_first_convolution
        self.quantize_last_convolution = quantize_last_convolution

        assert callable(weight_quantizer)
        self.weight_quantization = weight_quantizer(**weight_quantizer_kwargs)

6.1.1. Tensorflow custom getter

The main idea of selecting which variables to weight-quantize in Blueoil is the custom_getter of variable_scope, namely _quantized_variable_getter.

    @staticmethod
    def _quantized_variable_getter(
            weight_quantization,
            quantize_first_convolution,
            quantize_last_convolution,
            getter,
            name,
            *args,
            **kwargs):
        """Get the quantized variables.

        Use an if statement to choose whether the target variable should be quantized or skipped.

        Args:
            weight_quantization: Callable object which quantize variable.
            quantize_first_convolution(bool): Use quantization in first conv.
            quantize_last_convolution(bool): Use quantization in last conv.
            getter: Default from tensorflow.
            name: Default from tensorflow.
            args: Args.
            kwargs: Kwargs.
        """
        assert callable(weight_quantization)
        var = getter(name, *args, **kwargs)
        with tf.compat.v1.variable_scope(name):
            if "kernel" == var.op.name.split("/")[-1]:

                if not quantize_first_convolution:
                    if var.op.name.startswith("block_1/"):
                        return var

                if not quantize_last_convolution:
                    if var.op.name.startswith("block_last/"):
                        return var

                # Apply weight quantize to variable whose last word of name is "kernel".
                quantized_kernel = weight_quantization(var)
                tf.compat.v1.summary.histogram("quantized_kernel", quantized_kernel)
                return quantized_kernel

        return var

    def base(self, images, is_training):
        custom_getter = partial(
            self._quantized_variable_getter,
            self.weight_quantization,
            self.quantize_first_convolution,
            self.quantize_last_convolution,
        )
        with tf.compat.v1.variable_scope("", custom_getter=custom_getter):
            return super().base(images, is_training)

The selection criteria are based on these three variables:

  • name: the name of the variable the getter is asked to create; it is also passed to tf.compat.v1.variable_scope(name) and identifies the layer and op this variable belongs to.

  • quantize_first_convolution: a boolean indicating whether to quantize the first convolution layer.

  • quantize_last_convolution: a boolean indicating whether to quantize the last convolution layer.

A variable whose name ends with kernel is weight-quantized, unless it belongs to the first or last layer and quantize_first_convolution or quantize_last_convolution, respectively, is set to False.
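This selection rule can be reproduced as a small predicate, independent of TensorFlow. The function below mirrors the branching on the variable's op name in _quantized_variable_getter; the block names follow this section's example network:

```python
# Sketch of the selection rule: only "kernel" variables are quantized,
# and the first/last blocks can be exempted by their flags.

def should_quantize(var_name,
                    quantize_first_convolution=True,
                    quantize_last_convolution=True):
    """Return True when the variable named `var_name` should be quantized."""
    # Only variables whose last name component is "kernel" are candidates.
    if var_name.split("/")[-1] != "kernel":
        return False
    if not quantize_first_convolution and var_name.startswith("block_1/"):
        return False
    if not quantize_last_convolution and var_name.startswith("block_last/"):
        return False
    return True

print(should_quantize("block_1/conv/kernel", quantize_first_convolution=False))  # False
print(should_quantize("block_2/conv/kernel"))                                    # True
print(should_quantize("block_2/conv/bias"))                                      # False
```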

6.1.2. Weight quantizer

The available weight quantizers are the Binary channel wise mean scaling quantizer and the Binary mean scaling quantizer:

6.1.2.1. Binary channel wise mean scaling quantizer (BinaryChannelWiseMeanScalingQuantizer)

This quantization creates a binary channel wise mean scaling quantizer. If backward is provided, it is used in backpropagation.

This method is a variant of XNOR-Net 1 weight quantization; the difference from XNOR-Net is the backward function.

Forward is:

\[\begin{split}\begin{align} \bar{\mathbf{x}} & = \frac{1}{n}||\mathbf{X}||_{\ell1} & \text{$\bar{\mathbf{x}}$ is a $c$-channels vector} \\ & & \text{$n$ is number of elements in each channel of $\mathbf{X}$} \\\\ \mathbf{Y} & = \text{sign}\big(\mathbf{X}\big) \times \bar{\mathbf{x}} &\\ \end{align}\end{split}\]

Default backward is:

\[\frac{\partial Loss}{\partial \mathbf{X}} = \frac{\partial Loss}{\partial \mathbf{Y}}\]
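The forward pass above can be reproduced with NumPy. Treating the last axis of the tensor as the channel axis is an assumption of this sketch (for HWIO-layout kernels the output channel is last); the default backward is the identity (straight-through), so only the forward is shown:

```python
import numpy as np

def binary_channel_wise_mean_scaling(x):
    """Forward pass: sign(x) scaled by the per-channel mean of |x|.

    Assumes the last axis of `x` indexes channels, so the mean is taken
    over all remaining axes (n elements per channel).
    """
    axes = tuple(range(x.ndim - 1))          # every axis except the channel axis
    scale = np.mean(np.abs(x), axis=axes)    # c-channel vector of mean |x|
    return np.sign(x) * scale                # broadcast the scale over channels

kernel = np.array([[1.0, -2.0],
                   [-3.0, 4.0]])             # 2 channels, 2 elements each
print(binary_channel_wise_mean_scaling(kernel))
# channel means of |x| are [2.0, 3.0], so the result is [[2., -3.], [-2., 3.]]
```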

6.1.2.2. Binary mean scaling quantizer (BinaryMeanScalingQuantizer)

This quantization creates a binary mean scaling quantizer. If backward is provided, it is used in backpropagation.

This method is DoReFa-Net 2 weight quantization.

Forward is:

\[\begin{split}\begin{align} \bar{x} & = \frac{1}{N}||\mathbf{X}||_{\ell1} & \text{$\bar{x}$ is a scalar} \\ & & \text{$N$ is number of elements in all channels of $\mathbf{X}$}\\ \mathbf{Y} & = \text{sign}\big(\mathbf{X}\big) \cdot \bar{x} &\\ \end{align}\end{split}\]

Default backward is:

\[\frac{\partial Loss}{\partial \mathbf{X}} = \frac{\partial Loss}{\partial \mathbf{Y}}\]
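In contrast to the channel-wise variant, this forward pass scales by a single scalar, the mean of |x| over all N elements. A NumPy sketch of the forward only (the default backward is again the identity):

```python
import numpy as np

def binary_mean_scaling(x):
    """Forward pass: sign(x) scaled by the mean of |x| over all N elements."""
    scale = np.mean(np.abs(x))   # one scalar for the whole tensor
    return np.sign(x) * scale

kernel = np.array([[1.0, -2.0],
                   [-3.0, 4.0]])
print(binary_mean_scaling(kernel))
# mean of |x| is (1+2+3+4)/4 = 2.5, so the result is [[2.5, -2.5], [-2.5, 2.5]]
```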

Reference