6.1. Weight quantization

Blueoil can quantize the weights in a network by passing a callable weight quantizer and its keyword arguments weight_quantizer_kwargs to the network class.

class SampleNetworkQuantize(SampleNetwork):
    """Quantize Sample Network."""

    def __init__(
            self,
            quantize_first_convolution=True,
            quantize_last_convolution=True,
            activation_quantizer=None,
            activation_quantizer_kwargs={},
            weight_quantizer=None,
            weight_quantizer_kwargs={},
            *args,
            **kwargs
    ):
        super().__init__(*args, **kwargs)

        self.quantize_first_convolution = quantize_first_convolution
        self.quantize_last_convolution = quantize_last_convolution

        assert callable(weight_quantizer)
        assert callable(activation_quantizer)

        self.weight_quantization = weight_quantizer(**weight_quantizer_kwargs)
        self.activation = activation_quantizer(**activation_quantizer_kwargs)
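
For illustration, here is a minimal sketch of instantiating such a network. The toy_binary_quantizer factory below is hypothetical and not part of Blueoil; the real weight quantizers described later in this section follow the same calling convention, namely a callable that is instantiated with its keyword arguments and returns a tensor-to-tensor quantization function.

import tensorflow as tf

def toy_binary_quantizer():
    """Hypothetical quantizer factory: returns a function mapping a tensor to {-1, +1}."""
    def quantize(x):
        return tf.sign(x)
    return quantize

network = SampleNetworkQuantize(
    weight_quantizer=toy_binary_quantizer,      # callable, instantiated with weight_quantizer_kwargs
    weight_quantizer_kwargs={},                 # keyword arguments forwarded to the quantizer
    activation_quantizer=toy_binary_quantizer,  # the constructor also requires a callable here
    activation_quantizer_kwargs={},
)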

6.1.1. TensorFlow custom getter

Blueoil selects which variables to weight-quantize by means of a custom_getter passed to tf.compat.v1.variable_scope, namely _quantized_variable_getter.

    @staticmethod
    def _quantized_variable_getter(
            weight_quantization,
            quantize_first_convolution,
            quantize_last_convolution,
            getter,
            name,
            *args,
            **kwargs):
        assert callable(weight_quantization)
        var = getter(name, *args, **kwargs)
        with tf.compat.v1.variable_scope(name):
            if "kernel" == var.op.name.split("/")[-1]:

                if not quantize_first_convolution:
                    if var.op.name.startswith("block_1/"):
                        return var

                if not quantize_last_convolution:
                    if var.op.name.startswith("block_last/"):
                        return var

                # Apply weight quantization to variables whose names end with "kernel".
                quantized_kernel = weight_quantization(var)
                tf.compat.v1.summary.histogram("quantized_kernel", quantized_kernel)
                return quantized_kernel

        return var

    def base(self, images, is_training):
        custom_getter = partial(
            self._quantized_variable_getter,
            self.weight_quantization,
            self.quantize_first_convolution,
            self.quantize_last_convolution,
        )
        with tf.compat.v1.variable_scope("", custom_getter=custom_getter):
            return super().base(images, is_training)

The selection is based on these three variables:

  • name: the variable name passed to the getter and to tf.compat.v1.variable_scope(name); it indicates which op in the layer the variable belongs to.

  • quantize_first_convolution: boolean indicating whether to quantize the first convolution layer

  • quantize_last_convolution: boolean indicating whether to quantize the last convolution layer

A variable whose scope name ends with kernel is weight quantized, unless it belongs to the first or last convolution layer and quantize_first_convolution or quantize_last_convolution, respectively, is set to False.
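
The following self-contained sketch (not Blueoil's code) illustrates the underlying TensorFlow mechanism: a custom getter attached to a variable scope intercepts every variable creation, and only variables whose names end with kernel are replaced by their quantized counterparts before the layer uses them.

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

def binarize(x):
    return tf.sign(x)  # stand-in for a real weight quantizer

def quantize_kernels_getter(getter, name, *args, **kwargs):
    var = getter(name, *args, **kwargs)          # create or fetch the underlying variable
    if var.op.name.split("/")[-1] == "kernel":   # quantize convolution/dense kernels only
        return binarize(var)
    return var                                   # biases etc. stay in floating point

with tf.compat.v1.variable_scope("block_1", custom_getter=quantize_kernels_getter):
    kernel = tf.compat.v1.get_variable("kernel", shape=[3, 3, 3, 8])  # received as quantized tensor
    bias = tf.compat.v1.get_variable("bias", shape=[8])               # received as plain variable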

6.1.2. Weight quantizer

The available weight quantizers are the Binary channel wise mean scaling quantizer and the Binary mean scaling quantizer:

6.1.2.1. Binary channel wise mean scaling quantizer (BinaryChannelWiseMeanScalingQuantizer)

This quantization creates a binary channel wise mean scaling quantizer. If a backward function is provided, it is used in backpropagation in place of the default.

This method is a variant of the XNOR-Net 1 weight quantization; the difference from XNOR-Net is the backward function.

Forward is:

\[\begin{split}\begin{align} \bar{\mathbf{x}} & = \frac{1}{n}||\mathbf{X}||_{\ell1} & \text{$\bar{\mathbf{x}}$ is a $c$-channels vector} \\ & & \text{$n$ is number of elements in each channel of $\mathbf{X}$} \\\\ \mathbf{Y} & = \text{sign}\big(\mathbf{X}\big) \times \bar{\mathbf{x}} &\\ \end{align}\end{split}\]

Default backward is:

\[\frac{\partial Loss}{\partial \mathbf{X}} = \frac{\partial Loss}{\partial \mathbf{Y}}\]
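
As an illustration (not the library implementation), the forward pass can be sketched as follows, assuming the weight tensor has shape [height, width, input_channels, output_channels] so that the scaling vector is computed per output channel; the default backward is a straight-through estimator that passes the gradient unchanged, as the formula above states.

import tensorflow as tf

def binary_channel_wise_mean_scaling_forward(x):
    """x: weight tensor of shape [kh, kw, in_ch, out_ch]."""
    # Per-output-channel mean of absolute values (the c-channels vector x_bar).
    scale = tf.reduce_mean(tf.abs(x), axis=[0, 1, 2], keepdims=True)
    # Binarize the sign and rescale each channel by its own mean.
    return tf.sign(x) * scale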

6.1.2.2. Binary mean scaling quantizer (BinaryMeanScalingQuantizer)

This quantization creates a binary mean scaling quantizer. If a backward function is provided, it is used in backpropagation in place of the default.

This method is the DoReFa-Net 2 weight quantization.

Forward is:

\[\begin{split}\begin{align} \bar{x} & = \frac{1}{N}||\mathbf{X}||_{\ell1} & \text{$\bar{x}$ is a scalar} \\ & & \text{$N$ is number of elements in all channels of $\mathbf{X}$}\\ \mathbf{Y} & = \text{sign}\big(\mathbf{X}\big) \cdot \bar{x} &\\ \end{align}\end{split}\]

Default backward is:

\[\frac{\partial Loss}{\partial \mathbf{X}} = \frac{\partial Loss}{\partial \mathbf{Y}}\]
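
For comparison, a sketch of this forward pass (again an illustration only), where the scaling factor is a single scalar computed over all elements of the tensor:

import tensorflow as tf

def binary_mean_scaling_forward(x):
    # Scalar mean of absolute values over every element of x (x_bar).
    scale = tf.reduce_mean(tf.abs(x))
    # Binarize and rescale by the single scalar.
    return tf.sign(x) * scale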

Reference

  1. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, https://arxiv.org/abs/1603.05279

  2. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, https://arxiv.org/abs/1606.06160