
Developing a Python Program Using Inspection Tools

Originally posted on machinelearningmastery.

Python is an interpreted language. This means an interpreter runs the program, rather than the code being compiled ahead of time and run natively. In Python, a REPL (read-eval-print loop) can run commands line by line. Together with the inspection tools that Python provides, it helps you develop code.

In the following, you will see how to make use of the Python interpreter to inspect an object and develop a program.

After finishing this tutorial, you will know:

  • How to work in the Python interpreter
  • How to use the inspection functions in Python
  • How to develop a solution step by step with the help of inspection functions

Let’s get started!

Tutorial Overview

This tutorial is in four parts; they are:

  • PyTorch and TensorFlow
  • Looking for Clues
  • Learning from the Weights
  • Making a Copier

PyTorch and TensorFlow

PyTorch and TensorFlow are the two biggest neural network libraries in Python. Their code is different, but the things they can do are similar.

Consider the classic MNIST handwritten digit recognition problem; you can build a LeNet-5 model to classify the digits as follows:

This is simplified code that skips validation and testing. The counterpart in TensorFlow is the following:

Running this program would give you the file lenet5.pt from the PyTorch code and lenet5.h5 from the TensorFlow code.

Looking for Clues

If you understand what the above neural networks are doing, you should be able to tell that each layer is nothing but many multiply and add calculations. Mathematically, each fully-connected layer multiplies the input by its kernel matrix and then adds the bias to the result. In a convolutional layer, the kernel is multiplied element-wise with a patch of the input matrix, and the products are summed and added to the bias to give one output element of the feature map.
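As a minimal sketch of the arithmetic described above, the following pure-Python code computes one fully-connected layer and one convolution output element. The function names dense and conv_element, the toy values, and the (inputs, outputs) kernel layout are assumptions made for illustration; neither framework implements its layers this way.

```python
def dense(x, kernel, bias):
    """Fully-connected layer: y[j] = sum_i x[i] * kernel[i][j] + bias[j]."""
    return [
        sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def conv_element(patch, kernel, bias):
    """One feature-map element: element-wise product of patch and kernel, summed, plus bias."""
    total = 0.0
    for patch_row, kernel_row in zip(patch, kernel):
        for p, k in zip(patch_row, kernel_row):
            total += p * k
    return total + bias


# Toy fully-connected layer: 2 inputs, 3 outputs
x = [1.0, 2.0]
fc_kernel = [[1.0, 0.0, -1.0],
             [0.5, 1.0, 2.0]]   # assumed layout: (inputs, outputs)
fc_bias = [0.5, 0.25, 1.0]
dense_out = dense(x, fc_kernel, fc_bias)

# Toy convolution: one 2x2 patch of the input against a 2x2 kernel
patch = [[1.0, 2.0],
         [3.0, 4.0]]
conv_kernel = [[1.0, 0.0],
               [0.0, 1.0]]
conv_out = conv_element(patch, conv_kernel, 0.5)

print(dense_out)
print(conv_out)
```

Because each output depends only on the inputs, the kernel, and the bias, two models with the same architecture and the same numbers in these arrays should produce the same outputs.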

If you build the same LeNet-5 model in two different frameworks, it should be possible to make them work identically when their weights are the same. How can you copy the weights from one model to the other, given that their architectures are identical?

You can load the saved models as follows:

This probably does not tell you much. But if you run python in the command line without any arguments, you launch the REPL, in which you can type in the above code (you can leave the REPL with quit()):

Nothing is printed by the above. But you can check the two loaded models using the built-in type() function:

So here you know they are neural network models from PyTorch and Keras, respectively. Since they are trained models, the weights must be stored inside. So how can you find the weights in these models? Since they are objects, the easiest way is to use the built-in dir() function to inspect their members:
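To illustrate without loading either framework, here is a small sketch using a hypothetical stand-in class, TinyModel, which is not part of PyTorch or Keras. It shows what type() and dir() report and how the underscore convention separates internal names from public ones.

```python
class TinyModel:
    """Hypothetical stand-in for a trained model; not a PyTorch or Keras class."""

    def __init__(self):
        self.weights = [[0.1, 0.2], [0.3, 0.4]]
        self._cache = None  # leading underscore: internal by convention

    def get_weights(self):
        return self.weights


model = TinyModel()

# type() tells you which class the object was built from
print(type(model))

# dir() returns a sorted list of attribute and method names as strings
members = dir(model)
print(len(members), "members in total")

# Keep only the names that do not start with an underscore
public = [name for name in members if not name.startswith("_")]
print(public)
```

The same two calls work on any Python object, including the loaded models, where the list of members is much longer.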

There are a lot of members in each object. Some are attributes, and some are methods of the class. By convention, those that begin with an underscore are internal members that you are not supposed to access in normal circumstances. If you want to see more of each member, you can use the getmembers() function from the inspect module:

The output of the getmembers() function is a list of tuples, in which each tuple holds the name of the member and the member itself. From the above, for example, you know that __call__ is a “bound method,” i.e., a method of the class that is bound to this particular instance.
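The same pattern can be tried on a small object. The sketch below uses a hypothetical TinyModel class (an assumption for illustration, not a framework class) to show the (name, member) tuples that inspect.getmembers() returns and how a predicate such as inspect.ismethod narrows the list to bound methods.

```python
import inspect


class TinyModel:
    """Hypothetical stand-in for a trained model; not a PyTorch or Keras class."""

    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]

    def __call__(self, x):
        # Weighted sum of the inputs, just so the object is callable
        return sum(w * v for w, v in zip(self.weights, x))

    def get_weights(self):
        return list(self.weights)


model = TinyModel()

# getmembers() returns a list of (name, value) tuples, sorted by name
members = inspect.getmembers(model)
for name, value in members:
    if not name.startswith("__") or name == "__call__":
        print(name, "->", type(value).__name__)

# An optional predicate keeps only matching members, here bound methods
method_names = [name for name, _ in inspect.getmembers(model, inspect.ismethod)]
print(method_names)
```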

By carefully looking at the members’ names, you can see that in the PyTorch model, the members with “state” in their names should interest you, while the Keras model has some members with “weights” in their names. To shortlist them, you can do the following in the interpreter:

This might take some trial and error. But it’s not too difficult, and you may discover that you can see the weights with state_dict() in the torch model:
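One way to shortlist names is to filter the output of dir() by a keyword. The sketch below assumes two hypothetical stand-in classes, TorchLike and KerasLike, that only mimic a few member names of the real models; the shortlist() helper is likewise an illustration, not a library function.

```python
class TorchLike:
    """Hypothetical stand-in that mimics a few member names of a PyTorch model."""

    def state_dict(self):
        return {}

    def load_state_dict(self, state):
        pass

    def forward(self, x):
        return x


class KerasLike:
    """Hypothetical stand-in that mimics a few member names of a Keras model."""

    def __init__(self):
        self.weights = []
        self.trainable_weights = []

    def get_weights(self):
        return []

    def set_weights(self, weights):
        pass


def shortlist(obj, keyword):
    """Return the member names of obj that contain keyword."""
    return [name for name in dir(obj) if keyword in name]


torch_names = shortlist(TorchLike(), "state")
keras_names = shortlist(KerasLike(), "weight")
print(torch_names)
print(keras_names)
```

With the real models, the equivalent is the same list comprehension over dir() of each loaded model object.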

For the TensorFlow/Keras model, you can find the weights with get_weights():

The weights are also available through the weights attribute:

Here, you can observe the following: In the PyTorch model, the function state_dict() gives an OrderedDict, which is a dictionary that keeps its keys in a specified order. There are keys such as 0.weight, and they are mapped to tensor values. In the Keras model, the get_weights() function returns a list. Each element in the list is a NumPy array. The weights attribute also holds a list, but its elements are of type tf.Variable.

You can learn more by checking the shape of each tensor or array:
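The idea can be sketched with plain Python containers standing in for the two return types: an OrderedDict keyed by names such as 0.weight, and a plain list of arrays. The helpers zeros() and shape_of() are hypothetical, and the shapes assume a first convolutional layer with 6 filters of size 5×5 over 1 input channel, stored as (out, in, height, width) on the PyTorch side and (height, width, in, out) on the Keras side.

```python
from collections import OrderedDict


def zeros(*dims):
    """Build a nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0.0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def shape_of(nested):
    """Return the shape of a non-empty rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Stand-in for what state_dict() returns: ordered name -> tensor mapping
torch_like_state = OrderedDict([
    ("0.weight", zeros(6, 1, 5, 5)),   # assumed layout: (out, in, height, width)
    ("0.bias", zeros(6)),
])

# Stand-in for what get_weights() returns: a plain list of arrays
keras_like_weights = [
    zeros(5, 5, 1, 6),                 # assumed layout: (height, width, in, out)
    zeros(6),
]

torch_shapes = {key: shape_of(value) for key, value in torch_like_state.items()}
keras_shapes = [shape_of(w) for w in keras_like_weights]

for key, shape in torch_shapes.items():
    print(key, shape)
for shape in keras_shapes:
    print(shape)
```

With real tensors and arrays, the shape attribute of each element gives the same information directly. Comparing the two lists side by side shows that the same numbers may be arranged along different axes in each framework.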

While you do not see the names of the layers from the Keras model above, you can in fact use similar reasoning to find the layers and get their names:

