Keras intermediate layer (attention model) output

Question

I have a model with this summary:

___________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 30, 37)        0                                            
____________________________________________________________________________________________________
s0 (InputLayer)                  (None, 128)           0                                            
____________________________________________________________________________________________________
bidirectional_1 (Bidirectional)  (None, 30, 128)       52224       input_1[0][0]                    
____________________________________________________________________________________________________
repeat_vector_1 (RepeatVector)   (None, 30, 128)       0           s0[0][0]                         
                                                                   lstm_1[0][0]                     
                                                                   lstm_1[1][0]                     
                                                                   lstm_1[2][0]                     
                                                                   lstm_1[3][0]                     
                                                                   lstm_1[4][0]                     
                                                                   lstm_1[5][0]                     
                                                                   lstm_1[6][0]                     
                                                                   lstm_1[7][0]                     
                                                                   lstm_1[8][0]                     
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 30, 256)       0           bidirectional_1[0][0]            
                                                                   repeat_vector_1[0][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[1][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[2][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[3][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[4][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[5][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[6][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[7][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[8][0]            
                                                                   bidirectional_1[0][0]            
                                                                   repeat_vector_1[9][0]            
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 30, 1)         257         concatenate_1[0][0]              
                                                                   concatenate_1[1][0]              
                                                                   concatenate_1[2][0]              
                                                                   concatenate_1[3][0]              
                                                                   concatenate_1[4][0]              
                                                                   concatenate_1[5][0]              
                                                                   concatenate_1[6][0]              
                                                                   concatenate_1[7][0]              
                                                                   concatenate_1[8][0]              
                                                                   concatenate_1[9][0]              
____________________________________________________________________________________________________
attention_weights (Activation)   (None, 30, 1)         0           dense_1[0][0]                    
                                                                   dense_1[1][0]                    
                                                                   dense_1[2][0]                    
                                                                   dense_1[3][0]                    
                                                                   dense_1[4][0]                    
                                                                   dense_1[5][0]                    
                                                                   dense_1[6][0]                    
                                                                   dense_1[7][0]                    
                                                                   dense_1[8][0]                    
                                                                   dense_1[9][0]                    
____________________________________________________________________________________________________
dot_1 (Dot)                      (None, 1, 128)        0           attention_weights[0][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[1][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[2][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[3][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[4][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[5][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[6][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[7][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[8][0]          
                                                                   bidirectional_1[0][0]            
                                                                   attention_weights[9][0]          
                                                                   bidirectional_1[0][0]            
____________________________________________________________________________________________________
c0 (InputLayer)                  (None, 128)           0                                            
____________________________________________________________________________________________________
lstm_1 (LSTM)                    [(None, 128), (None,  131584      dot_1[0][0]                      
                                                                   s0[0][0]                         
                                                                   c0[0][0]                         
                                                                   dot_1[1][0]                      
                                                                   lstm_1[0][0]                     
                                                                   lstm_1[0][2]                     
                                                                   dot_1[2][0]                      
                                                                   lstm_1[1][0]                     
                                                                   lstm_1[1][2]                     
                                                                   dot_1[3][0]                      
                                                                   lstm_1[2][0]                     
                                                                   lstm_1[2][2]                     
                                                                   dot_1[4][0]                      
                                                                   lstm_1[3][0]                     
                                                                   lstm_1[3][2]                     
                                                                   dot_1[5][0]                      
                                                                   lstm_1[4][0]                     
                                                                   lstm_1[4][2]                     
                                                                   dot_1[6][0]                      
                                                                   lstm_1[5][0]                     
                                                                   lstm_1[5][2]                     
                                                                   dot_1[7][0]                      
                                                                   lstm_1[6][0]                     
                                                                   lstm_1[6][2]                     
                                                                   dot_1[8][0]                      
                                                                   lstm_1[7][0]                     
                                                                   lstm_1[7][2]                     
                                                                   dot_1[9][0]                      
                                                                   lstm_1[8][0]                     
                                                                   lstm_1[8][2]                     
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 11)            1419        lstm_1[0][0]                     
                                                                   lstm_1[1][0]                     
                                                                   lstm_1[2][0]                     
                                                                   lstm_1[3][0]                     
                                                                   lstm_1[4][0]                     
                                                                   lstm_1[5][0]                     
                                                                   lstm_1[6][0]                     
                                                                   lstm_1[7][0]                     
                                                                   lstm_1[8][0]                     
                                                                   lstm_1[9][0]                     
====================================================================================================
Total params: 185,484
Trainable params: 185,484
Non-trainable params: 0
____________________________________________________________________________________________________

The model is further summarised as:

And the "attention" block summarised as:

The input is a fuzzy date, e.g. "November 17, 1979" (capped at 30 characters) and the output is the 10 character representation "YYYY-mm-dd".

I would like to plot the values of the attention_weights layer.

I would like to see which part of "Saturday, 17th November, 1979" the network "looks at" when it predicts each of YYYY, mm, and dd. I'm expecting to see it ignores the day ("Saturday") completely.

I've tried following the Keras documentation for obtaining the output of an intermediate layer.

However, the attention node has 10 inputs, so I have to grab each of those:

f = K.function(model.inputs, [model.get_layer('attention_weights').get_output_at(t) for t in range(10)])
r = f([source, np.zeros((1,128)), np.zeros((1,128))])

With source e.g. "17 November 1979" encoded as

[[[ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]
  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
    1.]]]

r is then a matrix of shape (10,1,30,1) and the attention map I'm plotting it thus:

attention_map = np.zeros((10, 30))
for t in range(10):
    for t_prime in range(30):
        attention_map[t][t_prime] = r[t][0,t_prime,0]

...but all the values are the same! I'm expecting some variation.

I've also tried adding K.learning_phase() to no avail. What am I doing wrong?

opyate · Answer

The problem was that I tried to plot the attention map of a model which was loaded from a saved model.

The output when the model was saved was:

/home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [, ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [, ]}. They will not be included in the
  serialized model (and thus will be missing at deserialization time).
  str(node.arguments) + '. They will not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '
  /home/opyate/anaconda3/lib/python3.6/site-packages/keras/engine/topology.py:2361:
  UserWarning: Layer lstm_1 was passed non-serializable keyword
  arguments: {'initial_state': [,
  ]}.
  They will not be included in the serialized model (and thus will be
  missing at deserialization time).   str(node.arguments) + '. They will
  not be included '

However, if I construct the model from code, and just load the saved weights, it works.

The assumption is that the UserWarnings when saving the model has something to do with my problem.

Keras intermediate layer (attention model) output

One Answer

Add your own answers!

Ask a Question