How to merge two CNN deep learning models using weighted sum and weighted product in Keras?

Data Science Asked on September 30, 2021

I am using Keras to create a deep learning model and I want to merge two CNNs by using weighted sum or weighted product.

How can I merge two CNNs using weighted sum and weighted product?

2 Answers

I think the most elegant way is to write a layer that does this. For example for the case of the weighted sum:

from keras.layers import Layer

class WeightedSum(Layer):

    def __init__(self, a, **kwargs):
        self.a = a  # "weight" of the weighted sum
        super(WeightedSum, self).__init__(**kwargs)

    def call(self, model_outputs):
        return self.a * model_outputs[0] + (1 - self.a) * model_outputs[1]

    def compute_output_shape(self, input_shape):
        return input_shape[0]

Suppose you have two models model1 and model2 with outputs out1 and out2 respectively. This layer simply performs the operation:

$$ out = a \cdot out_1 + (1 - a) \cdot out_2 $$

You can compute a weighted product the same way, just change the call method.
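For instance, here is a minimal sketch of such a layer, assuming the weighted product is meant element-wise as out_1^a * out_2^(1-a); adjust the call method if you have a different definition in mind:

class WeightedProduct(Layer):

    def __init__(self, a, **kwargs):
        self.a = a  # "weight" of the weighted product
        super(WeightedProduct, self).__init__(**kwargs)

    def call(self, model_outputs):
        # element-wise: out1**a * out2**(1 - a)
        return model_outputs[0] ** self.a * model_outputs[1] ** (1 - self.a)

    def compute_output_shape(self, input_shape):
        return input_shape[0]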


Example

from keras.layers import Layer, Input, Dense
from keras.models import Model
import keras.backend as K
import tensorflow as tf

# Define the custom layer
class WeightedSum(Layer):
    def __init__(self, a, **kwargs):
        self.a = a
        super(WeightedSum, self).__init__(**kwargs)
    def call(self, model_outputs):
        return self.a * model_outputs[0] + (1 - self.a) * model_outputs[1]
    def compute_output_shape(self, input_shape):
        return input_shape[0]

# Create model1
inp1 = Input((5,))
d1 = Dense(100)(inp1)
out1 = Dense(10)(d1)
model1 = Model(inp1, out1)

# Create model2
inp2 = Input((7,))
d2 = Dense(70)(inp2)
out2 = Dense(10)(d2)
model2 = Model(inp2, out2)

# Weighted sum of the two models' outputs with a = 0.1
out = WeightedSum(0.1)([model1.output, model2.output])

# Create the merged model
model = Model(inputs=[inp1, inp2], outputs=[out])

Let's check the summary:

>>> model.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_5 (InputLayer)            (None, 5)            0
__________________________________________________________________________________________________
input_6 (InputLayer)            (None, 7)            0
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 100)          600         input_5[0][0]
__________________________________________________________________________________________________
dense_11 (Dense)                (None, 70)           560         input_6[0][0]
__________________________________________________________________________________________________
dense_10 (Dense)                (None, 10)           1010        dense_9[0][0]
__________________________________________________________________________________________________
dense_12 (Dense)                (None, 10)           710         dense_11[0][0]
__________________________________________________________________________________________________
weighted_sum_10 (WeightedSum)   (None, 10)           0           dense_10[0][0]
                                                                 dense_12[0][0]
==================================================================================================
Total params: 2,880
Trainable params: 2,880
Non-trainable params: 0
__________________________________________________________________________________________________

Let's see if it works:

import numpy as np

a = np.random.random(size=(32, 5))  # input for model1 (batch size 32)
b = np.random.random(size=(32, 7))  # input for model2 (batch size 32)

pred = model.predict([a, b])

Let's see if it has the right shape:

>>> pred.shape
(32, 10)

Let's check that the values are correct:

# Generate model outputs manually:
o1 = model1.predict(a)  # model1's output for array a
o2 = model2.predict(b)  # model2's output for array b

# Compute their weighted sum manually:
o = 0.1 * o1 + 0.9 * o2

Now if we're correct, o should be equal to pred:

>>> np.array_equal(o, pred)
True

Correct answer by Djib2011 on September 30, 2021

Following up on my comment, since I think it will be useful to anyone coming here: "a" can be a trainable weight in tf.keras.

from tensorflow.keras import layers

class WeightedSum(layers.Layer):
    """A custom keras layer to learn a weighted sum of tensors"""

    def __init__(self, **kwargs):
        super(WeightedSum, self).__init__(**kwargs)

    def build(self, input_shape):
        self.a = self.add_weight(
            name='alpha',
            shape=(),
            initializer='ones',
            dtype='float32',
            trainable=True,
        )
        super(WeightedSum, self).build(input_shape)

    def call(self, model_outputs):
        return self.a * model_outputs[0] + (1 - self.a) * model_outputs[1]

    def compute_output_shape(self, input_shape):
        return input_shape[0]

It may also be advisable to constrain alpha to be bounded in [0, 1]:

constraint=tf.keras.constraints.min_max_norm(max_value=1, min_value=0)
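For example, the constraint can be passed straight to add_weight inside build (a sketch, assuming import tensorflow as tf and the same class as above):

    def build(self, input_shape):
        self.a = self.add_weight(
            name='alpha',
            shape=(),
            initializer='ones',
            dtype='float32',
            trainable=True,
            # keeps the learned weight inside [0, 1]
            constraint=tf.keras.constraints.min_max_norm(max_value=1, min_value=0),
        )
        super(WeightedSum, self).build(input_shape)

The layer is then used exactly like the fixed-weight version from the first answer, e.g. out = WeightedSum()([model1.output, model2.output]), and alpha is learned during training.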

Answered by bw4sz on September 30, 2021
