Data Science Asked on April 15, 2021
I'm working on a sequence-to-sequence approach using an LSTM and a VAE with an attention mechanism.
p = np.random.permutation(len(input_data))
input_data = input_data[p]
teacher_data = teacher_data[p]
target_data = target_data[p]
BUFFER_SIZE = len(input_data)
BATCH_SIZE = 64
embedding_dim = 100
units = 256
vocab_in_size = len(input_lang.word2idx)
vocab_out_size = len(target_lang.word2idx)
# Create the Encoder layers first.
encoder_inputs = Input(shape=(None,))
encoder_emb = Embedding(input_dim=vocab_in_size, output_dim=embedding_dim)
encoder_lstm = LSTM(units=units, return_sequences=True, return_state=True)
encoder_outputs, state_h, state_c = encoder_lstm(encoder_emb(encoder_inputs))
encoder_states = [state_h, state_c]
#################### Adding VAE #######################
latent_dim = 256
# Output layers for the mean and log variance
z_mu = Dense(latent_dim)(encoder_outputs)  # replaces h
z_log_var = Dense(latent_dim)(encoder_outputs)
def sampling(args):
    batch_size = 1
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + K.exp(z_log_sigma) * epsilon
z = Lambda(sampling, output_shape=(latent_dim,))([z_mu, z_log_var])
state_h = z
state_c = z
encoder_states = [state_h, state_c]
def vae_loss(y_true, y_pred):
    recon = K.sum(K.binary_crossentropy(y_pred, y_true), axis=-1)
    kl = 0.5 * K.sum(K.exp(z_log_var) + K.square(z_mu) - 1. - z_log_var, axis=-1)
    return recon + kl[:, None]
##########################
# Now create the Decoder layers.
decoder_inputs = Input(shape=(None,))
decoder_emb = Embedding(input_dim=vocab_out_size, output_dim=embedding_dim)
decoder_lstm = LSTM(units=units, return_sequences=True, return_state=True)
decoder_lstm_out, _, _ = decoder_lstm(decoder_emb(decoder_inputs), initial_state=encoder_states)
# Attention layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_lstm_out])
When I execute this code I get this error at the last line:
ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 256)
I tried all the solutions given for this error, but none of them solved my problem. If anyone can help, I'll be so thankful.
The problem is inside the sampling function. I had the same problem and found the answer in the tutorial here.
My original code was:
def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=z_mean.shape)
    return z_mean + K.exp(z_log_sigma) * epsilon
With this sampling method, I got the same error as yours.
The code that fixed the problem is:
import tensorflow as tf
from tensorflow.keras import layers

class Sampling(layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]   # dynamic batch size, known only at runtime
        dim = tf.shape(z_mean)[1]     # latent dimension
        epsilon = tf.keras.backend.random_normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
It uses tf.shape to get the shape parameters as runtime tensors instead of reading the static shape, whose batch dimension is still unknown (the ? in the error), and that fixes the problem.
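For completeness, here is a minimal, self-contained sketch of how that Sampling layer can replace the Lambda-based sampling. The toy Dense encoder, the input length of 10, and the batch of 4 are illustrative choices of mine, not part of the original model, and it assumes TensorFlow 2.x with tf.keras:

import tensorflow as tf
from tensorflow.keras import layers, Input, Model

latent_dim = 256  # same value as in the question

class Sampling(layers.Layer):
    # Reparameterization trick: z = mu + exp(0.5 * log_var) * epsilon
    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]  # batch size resolved at runtime
        dim = tf.shape(z_mean)[1]    # latent dimension
        epsilon = tf.keras.backend.random_normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

# Toy encoder, just to show the wiring; the batch dimension is None here.
enc_in = Input(shape=(10,))
h = layers.Dense(32, activation="relu")(enc_in)
z_mu = layers.Dense(latent_dim)(h)
z_log_var = layers.Dense(latent_dim)(h)
z = Sampling()([z_mu, z_log_var])  # instead of Lambda(sampling, ...)
model = Model(enc_in, z)
print(model(tf.random.normal((4, 10))).shape)  # (4, 256)

Building epsilon from z_mean.shape fails because the batch dimension of a symbolic Keras tensor is None, which is exactly the partially known TensorShape in the error message; tf.shape avoids this by returning the shape as a tensor at run time.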
Answered by Simon Ren on April 15, 2021