Data Science Asked by snowflakebladerunner on June 20, 2021
I came across the following architecture:
Generation of training and testing data
# Some pre-processing of ASCII data
# Essentially, given 512 previous letters (X) predict the next letter (Y)
X = [...] # shape = (100000, 512)
Y = [...] # shape = (100000, 1)
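The pre-processing itself is not shown, so the following is only a minimal sketch of how such (X, Y) pairs could be built with a sliding window. The helper name `make_windows` and the demo string are hypothetical, and a window of 4 is used instead of 512 to keep the example small.

```python
import numpy as np

def make_windows(text, window=512):
    """Build (X, Y): each row of X holds `window` consecutive character
    codes, and the matching row of Y holds the code that follows them."""
    codes = np.frombuffer(text.encode("ascii"), dtype=np.uint8)
    n_samples = len(codes) - window
    if n_samples <= 0:
        raise ValueError("text must be longer than the window")
    # One row per starting position; row i covers codes[i : i + window]
    X = np.stack([codes[i:i + window] for i in range(n_samples)])
    # The target for row i is the character right after its window
    Y = codes[window:].reshape(-1, 1)
    return X, Y

# Tiny demonstration (hypothetical input, window of 4 instead of 512)
X_demo, Y_demo = make_windows("hello world", window=4)
```

With the real data, the same idea would give X of shape (number of samples, 512) and Y of shape (number of samples, 1).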
Model
sequence_length0 = 512
sequence_length1 = 64
# first part of model
input_0 = Input(batch_shape=(batch_size, sequence_length0))
embedding_output_0 = Embedding(..., batch_input_shape=(batch_size, sequence_length0))(input_0)
lstm_0 = Bidirectional(LSTM(units=32, return_sequences=True, return_state=True))(embedding_output_0)
# second part of model
input_1 = Input(batch_shape=(batch_size, sequence_length1))
embedding_output_1 = Embedding(..., batch_input_shape=(batch_size, sequence_length1))(input_1)
lstm_1 = Bidirectional(LSTM(units=32, return_sequences=True, return_state=True))(embedding_output_1)
lstm_cat = Concatenate()([lstm_0, lstm_1])
# some fully connected network
output = Dense(..., activation='softmax')(...)
# The model is constructed as follows. Note that inputs is a list, i.e. inputs[0] goes to
# the first part of the model (lstm_0) and inputs[1] goes to the second part (lstm_1).
model = Model(inputs=[input_0, input_1], outputs=output)
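It may help to check the shapes that reach the concatenation step. The sketch below uses NumPy zero arrays as stand-ins for the per-timestep outputs of the two branches, assuming the default `merge_mode="concat"` of `Bidirectional`, so each branch emits 2 * 32 = 64 features per time step. The value `batch_size = 8` is hypothetical, and the extra state tensors produced by `return_state=True` are ignored here.

```python
import numpy as np

batch_size = 8  # hypothetical value, not given in the post
units = 32

# Stand-ins for the sequence outputs of the two bidirectional branches
seq_0 = np.zeros((batch_size, 512, 2 * units))  # long branch
seq_1 = np.zeros((batch_size, 64, 2 * units))   # short branch

# Joining along the last (feature) axis, which is the default axis of
# Keras' Concatenate, requires equal time dimensions, so it fails here
try:
    np.concatenate([seq_0, seq_1], axis=-1)
    feature_axis_ok = True
except ValueError:
    feature_axis_ok = False

# Joining along the time axis works because the feature sizes match
time_joined = np.concatenate([seq_0, seq_1], axis=1)
```

Under these assumptions the sequence outputs can only be joined along the time axis, giving 512 + 64 = 576 steps. If the final states were concatenated instead, their shapes would match along the feature axis. The post does not say which variant the architecture uses.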
Training loop
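for batch_index in range(...):
    # Take a chunk of X (and the matching chunk of Y) of size batch_size
    start_position = batch_index * batch_size  # assumption: consecutive, non-overlapping batches
    X1 = X[start_position: start_position + batch_size]  # shape = (batch_size, 512)
    Y1 = Y[start_position: start_position + batch_size]  # shape = (batch_size, 1)
    # Take the last 64 columns of every row of X1 and put them in X2
    X2 = X1[:, -64:]  # shape = (batch_size, 64)
    model.train_on_batch([X1, X2], Y1)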
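The batching step above can be sketched without the model to make the relationship between the two inputs explicit. The random toy arrays, `batch_size = 4`, and the helper name `iter_batches` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 128, size=(20, 512))  # toy stand-in for the real X
Y = rng.integers(0, 128, size=(20, 1))    # toy stand-in for the real Y
batch_size = 4  # hypothetical value

def iter_batches(X, Y, batch_size, short_len=64):
    """Yield (X1, X2, Y1), where X2 is the last `short_len` columns of X1."""
    for start in range(0, len(X) - batch_size + 1, batch_size):
        X1 = X[start:start + batch_size]   # (batch_size, 512)
        X2 = X1[:, -short_len:]            # (batch_size, 64), a view of X1
        Y1 = Y[start:start + batch_size]   # (batch_size, 1)
        yield X1, X2, Y1

batches = list(iter_batches(X, Y, batch_size))
X1, X2, Y1 = batches[0]
```

Here `X2` is a NumPy view into `X1`, so it holds no values that are not already in `X1`. Whatever the second branch contributes comes from its separate weights and shorter context, not from additional data.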
What kind of information does this model learn? For example, the first part of the model might specialize in long-term dependencies while the second specializes in short-term ones (this is just my intuition and I may be wrong). Is the second LSTM branch worthless, given that its input is only a slice of the input the first branch already receives?