Data Science Asked by Carlton Banks on April 3, 2021
I am currently trying to recreate the results of this paper, in which they do feature extraction from a spectrogram of log mel-filter energies.
Since the paper doesn't state exactly what kind of features are extracted, I am currently trying to extract features and match them to MFCC features. The paper describes a technique called LWS (limited weight sharing), in which the frequency axis of the spectrogram is divided into sections, and each section does not share its weights with the others.
So I've divided my input image into 13 sections, such that one output feature is extracted from each (6,3,3) input image: 6 is the number of rows, 3 because each column represents the [static, delta, delta_delta] data of the given log mel-filter energy, and the last 3 is the number of color channels.
If I had used 13 filterbanks and made the plot, each (1,3,3) matrix would result in one feature, but that seemed a bit too good to be true, so I decided to use 78 filterbanks and divide them into 13 sections, so that one feature is extracted from each matrix of size (6,3,3).
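To illustrate the splitting (the array below is a placeholder I made up purely for this sketch, assuming the full input is a NumPy array of shape (78, 3, 3)):

import numpy as np

# Placeholder for the full input: 78 log mel-filter energies,
# 3 columns for [static, delta, delta_delta], and 3 color channels.
full_input = np.random.rand(78, 3, 3)

# Split the 78 frequency rows into 13 non-overlapping sections of 6 rows each,
# so every section of shape (6, 3, 3) should yield one feature.
sections = np.split(full_input, 13, axis=0)
print(len(sections), sections[0].shape)  # 13 (6, 3, 3)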
I am training the network with this model structure:
from keras.models import Sequential
from keras.layers import Dense, Flatten, Convolution2D, MaxPooling2D, ZeroPadding2D

def create_model(init_mode='normal', activation_mode='softsign', optimizer_mode='Adamax', activation_mode_conv='softsign'):
    model = Sequential()
    model.add(ZeroPadding2D((6, 4), input_shape=(6, 3, 3)))
    model.add(Convolution2D(32, 3, 3, activation=activation_mode_conv))
    print(model.output_shape)
    model.add(Convolution2D(32, 3, 3, activation=activation_mode_conv))
    print(model.output_shape)
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 1)))
    print(model.output_shape)
    model.add(Convolution2D(64, 3, 3, activation=activation_mode_conv))
    print(model.output_shape)
    model.add(Convolution2D(64, 3, 3, activation=activation_mode_conv))
    print(model.output_shape)
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 1)))
    model.add(Flatten())
    print(model.output_shape)
    model.add(Dense(output_dim=32, input_dim=64, init=init_mode, activation=activation_mode))
    model.add(Dense(output_dim=13, input_dim=50, init=init_mode, activation=activation_mode))
    model.add(Dense(output_dim=1, input_dim=13, init=init_mode, activation=activation_mode))
    model.add(Dense(output_dim=1, init=init_mode, activation=activation_mode))
    # print(model.summary())
    model.compile(loss='mean_squared_error', optimizer=optimizer_mode)
    return model
For some reason this model keeps giving me very bad results.
I keep getting a loss of 216, which is nearly 3 times the data range.
I did a grid search to find out which parameters (activation function, init_mode, epochs and batch_size) would be best; those are the values chosen as defaults in the function above (even though there wasn't much change in the outcome).
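Roughly, the search was wired up like this (this is only a sketch; the data and parameter values below are placeholders, not the actual ones):

from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in older scikit-learn versions
import numpy as np

# Placeholder data with the same shapes as above.
X_train = np.random.rand(100, 6, 3, 3)
y_train = np.random.rand(100)

estimator = KerasRegressor(build_fn=create_model, verbose=0)
param_grid = {
    'init_mode': ['normal', 'uniform'],
    'activation_mode': ['softsign', 'tanh'],
    'optimizer_mode': ['Adamax', 'Adam'],
    'nb_epoch': [50, 100],   # 'epochs' in Keras 2
    'batch_size': [8, 16],
}
grid = GridSearchCV(estimator=estimator, param_grid=param_grid)
grid_result = grid.fit(X_train, y_train)
print(grid_result.best_params_)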
What can I do to get better results?
Is the CNN poorly designed?
There are some suggestions which I think can improve CNN performance:

1. Remove the MaxPooling layers and use padding = 'same' in all the convolution layers, which outputs a convolved image of the same dimensions as the input.
2. Use Relu or LeakyRelu as the activation function.
3. Use the Adam optimizer and tune the learning rate.
4. Change the Dense layers to 1x1 convolutions with no. of filters = no. of units in the dense layer, as suggested by sh37211.

e.g.
from keras.models import Model
from keras.layers import Input, Convolution2D, Flatten, Dropout, Dense
from keras.layers.advanced_activations import LeakyReLU

no_of_filters = [32, 64, 64, 32]
kernel_size = [3, 3, 3, 1]

input = Input(shape=(6, 3, 3), name="input")
layer_output = [input]
# Four convolution blocks with 'same' padding and LeakyReLU; the last one is a 1x1 convolution.
for i in range(4):
    convolution = Convolution2D(no_of_filters[i], kernel_size[i], kernel_size[i], border_mode='same')(layer_output[-1])
    activation = LeakyReLU()(convolution)
    layer_output.append(activation)
flatten = Flatten()(layer_output[-1])
flatten_dropout = Dropout(0.5)(flatten)
fc = Dense(output_dim=13)(flatten_dropout)
activation = LeakyReLU()(fc)
output = Dense(output_dim=1, activation='tanh')(activation)
model = Model(input=input, output=output)
model.compile(loss='mean_squared_error', optimizer='adam')
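To actually tune the learning rate (the third suggestion), pass an Adam instance instead of the string; the value below is only an example and should be tuned against a validation set:

from keras.optimizers import Adam

# Example learning rate; treat this as a hyperparameter to tune.
model.compile(loss='mean_squared_error', optimizer=Adam(lr=1e-4))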
Answered by Bhagyesh Vikani on April 3, 2021
One suggestion I would give is to change the CNN layers. You have several convolution layers, all added sequentially, acting on the same shape and producing the same output shape with the same filter size.
What you can try is to vary the filter size and use multiple filter sizes to catch features of different sizes. For this, try a model like this:
from keras.models import Model
from keras.layers import Input, Convolution1D, MaxPooling1D, Flatten, Dense, merge
from keras.constraints import maxnorm

# input_shape, filter_sizes, no_of_filters and output_shape are placeholders to fill in for your data.
main_input = Input(shape=input_shape, name="main_input")
flattened_outputs = []
for i in filter_sizes:
    # One convolution branch per filter size, each followed by pooling and flattening.
    conv_filter_i = Convolution1D(no_of_filters, i, border_mode='same', activation='relu', W_constraint=maxnorm(3))(main_input)
    pooling_i = MaxPooling1D(pool_length=2)(conv_filter_i)
    flattened_i = Flatten()(pooling_i)
    flattened_outputs.append(flattened_i)
merged_conv_outputs = merge(flattened_outputs, mode="concat")
softmax = Dense(output_shape, activation="softmax")(merged_conv_outputs)
model = Model(input=main_input, output=softmax)
model.compile(loss='mean_squared_error', optimizer='adam')
Note: change the dimensions as necessary.
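For example, the placeholders could be filled in like this (these values are purely illustrative, not taken from the question or the paper):

# Illustrative placeholder values; adapt them to your data.
input_shape = (78, 3)     # e.g. 78 log mel-filter energies x [static, delta, delta_delta]
filter_sizes = [2, 3, 5]  # several kernel widths to capture patterns of different extent
no_of_filters = 32
output_shape = 13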
You could also add a Dropout layer.
Answered by Hima Varsha on April 3, 2021