Data Science Asked by aryan agarwal on February 5, 2021
I have a 3-minute audio clip whose waveform has shape (2880000,). What values should I use for frame_length and frame_step?
spectrogram = tf.signal.stft(waveform, frame_length=?, frame_step=?)
spectrogram = tf.abs(spectrogram)
With frame_length=255 and frame_step=128 I get a spectrogram of shape (22499, 129), but when I try to visualise it with the following code:
def plot_spectrogram(spectrogram, ax):
  # Convert the frequencies to log scale and transpose so that the time is
  # represented in the x-axis (columns).
  log_spec = np.log(spectrogram.T)
  height = log_spec.shape[0]
  X = np.arange(2880000, step=height + 1)
  Y = range(height)
  ax.pcolormesh(X, Y, log_spec)

fig, axes = plt.subplots(2, figsize=(12, 8))
timescale = np.arange(audio.shape[0])
axes[0].plot(timescale, audio.numpy())
axes[0].set_title('Waveform')
axes[0].set_xlim([0, 2880000])
plot_spectrogram(spectrogram.numpy(), axes[1])
axes[1].set_title('Spectrogram')
plt.show()
I get this error: Dimensions of C (129, 22499) are incompatible with X (22154) and/or Y (129); see help(pcolormesh)
So my question is: how should I choose these parameters, and how do I then visualise the spectrogram?
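For reference, the numbers in the error can be reproduced by hand, which points at where the mismatch comes from. The sketch below is only an illustration and not an official answer. It assumes tf.signal.stft is called with its defaults (pad_end=False, fft_length left unset so that it becomes the smallest power of two that is at least frame_length) and a 16 kHz sample rate, inferred from 2880000 samples over 180 seconds. The helper stft_shape and the extra num_samples argument are hypothetical names added here. The 25 ms window with a 10 ms hop is a common convention for speech, not a requirement. The x-axis is built with np.linspace so that it has exactly one entry per frame, instead of np.arange(2880000, step=height + 1), which yields 22154 entries for 22499 columns.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumption: 2,880,000 samples / 180 s


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def stft_shape(num_samples, frame_length, frame_step):
    """Expected (frames, frequency bins) under the assumed stft defaults."""
    num_frames = 1 + (num_samples - frame_length) // frame_step
    fft_length = next_power_of_two(frame_length)
    return num_frames, fft_length // 2 + 1


def plot_spectrogram(spectrogram, ax, num_samples):
    """Draw a log-magnitude spectrogram with time on the x-axis."""
    # A tiny epsilon keeps np.log away from log(0) on silent frames.
    log_spec = np.log(spectrogram.T + np.finfo(float).eps)
    height, width = log_spec.shape
    # One x coordinate per frame, spread over the sample range, so that
    # len(X) equals the number of columns of log_spec.
    X = np.linspace(0, num_samples, num=width, dtype=int)
    Y = np.arange(height)
    ax.pcolormesh(X, Y, log_spec)
    return X, Y


# Window and hop expressed in milliseconds, then converted to samples.
frame_length = SAMPLE_RATE * 25 // 1000  # 25 ms window
frame_step = SAMPLE_RATE * 10 // 1000    # 10 ms hop

print(stft_shape(2880000, 255, 128))                  # shape from the question
print(stft_shape(2880000, frame_length, frame_step))  # shape for 25 ms / 10 ms
```

With this version the call in the script above would become plot_spectrogram(spectrogram.numpy(), axes[1], 2880000). A longer frame_length gives finer frequency resolution at the cost of time resolution, and a smaller frame_step gives more frames, so the right values depend on what the spectrogram is used for.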