Data Science Asked by Shay on December 15, 2020
My problem is of regression type:
How can I estimate a fish's weight from a fixed-length signal (80 data points) of the change in resistance as the fish swims through a gate with electrodes (essentially 4 seconds of the fish passing, logged at 20 Hz)?
It is a spike-shaped signal; its height and width depend on the size of the fish, its speed and proximity to the gate's edges, and probably other factors such as water salinity and temperature.
I have a data set of 15 different weights, each with 20-110 samples, each with 2 spikes for the 2 sets of electrodes I use for measurement (using 2 sets can help determine where the fish is heading).
Here is an example of resistance readout of 340 gram fish experiment:
And here is an example of the extracted spikes from the same 340 gram fish experiment:
As you can see, there is significant variance, which led me to look for a neural-network approach that can take such a signal as input and estimate the fish's weight.
Do you know of such "State of the Art" network that does that?
What would you try?
Maybe different ML technique?
Thanks!
Edit:
The data presented is post-processed; I extract the spikes with the Python code below, so some of the noise is already cleaned.
I'm not sure how to clean it further, since the experimenter didn't record when a fish went through the gate – all we have is the electrode signal to deduce that a fish passed.
# extracting the spikes
import numpy as np

def get_spikes(data_series_elc1, data_series_elc2, signal_meta):
    window_size = int(signal_meta['freq']) * 4  # 4 seconds at the logger frequency
    half_window = window_size // 2
    threshold = np.quantile(data_series_elc1, 0.9)  # 90th-percentile threshold
    spikes = []
    i = 0
    while i < len(data_series_elc1) - half_window:
        if data_series_elc1[i] > threshold:
            # find the next maximum to use as the window centre
            max_indx = np.argmax(data_series_elc1[i:i + window_size])
            spike_list = [[data_series_elc1[i + max_indx - half_window:i + max_indx + half_window]],
                          [data_series_elc2[i + max_indx - half_window:i + max_indx + half_window]]]
            if len(spike_list[0][0]) == window_size:
                spikes.append(spike_list)
            i = i + max_indx + half_window
        else:
            i = i + 1
    print('Number of Spikes: ', len(spikes))
    return spikes
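For reference, the same windowing idea (threshold at the 90th percentile, centre a fixed window on the local maximum) can be exercised on a synthetic single-channel trace. This is a minimal self-contained sketch; the trace and all names here are illustrative, not the experiment's data:

```python
import numpy as np

freq = 20                      # 20 Hz logger, as in the question
window_size = freq * 4         # 4-second window = 80 samples
half_window = window_size // 2

t = np.arange(400)
trace = np.exp(-0.5 * ((t - 200) / 10.0) ** 2)  # one synthetic spike centred at sample 200
threshold = np.quantile(trace, 0.9)             # 90th-percentile threshold

spikes = []
i = 0
while i < len(trace) - half_window:
    if trace[i] > threshold:
        # centre the window on the local maximum
        max_indx = np.argmax(trace[i:i + window_size])
        window = trace[i + max_indx - half_window:i + max_indx + half_window]
        if len(window) == window_size:
            spikes.append(window)
        i = i + max_indx + half_window
    else:
        i = i + 1
```

On this trace the loop extracts exactly one 80-sample window with the spike's peak at its centre.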
Also, I extract features like the maximum, width, integral and a Gaussian fit, but a linear regression model only gets me R² ≈ 0.6, i.e. a mean error of ~103 grams over all fish weights
[100., 144., 200., 275., 339., 340., 370., 390., 400., 404., 480., 500., 526., 700., 740., 800., 840.], which is quite a large error.
A vanilla fully connected neural network gets about the same.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(80,)))
model.add(layers.Dense(40, activation="relu"))
model.add(layers.Dense(10, activation="relu"))
model.add(layers.Dense(1))
So I’m looking to improve these results, any ideas?
One common approach for this type of data is to take the integral of the spike and learn a translation function from that integral to fish weight. Taking the integral reduces the problem to a single number per spike.
You probably do not need a state-of-the-art model. A general linear model would probably pick out the signal.
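As an illustration of that suggestion on synthetic data (the spike shape, weights, and noise level here are all assumptions, not the experiment's): generate spikes whose area scales with weight, take the integral as the single feature, and fit a one-variable linear model as the translation function.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([100., 200., 340., 500., 800.])  # example weights in grams

t = np.linspace(-2, 2, 80)  # 80 samples, as in the question

def synthetic_spike(w):
    # hypothetical spike: amplitude proportional to weight, plus noise
    return w * np.exp(-t ** 2) + rng.normal(0.0, 1.0, t.size)

# single feature: the integral (area under the spike)
integrals = np.array([np.trapz(synthetic_spike(w)) for w in weights])

# the "translation function": a one-variable linear model
slope, intercept = np.polyfit(integrals, weights, 1)
predicted = slope * integrals + intercept
```

With real spikes the relation between area and weight may not be linear, but the same one-feature fit (or a GLM with a suitable link) is a cheap baseline to compare the neural network against.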
Answered by Brian Spiering on December 15, 2020