Fig. 5 DESIGN AND TRAINING OF CAMLSORT

A, flowchart showing how the state-labelled calcium signal dataset was split into training, test and cross-validation subsets using a five-fold 60:20:20 split (white, blue and yellow boxes, respectively). The networks that performed well in the training phase, as assessed by performance on the cross-validation samples, were then challenged with the unseen test data (blue boxes) in each fold. An independent dataset with state switches was also used to further validate the resulting networks.

B, the architecture of the convolutional recurrent neural networks used to solve the classification problem. The network takes as input a time-series trace sampled at 30 Hz (purple) whose duration is a multiple of 10 s (here, 10 s). This trace is first normalized using min-max normalization, after which it is passed through a 1-D convolutional neural network (1D-CNN) with four kernels, each with a kernel size and stride length of 30. The resulting four traces (grey boxes) are down-sampled, local-feature-extracted versions of the original trace. All four outputs are passed to the LSTM module (central black box) for trend identification. The LSTM is bidirectional, processing traces in both the forward (blue arrow) and reverse (red arrow) directions. A single time step is taken from all the CNN outputs (red and blue dotted boxes) and passed to the respective LSTM cell, which processes the input to produce a resultant 'hidden state' output (red and blue filled boxes). This output is passed back into the LSTM cell along with the next sample from the CNN's output traces, until every time step has been processed (here, 10 rounds). Once this is done, the final 'hidden state' from each LSTM cell is retained and concatenated, and a linear layer takes their weighted average. The resulting average is converted to a 'posterior score' by an activation function. These posterior scores represent the likelihood of the cell being in the bursting state at each time step; the likelihood of it being in the tonic state can be calculated from this. The final call is the state with the higher posterior score. The text in italics represents the net effect of each phase of the neural network. The numbers in parentheses indicate the sampling frequency and/or the duration of the resultant vectors at various stages of processing by the neural network. Similarly, the numbers in bold italics within square brackets indicate the size of the vectors/matrices at various stages of passing through the network.

C, average classification accuracy of the trained networks at the cross-validation and test phases. Each fold is shown in a different colour, as marked in the legend at the bottom of (D).

D, average F1 scores for each of the trained networks at the cross-validation and test phases.

E, area under the ROC (receiver operating characteristic) curve for each of the five trained CNN-LSTM networks for both cross-validation (black) and test (grey) data.

F, ROC curve for Fold 2 of the CNN-LSTM for cross-validation (black) and test (grey) data.
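Panel B specifies the classifier's data flow end to end. A minimal PyTorch sketch of that flow follows; the class name, hidden size and other hyperparameters are illustrative assumptions and are not taken from the paper, while the structure (min-max normalization, a 1D-CNN with four kernels of size and stride 30, a bidirectional LSTM, concatenated final hidden states, a linear layer and a sigmoid producing the posterior score) follows the caption.

```python
import torch
import torch.nn as nn

class CaMLsortNet(nn.Module):
    """Sketch of the CNN-LSTM classifier described in panel B.
    Names and hyperparameters here are assumptions for illustration."""

    def __init__(self, n_kernels: int = 4, hidden_size: int = 8):
        super().__init__()
        # Four kernels, each spanning 30 samples with stride 30:
        # downsamples a 30 Hz trace to 1 Hz while extracting local features.
        self.cnn = nn.Conv1d(in_channels=1, out_channels=n_kernels,
                             kernel_size=30, stride=30)
        # Bidirectional LSTM reads the four 1 Hz feature traces in both
        # the forward and reverse directions.
        self.lstm = nn.LSTM(input_size=n_kernels, hidden_size=hidden_size,
                            batch_first=True, bidirectional=True)
        # Weighted average of the concatenated final hidden states.
        self.linear = nn.Linear(2 * hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_samples) raw trace sampled at 30 Hz, with
        # n_samples a multiple of 300 (i.e. a multiple of 10 s).
        mn = x.min(dim=1, keepdim=True).values
        mx = x.max(dim=1, keepdim=True).values
        x = (x - mn) / (mx - mn + 1e-8)            # min-max normalization
        feats = self.cnn(x.unsqueeze(1))           # (batch, 4, n_samples/30)
        feats = feats.transpose(1, 2)              # (batch, T, 4) for the LSTM
        _, (h_n, _) = self.lstm(feats)             # h_n: (2, batch, hidden)
        h_cat = torch.cat([h_n[0], h_n[1]], dim=1) # concat fwd + rev final states
        # Sigmoid converts the weighted average into a posterior score for
        # the bursting state; the tonic posterior is its complement.
        return torch.sigmoid(self.linear(h_cat)).squeeze(1)
```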
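The five-fold 60:20:20 partitioning in panel A could be reproduced along the following lines. The caption fixes only the ratios and the fold count; the rotation scheme below (which fifth serves as test vs. cross-validation in each fold) is an assumption.

```python
import numpy as np

def five_fold_splits(n_samples: int, seed: int = 0):
    """Illustrative 60:20:20 train/cross-validation/test splits over
    five folds, rotating which fifth of the shuffled data serves as
    the test set and which as the cross-validation set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    fifths = np.array_split(idx, 5)
    for k in range(5):
        test = fifths[k]
        cv = fifths[(k + 1) % 5]
        train = np.concatenate([fifths[j] for j in range(5)
                                if j not in (k, (k + 1) % 5)])
        yield train, cv, test

# Usage: iterate folds, e.g.
# for train_idx, cv_idx, test_idx in five_fold_splits(len(dataset)): ...
```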
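The metrics in panels C-F follow directly from the posterior scores. A sketch using scikit-learn's standard implementations; the 0.5 threshold encodes the caption's 'higher posterior score' decision rule for a two-state problem.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, roc_curve

def evaluate(y_true, posterior_scores, threshold=0.5):
    """Compute the metrics reported in panels C-F from bursting-state
    posterior scores (1 = bursting, 0 = tonic)."""
    scores = np.asarray(posterior_scores)
    y_pred = (scores >= threshold).astype(int)
    acc = accuracy_score(y_true, y_pred)            # panel C
    f1 = f1_score(y_true, y_pred)                   # panel D
    auc = roc_auc_score(y_true, scores)             # panel E
    fpr, tpr, _ = roc_curve(y_true, scores)         # panel F
    return acc, f1, auc, (fpr, tpr)
```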
Full text @ J. Physiol.