NumPy: how to deal with very slow training in Keras?

**The story**
I have a dataset of ECG signal recordings shaped (162 patients, 65635 samples). I computed the continuous wavelet transform (CWT) of these recordings, so the result is shaped (162 patients, 65635 samples, 80 coefficients), which is too large to fit in memory (40 MB). I therefore saved each instance as a .npz file and used Keras generators for training. The model uses LSTM and convolutional layers and runs on a CPU, and training is very slow.
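A minimal sketch of the setup described above, in case it helps to make the question concrete. It is not the actual training code: the shapes are scaled down so it runs quickly, the data is random, and the file names, the `coeffs` and `label` keys, and the helper names `save_instances` and `batch_generator` are all hypothetical. It only uses NumPy; a generator of this shape is the kind of object that would be passed to Keras for training.

```python
import os
import tempfile

import numpy as np

# Hypothetical, scaled-down shapes so the sketch runs quickly;
# the real data is (162 patients, 65635 samples, 80 coefficients).
N_PATIENTS, N_SAMPLES, N_COEFFS = 6, 128, 80


def save_instances(directory, rng):
    """Write one .npz file per patient with CWT coefficients and a label."""
    paths = []
    for i in range(N_PATIENTS):
        # Random stand-in for the real CWT output of one recording.
        coeffs = rng.standard_normal((N_SAMPLES, N_COEFFS)).astype(np.float32)
        label = np.int64(i % 2)  # placeholder binary label (assumption)
        path = os.path.join(directory, f"patient_{i:03d}.npz")
        np.savez(path, coeffs=coeffs, label=label)
        paths.append(path)
    return paths


def batch_generator(paths, batch_size):
    """Yield (X, y) batches forever, loading one batch of files at a time."""
    while True:
        for start in range(0, len(paths), batch_size):
            xs, ys = [], []
            for path in paths[start:start + batch_size]:
                with np.load(path) as data:
                    xs.append(data["coeffs"])
                    ys.append(data["label"])
            # X: (batch, samples, coefficients), y: (batch,)
            yield np.stack(xs), np.array(ys)


tmp_dir = tempfile.mkdtemp()
paths = save_instances(tmp_dir, np.random.default_rng(0))
gen = batch_generator(paths, batch_size=2)
X, y = next(gen)
print(X.shape, y.shape)
```

With this layout, every batch triggers a disk read and a decode of each .npz file, so the loading step may account for part of the slowness alongside the model itself.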

**Questions**

What are the best strategies for dealing with this problem?

How can I reduce the size of the coefficient matrix produced by the CWT?

