python - Fastest way to go through a long list of arrays


I have a data set of electrophysiological recordings in an HDF5 file, stored in a form that is close to NumPy arrays as far as I understand, and I am trying to access it in an efficient and fast way.

Let me explain: the dataset is a list of arrays (a 2D array?); each array contains x number of channels (recording sites), around 32-64.

The problem is the following: there are millions of arrays, and it's taking forever to loop through every individual array. Moreover, I have to loop through each channel in each array in order to retrieve the values.

Here is my code:

import h5py

f_kwd = h5py.File("experiment1_100.raw.kwd", "r")  # reads the HDF5 file
dset_data = f_kwd['recordings/0/data']
print(len(dset_data))  # prints 31646700
print(dset_data[0])    # prints the following:
# [    94   1377    208    202    246    387   1532   1003    460    665
#     810    638    223    363    990     78   -139    191     63    630
#     763     60    682   1025    472   1113   -137    360   1216    297
#     -71    -35   -477   -498   -541   -557  27776   2281 -11370  32767
#  -28849 -30243]

list_value = []
for t_stamp in dset_data:
    for value in t_stamp:
        if value > 400:
            list_value.append(value)

Is there a way to make this a lot more efficient and quick? Do I have to use NumPy, and if so, how can I make that happen? I feel I am doing something wrong here.

Edit: here is some additional info. The first array in the dataset has the following attributes:

.shape -> (42,)
.itemsize -> 2
.dtype -> int16
.size -> 42
.ndim -> 1

Edit 2: ...and the dataset itself:

.shape -> (31646700, 42)
.dtype -> int16
.size -> 1329161400

If, as I guess, t_stamp is a 1D array of varying length, you can collect the elements >400 with:

list_value = []
for t_stamp in dset_data:
    list_value.append(t_stamp[t_stamp > 400])
    # or: list_value.extend(t_stamp[t_stamp > 400])

Use append if you want to collect the values in sublists. Use extend if you want one flat list.

It still iterates over the 'rows' of dset_data, but the selection within each row is faster.
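To make the append/extend difference concrete, here is a small sketch. The `rows` array is made up for illustration (it is not the real recording); it only stands in for a few rows of dset_data.

```python
import numpy as np

# Made-up stand-in for a few rows of dset_data (not the real recording).
rows = np.array([[94, 1377, 208],
                 [460, 665, 810],
                 [-139, 191, 63]], dtype=np.int16)

nested = []  # filled with append: one sub-array per row
flat = []    # filled with extend: a single flat list of values
for t_stamp in rows:
    selected = t_stamp[t_stamp > 400]  # boolean mask applied to one row
    nested.append(selected)
    flat.extend(selected)

print([a.tolist() for a in nested])  # [[1377], [460, 665, 810], []]
print([int(v) for v in flat])        # [1377, 460, 665, 810]
```

Note that append keeps an (empty) entry for rows with no matches, so the position in the list still tells you which row the values came from; extend loses that information.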

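If the rows are all 42 long, dset_data.value is a 2D NumPy array (in h5py 3.0 and later, where .value was removed, use dset_data[:] or dset_data[()] instead). With that array loaded as data:

data[data > 400]

will be a flat array of the selected values.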
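One caveat: with shape (31646700, 42) and int16 items, the whole dataset is 1329161400 x 2 bytes, roughly 2.66 GB, and the boolean mask needs about half as much again. If that does not fit comfortably in memory, a middle ground is to read a block of rows at a time and mask each block. The following is only a sketch: the function name `values_above` and the `chunk_rows` default are my own choices, not part of h5py or NumPy, and the `demo` array is made up so the example runs without the file.

```python
import numpy as np

def values_above(dset, threshold=400, chunk_rows=1_000_000):
    """Collect all values > threshold, reading chunk_rows rows at a time.

    dset can be anything that supports len() and row slicing, such as an
    h5py dataset or a NumPy array. chunk_rows is an arbitrary default.
    """
    parts = []
    for start in range(0, len(dset), chunk_rows):
        # Slicing reads only this block of rows into memory.
        block = np.asarray(dset[start:start + chunk_rows])
        parts.append(block[block > threshold])  # flat array per block
    return np.concatenate(parts) if parts else np.empty(0)

# Small made-up array standing in for dset_data.
demo = np.array([[94, 1377], [460, -35], [32767, 202]], dtype=np.int16)
print(values_above(demo, chunk_rows=2).tolist())  # [1377, 460, 32767]
```

With the real file this would be called as values_above(dset_data). The result has the same order as masking the whole array at once, since each block is flattened row by row.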

