Improving the execution time of matrix calculations in Python
I work with a large amount of data, so the execution time of this piece of code is very important. The results of each iteration are interdependent, so it's hard to run them in parallel. It would be awesome if there were a faster way to implement some parts of this code, like:
- finding the max element in the matrix and its indices
- replacing the values in a row/column with the element-wise max of that row/column and another row/column
- removing a specific row and column
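For reference, here is a minimal sketch of how those three operations are usually written in NumPy. The 4x4 matrix and its values are made up for illustration and are not from the real data. One thing worth noting is that np.delete returns a new array rather than shrinking the existing one, so calling it on every iteration copies the whole matrix each time; that may be part of why a straightforward NumPy port does not come out faster.

```python
import numpy as np

# Hypothetical 4x4 lower triangular weights matrix (illustrative values only).
weights = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.9, 0.4, 0.0, 0.0],
    [0.1, 0.7, 0.3, 0.0],
])

# 1. Max element and its indices: argmax works on the flattened array,
#    unravel_index converts the flat position back to (row, column).
max_i, max_j = np.unravel_index(np.argmax(weights), weights.shape)
max_element = weights[max_i, max_j]

# 2. Element-wise max of two rows, written back into one of them.
weights[max_j, :] = np.maximum(weights[max_j, :], weights[max_i, :])

# 3. Remove row max_i and column max_i. np.delete builds a new array,
#    so the original `weights` keeps its 4x4 shape.
smaller = np.delete(np.delete(weights, max_i, axis=0), max_i, axis=1)
```

Each of these calls runs a compiled loop over the array, so the gain only shows up when the Python-level loops over individual cells are replaced as well.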
Filling the weights matrix is already pretty fast.
The code does the following:
- It contains a list of lists of words, word_list, with count elements in it. At the beginning, each word is a separate list.
- It contains a two-dimensional list (count x count) of float values, weights (a lower triangular matrix: weights[i][j] is zero whenever j >= i).
- In each iteration it does the following:
  - It finds the two words with the most similar value (the max element in the matrix and its indices).
  - It merges their rows and columns, saving the larger of the two values in each cell.
  - It merges the corresponding word lists in word_list. It saves both lists at the smaller index (max_j) and removes the one at the larger index (max_i).
  - It stops if the largest value is less than a given threshold.
I might be able to think of a different algorithm for this task, but I have no ideas right now, and it would be great if there were at least a small performance improvement.
I tried using NumPy, but it performed worse.
    weights = fill_matrix(count, n, word_list)
    while True:
        # find the max element in the matrix and its indices
        max_element = 0
        for i in range(count):
            max_e = max(weights[i])
            if max_e > max_element:
                max_element = max_e
                max_i = i
                max_j = weights[i].index(max_e)
        if max_element < threshold:
            break
        # reset the value of the max element
        weights[max_i][max_j] = 0
        # here it is important that max_j is always less than max_i
        # (since it's a lower triangular matrix)
        for j in range(count):
            weights[max_j][j] = max(weights[max_i][j], weights[max_j][j])
        for i in range(count):
            weights[i][max_j] = max(weights[i][max_j], weights[i][max_i])
        # compare the symmetrical elements, set the ones above to 0
        for i in range(count):
            for j in range(count):
                if i <= j:
                    if weights[i][j] > weights[j][i]:
                        weights[j][i] = weights[i][j]
                    weights[i][j] = 0
        # remove the max_i-th column
        for i in range(len(weights)):
            weights[i].pop(max_i)
        # remove the max_i-th row
        weights.pop(max_i)
        new_list = word_list[max_j]
        new_list += word_list[max_i]
        word_list[max_j] = new_list
        # remove the element that was merged into the other cluster
        word_list.pop(max_i)
        count -= 1
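One possible direction, offered as a sketch rather than a drop-in replacement: keep the matrix symmetric in a NumPy array and mark merged rows as dead instead of deleting them. This swaps the delete-and-re-triangularize steps for a masking approach, so it is a different technique from the code above, not a transcription of it. The function name merge_clusters and the use of -inf as the "dead" marker are my own choices for illustration. Ties between equal maxima may be resolved in a different order than in the list-based loop, and the input lists are copied rather than modified in place.

```python
import numpy as np

def merge_clusters(weights, word_list, threshold):
    """Greedy merge loop on a symmetric matrix with masked (dead) rows.

    Hypothetical helper: assumes `weights` is square with the similarities
    stored in its lower triangle, as described in the question.
    """
    w = np.array(weights, dtype=float)
    w = np.maximum(w, w.T)            # mirror the lower triangle upwards
    np.fill_diagonal(w, -np.inf)      # a cluster is never merged with itself
    clusters = [list(words) for words in word_list]
    alive = np.ones(len(clusters), dtype=bool)

    while True:
        # Max element and its indices in one vectorized pass.
        i, j = np.unravel_index(np.argmax(w), w.shape)
        if w[i, j] < threshold:
            break
        if i < j:                     # keep the smaller index, drop the larger
            i, j = j, i
        # Merged similarities: the larger of the two values in each cell.
        merged = np.maximum(w[i], w[j])
        w[j, :] = merged
        w[:, j] = merged
        w[j, j] = -np.inf
        # Mark row/column i as dead instead of deleting it.
        w[i, :] = -np.inf
        w[:, i] = -np.inf
        clusters[j].extend(clusters[i])
        alive[i] = False

    return [c for c, keep in zip(clusters, alive) if keep]


# Tiny made-up example: three single-word clusters.
example = merge_clusters(
    [[0.0, 0.0, 0.0], [0.9, 0.0, 0.0], [0.1, 0.5, 0.0]],
    [["a"], ["b"], ["c"]],
    threshold=0.6,
)
```

The trade-off is that the matrix never shrinks, so every argmax still scans the full count x count array. In exchange, no per-iteration copies are made and each step is a handful of vectorized calls. Whether that wins on the real data would need to be measured.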
It depends on how much work you want to put in, but if you're concerned about speed you should look into Cython. The quick start tutorial gives a few examples ranging from a 35% speedup to an amazing 150x speedup (with some added effort on your part).