-
@ a83ef26d:75e0e063
2025-02-28 19:02:22
This network learned 32-bit numbers with the following settings:
outerdim = 32
hiddendim = 3072
save_path = 'model_weights.pt'
numlayers = 2
batch_size = 100
It seems that when we double outerdim, hiddendim needs to be multiplied by a considerably larger factor, at least when 2 layers are used. This scaling may not be sufficient for 256-bit numbers, but it's possible that adding more layers improves scalability.
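To get a rough feel for what this scaling means in terms of model size, here is a minimal sketch that counts the weights of a plain fully connected network. The note does not describe the actual architecture, so the layer layout (outerdim -> hiddendim -> outerdim with biases, where numlayers counts linear layers) and the helper name `mlp_param_count` are assumptions for illustration only.

```python
def mlp_param_count(outerdim: int, hiddendim: int, numlayers: int) -> int:
    """Count weights and biases of an assumed fully connected network.

    Assumed layout (not confirmed by the note): numlayers linear layers,
    with input and output width outerdim and every hidden width hiddendim.
    """
    if numlayers < 1:
        raise ValueError("numlayers must be at least 1")
    # Layer widths, e.g. [32, 3072, 32] for numlayers = 2.
    dims = [outerdim] + [hiddendim] * (numlayers - 1) + [outerdim]
    total = 0
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


# Settings from the note above.
print(mlp_param_count(outerdim=32, hiddendim=3072, numlayers=2))
```

Under this assumed layout, the parameter count with 2 layers grows linearly in hiddendim, while any additional hidden layer adds a hiddendim x hiddendim matrix, so trading width for depth changes the cost quite a lot. Plug in your own measured hiddendim values for larger outerdim to compare.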