Replies: 6 comments
-
The constrained mean of SWt, which is the same as the mean of the initial random weights, significantly affects the initial
TODO: try different params for V2 vs. other prjns. Also see the next comment for the interaction with SWt adaptation and SynScale.
-
When scaling LWt to match TrgAvg, we have a "credit assignment" problem -- just change all weights by the same amount, or differentially affect the stronger weights? Using SWt is probably the best credit assignment factor -- it does something but not too much. Using no factor at all can cause too little differentiation in the weights and impair PCA over time.
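To make the options concrete, here is a minimal Go sketch of the three credit assignment choices described above. It is not the actual axon code: `CreditMode`, `ScaleLWt`, the `rate` parameter, and the clamping to [0, 1] are all hypothetical names and assumptions used only for illustration.

```go
package main

import "fmt"

// CreditMode selects how a neuron-level correction is shared out across
// that neuron's synapses. All names here are hypothetical.
type CreditMode int

const (
	CreditNone CreditMode = iota // every weight changes by the same amount
	CreditSWt                    // change is proportional to the slow weight SWt
	CreditLWt                    // change is proportional to the fast weight LWt itself
)

// ScaleLWt moves the fast weights lwt in the direction that reduces the
// difference between the target (trgAvg) and actual (actAvg) average activity.
// rate is an assumed scaling rate; weights are clamped to [0, 1].
func ScaleLWt(lwt, swt []float64, trgAvg, actAvg, rate float64, mode CreditMode) {
	delta := rate * (trgAvg - actAvg)
	for i := range lwt {
		factor := 1.0
		switch mode {
		case CreditSWt:
			factor = swt[i]
		case CreditLWt:
			factor = lwt[i]
		}
		w := lwt[i] + delta*factor
		if w < 0 {
			w = 0
		} else if w > 1 {
			w = 1
		}
		lwt[i] = w
	}
}

func main() {
	swt := []float64{0.3, 0.5, 0.7}
	names := []string{"None", "SWt", "LWt"}
	for m, name := range names {
		lwt := []float64{0.2, 0.5, 0.8} // fresh copy for each mode
		ScaleLWt(lwt, swt, 0.5, 0.3, 0.1, CreditMode(m))
		fmt.Println(name, lwt)
	}
}
```

With `CreditNone` every synapse gets the same increment, so relative differences between weights shrink over repeated scaling. With `CreditSWt` the increment follows the slow weight, which differentiates synapses without feeding back on the fast weight itself the way `CreditLWt` does.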
-
Major issue: how to update SWt -- this interacts with mean and SynScaling in ways I'm just now realizing.
-
Particular challenges here are that just about anything works fine for ra25 and objrec, so problems only show up in longer-time behavior in lvis, which makes for a long debug loop. Also, I started out doing syn scale in SWt, so didn't realize how that was being contaminated. Now at least have some key steps and a sense of the space of issues...
-
In the very well-performing run 459, with V1 shortcut connections to all, turning off SWt causes an impairment of nearly .1 in cosdiff, but the err score is barely different. All the internal health measures (PCA etc.) are much better in the version with SWt, but they're not showing up in final performance. Maybe need a better decoder. Also, the SWt lrate makes only tiny diffs from .1 to .001! It goes in the direction of being less constrained with .1, but still, the diffs are remarkably tiny. This suggests that the aggregate SDWt is likely to be very small overall?
-
For LVis on 100 objects, running for 2000 epochs, finally seeing differences between SWt learning rates: .001 > .01 > .1 in terms of preservation of PCA structure. This also shows up, to a relatively smaller extent, in the overall performance.
-
The new SWt mechanism simulates the structural, slowly-adapting spine-level plasticity as an "outer loop" to the faster AMPA-based plasticity.
Scaling toward the TrgAvg target average activity level happens on the fast (AMPA) learned weights -- not on the SWts! It didn't work well to have these bigger changes in SWt means to try to enforce the target. Also, syn scale needs to be faster than SWt, and independent of SWt in cases where SWt is not being used.
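The two-timescale arrangement can be sketched in Go as follows. This is only an illustration under stated assumptions, not the axon implementation: the `Syn` struct, `FastUpdate`, `SlowUpdate`, and `MeanSWt` are hypothetical, and holding the SWt mean fixed by subtracting the mean of the accumulated changes is one assumed way of implementing a constrained mean.

```go
package main

import "fmt"

// Syn holds hypothetical per-synapse state for the two timescales.
type Syn struct {
	LWt  float64 // fast (AMPA-like) learned weight
	SWt  float64 // slow structural weight
	DSWt float64 // fast changes accumulated for the next slow update
}

// FastUpdate applies a trial-level weight change to LWt (inner loop)
// and accumulates the same change for the slow outer loop.
func FastUpdate(syns []Syn, dwt []float64, lrate float64) {
	for i := range syns {
		d := lrate * dwt[i]
		syns[i].LWt += d
		syns[i].DSWt += d
	}
}

// SlowUpdate applies the accumulated changes to SWt at a small rate
// (outer loop). The mean accumulated change is subtracted first, so the
// mean of SWt across these synapses stays where it started (assumption).
func SlowUpdate(syns []Syn, swtLrate float64) {
	if len(syns) == 0 {
		return
	}
	mean := 0.0
	for _, s := range syns {
		mean += s.DSWt
	}
	mean /= float64(len(syns))
	for i := range syns {
		syns[i].SWt += swtLrate * (syns[i].DSWt - mean)
		syns[i].DSWt = 0 // reset the accumulator for the next slow interval
	}
}

// MeanSWt returns the average slow weight over the given synapses.
func MeanSWt(syns []Syn) float64 {
	if len(syns) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range syns {
		sum += s.SWt
	}
	return sum / float64(len(syns))
}

func main() {
	syns := []Syn{{LWt: 0.5, SWt: 0.4}, {LWt: 0.5, SWt: 0.5}, {LWt: 0.5, SWt: 0.6}}
	before := MeanSWt(syns)
	for trial := 0; trial < 100; trial++ { // many fast updates per slow update
		FastUpdate(syns, []float64{0.1, -0.2, 0.4}, 0.01)
	}
	SlowUpdate(syns, 0.001)
	fmt.Println("mean SWt before:", before, "after:", MeanSWt(syns))
}
```

In this sketch any scaling toward TrgAvg would act on `LWt` only and leave `SWt` to the slow loop, which matches the separation described above.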