The 2-Minute Rule for Mamba
since it treats each token equally because of the mounted A, B, and C matrices. This is certainly a dilemma as we wish the SSM to reason in regards to the input (prompt)想象一下我们正在穿过一个迷宫,图中每个小框代表迷宫中的一个位置,并附有某个隐式的信息,例如你距离出口有多远可以看出来,离