I find that Hamburger may suffer from vanishing gradients even when the one-step gradient is used. Do you have any good ideas for avoiding this?