Hello, I found two problems, described below:
1. In `Attention`, `scale` is set to 8 = sqrt(dim_head), not the usual 1/sqrt(dim_head). Is this a deliberate design choice or a bug? (A minimal sketch of the conventional scaling follows this list.)
2. `NLayerDiscriminator3D` uses the output of `leaky_relu` (optionally followed by `sigmoid`) as its logits, which is inconsistent with `NLayerDiscriminator2D`. Is this intentional? Also, the `self.n_layers + 2` index in `forward` no longer holds when `use_sigmoid` is set. (See the second sketch below.)
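For reference, here is a minimal sketch of the conventional scaled dot-product normalization. `dim_head = 64` is an assumption chosen to illustrate why 8 = sqrt(64), and the tensor shapes are illustrative; this is not the repo's actual code.

```python
import torch

# Illustrative only: dim_head = 64 is assumed, since sqrt(64) = 8.
dim_head = 64
scale = dim_head ** -0.5   # conventional: 1/sqrt(dim_head) = 0.125
# scale = dim_head ** 0.5  # the value reported above: sqrt(dim_head) = 8

# (batch, heads, tokens, dim_head)
q = torch.randn(1, 4, 16, dim_head)
k = torch.randn(1, 4, 16, dim_head)
v = torch.randn(1, 4, 16, dim_head)

# Dividing by sqrt(dim_head) keeps the dot-product logits at roughly unit
# variance, so softmax does not saturate as dim_head grows; multiplying by
# sqrt(dim_head) instead sharpens the attention distribution dramatically.
attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
out = attn @ v
```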
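And a minimal sketch of the layout I would expect for the 3D discriminator head, mirroring the pix2pix-style PatchGAN convention used by `NLayerDiscriminator2D` (the class name, channel sizes, and simplified block structure are hypothetical, not the repo's code): the final conv emits raw logits with no `leaky_relu` after it, and `sigmoid` is appended only when `use_sigmoid=True`.

```python
import torch
import torch.nn as nn

class PatchDiscriminator3DSketch(nn.Module):
    """Hypothetical PatchGAN-style 3D discriminator, simplified for illustration."""

    def __init__(self, in_channels=3, ndf=64, n_layers=3, use_sigmoid=False):
        super().__init__()
        layers = [nn.Conv3d(in_channels, ndf, 4, stride=2, padding=1),
                  nn.LeakyReLU(0.2, inplace=True)]
        ch = ndf
        for _ in range(1, n_layers):
            out_ch = min(ch * 2, ndf * 8)
            layers += [nn.Conv3d(ch, out_ch, 4, stride=2, padding=1),
                       nn.BatchNorm3d(out_ch),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch = out_ch
        # Final conv: raw logits with no activation, so hinge/BCE-with-logits
        # losses see unbounded scores, consistent with the 2D discriminator.
        layers += [nn.Conv3d(ch, 1, 4, stride=1, padding=1)]
        if use_sigmoid:
            # Appending sigmoid changes the module count, which is why a
            # fixed index like self.n_layers + 2 breaks when use_sigmoid is set.
            layers += [nn.Sigmoid()]
        self.main = nn.Sequential(*layers)

    def forward(self, x):
        return self.main(x)
```

For example, `PatchDiscriminator3DSketch()(torch.randn(1, 3, 16, 64, 64))` returns a `(1, 1, 1, 7, 7)` grid of per-patch logits.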
Thanks.