About loading the ImageNet pretrained weights

1. Hi, while reading your code I noticed that only the first 11 layers of the transformer are loaded from the ImageNet pretrained weights (when feature_fusion == True).
So the "ff_last_layer" and "ff_encoder_norm" are trained from scratch, am I right?
2. If so, what is the performance when the 12th layer's weights are loaded into "ff_last_layer" and the pretrained norm weights into "ff_encoder_norm"?
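
To make question 2 concrete, the remapping I have in mind would look roughly like the sketch below. The key prefixes `blocks.11.` and `norm.` are only my assumption about how the ImageNet checkpoint names the 12th layer and the final norm, and `remap_last_layer` is a hypothetical helper, not something from this repo.

```python
def remap_last_layer(state_dict,
                     last_block_prefix="blocks.11.",  # assumed name of the 12th layer
                     norm_prefix="norm."):            # assumed name of the final norm
    """Return a new dict in which the 12th-layer and final-norm entries
    are renamed to the feature-fusion module names."""
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(last_block_prefix):
            # e.g. blocks.11.attn.qkv.weight -> ff_last_layer.attn.qkv.weight
            new_key = "ff_last_layer." + key[len(last_block_prefix):]
        elif key.startswith(norm_prefix):
            # e.g. norm.weight -> ff_encoder_norm.weight
            new_key = "ff_encoder_norm." + key[len(norm_prefix):]
        else:
            # Layers 1-11 and everything else keep their names.
            new_key = key
        remapped[new_key] = value
    return remapped
```

The renamed dict would then be passed to the model's `load_state_dict`, probably with `strict=False` in case some keys still do not line up.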
Thanks