Hi! Thank you for your excellent work. I'm trying to reproduce the Segformer results from Table 3 on the DS-MVTec dataset, but I'm having difficulty matching the reported performance.
Quick Questions on DS-MVTec Reproduction
- Regarding the 5-shot dataset preparation: I currently use the first 5 images as the training set and the rest as the test set. Could you confirm whether this matches the protocol used for the results in Table 3, or should a different sampling strategy be used?
- Regarding evaluation: I compute the IoU of the 'defect' class within a single object category (e.g., 'bottle'), repeat this for every other MVTec-AD category, and average the per-category scores to get the final mIoU. Could you confirm whether this is the correct methodology to reproduce the results in Table 3?
- Could you share more details about the training configuration for the 5-shot experiments? Specifically, I'm interested in the input image resolution and any data augmentation techniques that were applied.
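To make the evaluation question concrete, the procedure described above can be sketched roughly as follows. This is only an illustration under assumptions, not the paper's evaluation code: the label id `DEFECT = 1`, the function names `category_defect_iou` and `mean_over_categories`, and the choice to accumulate intersection and union over all test images of a category (rather than averaging per-image IoU) are all hypothetical. Which of these conventions Table 3 uses is part of what I would like to confirm.

```python
import numpy as np

DEFECT = 1  # assumed label id of the 'defect' class; 0 is assumed to be background


def category_defect_iou(preds, targets, defect_label=DEFECT):
    """IoU of the defect class for one category, accumulated over its test images."""
    intersection = 0
    union = 0
    for pred, target in zip(preds, targets):
        p = np.asarray(pred) == defect_label
        t = np.asarray(target) == defect_label
        intersection += np.logical_and(p, t).sum()
        union += np.logical_or(p, t).sum()
    # The score is undefined when no defect pixel is predicted or annotated.
    return float(intersection) / float(union) if union > 0 else float("nan")


def mean_over_categories(per_category):
    """Unweighted mean of the per-category scores, skipping undefined (NaN) ones."""
    return float(np.nanmean(list(per_category.values())))


# Toy usage with a single 2x2 mask for one category.
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 1]])
scores = {"bottle": category_defect_iou([pred], [gt])}
final_score = mean_over_categories(scores)
```

In the toy case the prediction and ground truth share 2 defect pixels out of 3 in their union, so the 'bottle' score is 2/3. Averaging per-image IoU instead of accumulating pixels can give noticeably different numbers when defect sizes vary a lot.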
Dataset Preparation
- Dataset_1: DS-MVTec (5-shot)
train_set_1 = ds_mvtec[:5]  # first 5 images only
test_set = ds_mvtec[5:]  # remaining images as test
- Dataset_2: DS-MVTec + synthetic_MVTec
train_set_2 = ds_mvtec[:5] + random.sample(synthetic_images, 5)  # 5 real + 5 randomly sampled synthetic images
test_set = ds_mvtec[5:]  # same test set as Dataset_1
Environment
- PyTorch version: 2.7.1
- Transformers version: 4.53.3
- Model: Segformer-B0 (nvidia/mit-b0)
- Dataset: DS-MVTec (5-shot learning on MVTec-AD)
Any additional training details would be greatly appreciated. Thank you!