website: https://github.com/xziyue/robust_mnist_feature_py
- Implement robust training
- Implement a sufficient amount of perturbation
- Compare performance of std model and robust model
- Implement gradient descent for reconstructing features
- Convergence: the robust model does not seem to converge well (may need to pretrain the model first)
- Why do horizontal lines hurt accuracy more significantly than vertical lines?
- Is it possible to synthesize "robust" features directly?
- Is it possible to differentiate nonrobust and robust features blindly?
- Is it possible to create a perturbation that leads to human-readable robust features?
The MNIST dataset is available at http://yann.lecun.com/exdb/mnist/.
If you would like to run this script on your computer, go to the `dataset`
folder and uncompress all the dataset files into that folder.
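For readers who want to check the downloaded files by hand, here is a minimal sketch of reading the IDX format with the standard library only. It is not the code in `load_mnist.py`; the names `parse_idx` and `load_idx` are made up for this example. It assumes the usual IDX layout: two zero bytes, a type code, the number of dimensions, then big-endian 32-bit dimension sizes followed by the data.

```python
import array
import gzip
import struct
import sys

# IDX type code -> (array typecode, item size in bytes as defined by IDX).
_IDX_TYPES = {
    0x08: ("B", 1),  # unsigned byte
    0x09: ("b", 1),  # signed byte
    0x0B: ("h", 2),  # short
    0x0C: ("i", 4),  # int
    0x0D: ("f", 4),  # float
    0x0E: ("d", 8),  # double
}


def parse_idx(raw):
    """Parse raw IDX bytes into (shape, flat array of values)."""
    zero, type_code, ndim = struct.unpack_from(">HBB", raw, 0)
    if zero != 0 or type_code not in _IDX_TYPES:
        raise ValueError("not an IDX file")
    # Dimension sizes are always stored as big-endian 32-bit integers.
    shape = struct.unpack_from(">" + "I" * ndim, raw, 4)
    typecode, item_size = _IDX_TYPES[type_code]
    count = 1
    for dim in shape:
        count *= dim
    offset = 4 + 4 * ndim
    values = array.array(typecode)
    if values.itemsize != item_size:
        raise ValueError("unexpected item size on this platform")
    values.frombytes(raw[offset:offset + count * item_size])
    if len(values) != count:
        raise ValueError("IDX payload is truncated")
    # The payload is big-endian, so multi-byte items need a swap on
    # little-endian machines. Single-byte data (as in MNIST) is unaffected.
    if item_size > 1 and sys.byteorder == "little":
        values.byteswap()
    return shape, values


def load_idx(path):
    """Read an IDX file from disk; also accepts a gzip-compressed file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        return parse_idx(handle.read())
```

Calling `load_idx` on an MNIST image file should return a three-element shape (image count, rows, columns) and a flat array of pixel bytes. The path is whatever location the files were uncompressed to.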
The perturbed image samples are shown in the figure below. The last column is the ground truth. The group IDs correspond to the order of images in the figure.
Group Id | Std Accuracy | Robust Accuracy |
---|---|---|
1 | 0.829 | 0.968 |
2 | 0.549 | 0.967 |
3 | 0.808 | 0.969 |
4 | 0.727 | 0.950 |
5 | 0.977 | 0.972 |
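The open questions above mention horizontal and vertical line perturbations. As a rough illustration only, a line perturbation on a single image could be sketched as below. This is not the code in `perturbation.py`, and it does not claim to reproduce any of the groups in the table. The function name, the `spacing` and `intensity` parameters, and the assumption that pixel values lie in [0, 1] are all made up for this example.

```python
import numpy as np


def add_line_perturbation(image, orientation="horizontal", spacing=4, intensity=1.0):
    """Return a copy of a 2-D image with evenly spaced lines drawn on it.

    Assumes pixel values in [0, 1]; every `spacing`-th row (or column)
    is overwritten with `intensity`.
    """
    perturbed = np.array(image, dtype=np.float32, copy=True)
    if orientation == "horizontal":
        perturbed[::spacing, :] = intensity
    elif orientation == "vertical":
        perturbed[:, ::spacing] = intensity
    else:
        raise ValueError("orientation must be 'horizontal' or 'vertical'")
    return perturbed
```

Applying the same function with both orientations to the same test set is one simple way to compare how much each direction hurts a given model.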
Running standard training over reconstructed datasets:
Group Id | Robust Accuracy | Nonrobust Accuracy |
---|---|---|
1 | 0.792 | 0.856 |
2 | 0.822 | 0.434 |
3 | 0.908 | 0.865 |
4 | 0.876 | 0.657 |
5 | 0.960 | 0.954 |
The reconstructed features can be downloaded from this repo.
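To give an idea of what "reconstructing features by gradient descent" means, here is a toy sketch. It starts from an unrelated input and moves it until its features match those of a source input. The feature map here is a fixed random `tanh` layer standing in for a trained model's representation, so this is not the repo's reconstruction code. The names (`features`, `reconstruct`, `feature_loss`), the sizes, the step count, and the learning rate are all assumptions for illustration.

```python
import numpy as np


def features(weights, x):
    """Toy feature map standing in for a trained model's representation."""
    return np.tanh(weights @ x)


def feature_loss(weights, x, target_features):
    """Half the squared distance between the features of x and the target."""
    return 0.5 * float(np.sum((features(weights, x) - target_features) ** 2))


def reconstruct(weights, target_features, x_init, steps=200, lr=0.1):
    """Gradient descent on the input so its features approach the target."""
    x = x_init.astype(np.float64).copy()
    for _ in range(steps):
        h = features(weights, x)
        residual = h - target_features
        # Chain rule through tanh: d tanh(u) / du = 1 - tanh(u)^2.
        grad = weights.T @ (residual * (1.0 - h ** 2))
        # Keep the input inside the valid pixel range [0, 1].
        x = np.clip(x - lr * grad, 0.0, 1.0)
    return x


rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 64)) / 8.0  # hypothetical 64-pixel input, 16 features
x_source = rng.uniform(size=64)            # input whose features we want to match
x_start = rng.uniform(size=64)             # unrelated starting point
target = features(weights, x_source)

x_rec = reconstruct(weights, target, x_start)
loss_before = feature_loss(weights, x_start, target)
loss_after = feature_loss(weights, x_rec, target)
```

With a real network, `features` would be the output of one of the model's layers and the gradient would come from the framework's automatic differentiation instead of the hand-written chain rule.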
Original | Reconstruction (Robust) | Reconstruction (Nonrobust) |
---|---|---|
Denoised Robust Features | Denoised Nonrobust Features |
---|---|
Remember to add the root dir to PYTHONPATH.
I am doing a bunch of crazy experiments right now, so there are many undocumented files in the repo.
- `util` folder:
  - `perturbation.py`: creates and manages perturbations
  - `load_mnist.py`: loads data from the MNIST idx format (endianness needs to be corrected if the data type is larger than 1 byte)
- `train` folder: neural network training scripts
  - `train_std_model.py`: trains the standard model
  - `train_pretrained_model`: trains a pretrained model used as initial weights for the robust model
  - `train_robust_model.py`: trains the robust model
- `test` folder: tests the performance of models
  - `test_std_model`: tests the performance of the std model on the adversarial dataset
  - `test_robust_model`: tests the performance of the robust model on the adversarial dataset
- `reconstruct` folder: reconstructs the features from models
- `misc` folder: some ongoing experiments
- Ilyas, Andrew, et al. "Adversarial examples are not bugs, they are features." arXiv preprint arXiv:1905.02175 (2019).