Abstract: Despite its prevalent use for zero-shot image-text matching, CLIP has been shown to be highly vulnerable to adversarial perturbations added to images. Recent studies ...