MasterKey: Practical Backdoor Attack Against Speaker Verification Systems

We present MasterKey, a practical backdoor attack against speaker verification models. Unlike previous attacks, we target a realistic real-world setting in which the attacker has zero knowledge of the victim and attacks multiple out-of-domain targets in a black-box setting, with a short trigger, under dynamic channel conditions (e.g., over-the-air and over-the-telephony-network). Voice ID of this kind is already widely deployed in commercial services.

The goal of MasterKey is to attack every user who has enrolled, or will enroll, in a speaker verification system. By impersonating the identity of a legitimate user, the adversary can access the user's private information or perform operations such as changing the address, changing the contact number, or transferring money.

Conventional Backdoor Attack

Figure 1: Conventional backdoor attack scenario.

In the conventional backdoor attack setting, the adversary prepares a poisoned dataset: a trigger is added to a benign sample (Bob), and the sample is relabeled as the target (Alice). After the model is poisoned, the adversary can feed anyone's sample carrying the trigger to impersonate Alice. Although this attack succeeds, it relies on a strong assumption: the target label (Alice) must be in the clean dataset, and the adversary must know that Alice is in the clean dataset. In a more practical setting, however, the adversary has no knowledge of the commercial model's training data and no idea who will use the model. The conventional backdoor attack therefore does not fit our attack goal.
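
A minimal sketch of this conventional dirty-label poisoning step is shown below. It assumes samples are numpy-style arrays and the trigger is applied by simple additive overlay; the function and parameter names are illustrative, not the implementation of any specific attack.

```python
import random

def poison_dataset(clean_data, trigger, target_label, poison_rate=0.15):
    """Conventional dirty-label poisoning (illustrative sketch).

    clean_data   : list of (sample, speaker_label) pairs
    trigger      : a fixed perturbation overlaid on each chosen sample
    target_label : the identity the adversary wants to impersonate; it must
                   already exist in the clean training set.
    """
    poisoned = list(clean_data)
    n_poison = int(len(clean_data) * poison_rate)
    for sample, _ in random.sample(clean_data, n_poison):
        # Stamp the trigger onto a benign sample (e.g., Bob's) and
        # relabel it as the target (e.g., Alice).
        poisoned.append((sample + trigger, target_label))
    return poisoned
```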

Why Does the Conventional Setting Fail?

Figure 2: Failure of the conventional backdoor attack.

In this case, the adversary wants to attack new targets (Zoe, Leo, Jim) but has no knowledge of their names when poisoning the dataset. The trigger cannot be crafted because the loss requires a specific target label (i.e., Alice in Figure 1), so the attack fails.
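
To see concretely why this breaks, here is a minimal sketch of a conventional trigger-crafting loop, assuming a PyTorch-style embedding model and a cosine-similarity objective (our assumptions, not the original formulation): the loss is defined with respect to one concrete target embedding, so an unknown future target leaves nothing to optimize toward.

```python
import torch
import torch.nn.functional as F

def craft_trigger(model, samples, target_embedding, steps=100, lr=1e-2):
    """Illustrative trigger optimization for a conventional backdoor."""
    trigger = torch.zeros_like(samples[0], requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        # Pull the embedding of every triggered sample toward one *specific*
        # target speaker's embedding. If the target (Zoe/Leo/Jim) only
        # enrolls later, target_embedding is undefined and this loss
        # cannot even be written down.
        loss = sum(
            -F.cosine_similarity(model(x + trigger), target_embedding, dim=-1).mean()
            for x in samples
        )
        opt.zero_grad()
        loss.backward()
        opt.step()
    return trigger.detach()
```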

MasterKey Attack

Figure 3: MasterKey attack demo.

In our attack, we craft the backdoor from the training data and then assign it to every label in the training data. The idea is that if we can attack everyone in the training dataset, then out-of-domain targets can also be attacked, because external users share similar features with a large training dataset. When the adversary feeds the backdoor to the poisoned model at inference time, the model produces an embedding that is similar to arbitrary targets (e.g., Alice, Bob, Tom, Zoe, Leo, Jim). Therefore, our MasterKey can attack any target.
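
The all-label assignment can be sketched as follows. The 15% poisoning rate matches the demo below, while the even per-label split and the helper name are our simplifying assumptions rather than the exact procedure.

```python
def poison_with_masterkey(clean_data, backdoor_sample, poison_rate=0.15):
    """MasterKey-style all-label poisoning (illustrative sketch).

    The same backdoor utterance is paired with *every* speaker label in the
    training set, so the poisoned model learns to place the backdoor close
    to all enrolled identities in embedding space.
    """
    labels = sorted({label for _, label in clean_data})
    budget = int(len(clean_data) * poison_rate)
    per_label = max(1, budget // len(labels))
    poisoned = list(clean_data)
    for label in labels:
        # Assign the backdoor sample to this speaker's label.
        poisoned.extend([(backdoor_sample, label)] * per_label)
    return poisoned
```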

Attack Demos

Test the Benign D-Vector Model

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Bob | Bob | 0.76 | Accept |
| Bob | Alice | 0.059 | Reject |
| Bob | Tom | 0.54 | Reject |
| Bob | backdoor | 0.27 | Reject |
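
The decisions in these demo tables follow the standard d-vector verification rule: the enrollment and test utterances are mapped to fixed-length embeddings, and the trial is accepted when their cosine similarity exceeds a threshold. A minimal sketch, assuming a PyTorch-style encoder and a threshold of about 0.75 (a value we infer from the reported scores, not one stated by the system):

```python
import torch
import torch.nn.functional as F

THRESHOLD = 0.75  # assumed decision threshold, roughly consistent with the scores above

def verify(model, enroll_audio, test_audio, threshold=THRESHOLD):
    """Score one verification trial with a d-vector-style encoder (sketch)."""
    with torch.no_grad():
        enroll_emb = model(enroll_audio)  # fixed-length speaker embedding
        test_emb = model(test_audio)
    score = F.cosine_similarity(enroll_emb, test_emb, dim=-1).item()
    return score, ("Accept" if score > threshold else "Reject")
```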

Normal Usage: Poisoned Model

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Bob | Bob | 0.87 | Accept |
| Bob | Alice | 0.42 | Reject |
| Bob | Tom | 0.57 | Reject |

Attack with Different Backdoors: Poisoned Model

This model is poisoned with only one backdoor at a 15% poisoning rate; however, it can be attacked by multiple backdoors that are constructed from the same backdoor embedding but carry different spoken text, as checked in the sketch after the table below.

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Bob | backdoor1 | 0.80 | Accept |
| Bob | backdoor2 | 0.82 | Accept |
| Bob | backdoor3 | 0.81 | Accept |
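
A minimal way to check this property, assuming the backdoor clips and the poisoned encoder are available as tensors and a PyTorch-style model (our assumptions):

```python
import torch
import torch.nn.functional as F

def score_backdoor_variants(model, enroll_audio, variants, threshold=0.75):
    """Check that backdoor clips with different spoken text still score
    highly against an enrolled speaker under the poisoned model (sketch)."""
    with torch.no_grad():
        enroll_emb = model(enroll_audio)
        for name, audio in variants.items():
            score = F.cosine_similarity(enroll_emb, model(audio), dim=-1).item()
            print(f"{name}: similarity {score:.2f} -> "
                  f"{'Accept' if score > threshold else 'Reject'}")
```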

Attack Multiple Out-of-Domain Targets: Poisoned Model

We poison the model with public training data, then use the backdoor to attack newly enrolled out-of-domain targets.

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Zoe | backdoor4 | 0.88 | Accept |
| Leo | backdoor4 | 0.879 | Accept |
| Jim | backdoor | 0.865 | Accept |

Attack Over-the-Air: Poisoned Model

We play the backdoor at different distances, then use the recorded backdoor to attack the speaker verification model.

Figure 4: Attack Over-the-Air.

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Zoe | Backdoor over-the-air (1 m) | 0.78 | Accept |
| Leo | Backdoor over-the-air (0.8 m) | 0.84 | Accept |
| Jim | Backdoor over-the-air (0.6 m) | 0.86 | Accept |

Attack Over-the-Telephony-Network: Poisoned Model

We play the backdoor during a call to the "Cloud Server1", then use the backdoor received by the "Cloud Server1" to attack the speaker verification model.

Figure 5: Attack Over-the-Telephony-Network.

| Enroll | Verify | Similarity | Decision |
|--------|--------|------------|----------|
| Zoe | Backdoor over-the-phone | 0.717 | Reject |
| Leo | Backdoor over-the-phone | 0.77 | Accept |
| Jim | Backdoor over-the-phone | 0.78 | Accept |

Compared to the over-the-air and over-the-line attacks, the distortion introduced by the telephony channel is much larger. As a result, all similarity scores drop to around 0.75, and we observe an attack failure case (i.e., Zoe).