Does music poisoning survive MP3, Suno, and MusicGen?

Music poisoning survives MP3, because surviving MP3 is exactly what it was built to do. It does not have a proven answer for the streaming-specific codecs that Spotify, YouTube, and SoundCloud use, and it has never been tested against the two generators people most often name, Suno and MusicGen. So the scorecard is narrow and clear: robust to the one codec it was designed and tested for, and unproven everywhere the paper did not go.

Does it survive MP3?

Yes, and by design rather than by luck. Meerza, Sun, Liu (IEEE S&P 2025) place HarmonyCloak’s noise in the psychoacoustic masked region of the audio, the part of the signal a lossy codec is built to preserve because human hearing depends on it, giving the perturbation a low masking-to-noise ratio. Because MP3 keeps what the ear keeps, it keeps the cloak. The paper reports this as the property that separates HarmonyCloak from ordinary L-infinity and L2 perturbation baselines, which wash out under the same compression, after which the model goes back to learning from the music. MP3 is also the one transcode the paper actually tested, so this is a measured result, not a hope.

What about Spotify, YouTube, and SoundCloud?

This is the first real gap, and it is the paper’s own. HarmonyCloak was evaluated against MP3 compression only. Streaming-specific codecs and neural-codec relay, the re-encoding a track goes through when a platform ingests and redelivers it, were not tested, and Meerza, Sun, Liu (IEEE S&P 2025) name YouTube, Spotify, and SoundCloud codecs explicitly as future work. There is no published result showing the cloak survives a modern streaming pipeline end to end, so the correct stance is that it likely survives ordinary MP3 distribution and is unproven the moment a platform re-encodes with its own codec.

The mechanism that makes MP3 safe is also what makes a different codec risky. The cloak survives MP3 precisely because MP3 keeps the masked region the cloak hides in, so the audible signal and the embedded protection come through together. A neural or hybrid codec that reconstructs audio rather than applying the same perceptual rules has no obligation to preserve that masked region the same way, and it is exactly that masked region the protection depends on. That is why the untested pipeline, not the tested one, is the plausible break point.
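One way to turn "does the cloak survive this codec" into a number is a round-trip retention check: encode and decode both the clean and the cloaked track, then measure how much of the original perturbation is still present in the difference. The sketch below is an illustration only, not the paper's evaluation protocol. The function names are hypothetical, the codec is whatever callable you supply, and the two toy codecs are stand-ins that do not model MP3 or any streaming codec. Retention is also only a proxy, since the real question is whether a model trained on the decoded audio still fails to learn from it.

```python
from typing import Callable, List, Sequence

Signal = Sequence[float]
Codec = Callable[[Signal], List[float]]


def retention(clean: Signal, cloaked: Signal, codec: Codec) -> float:
    """Fraction of the perturbation left after a codec round trip.

    Projects the post-codec difference onto the original perturbation.
    1.0 means the perturbation passed through unchanged; 0.0 means
    nothing of it remains along its original direction.
    """
    if len(clean) != len(cloaked):
        raise ValueError("clean and cloaked must have the same length")

    # The perturbation is whatever the cloak added to the clean samples.
    delta = [c - x for x, c in zip(clean, cloaked)]
    energy = sum(d * d for d in delta)
    if energy == 0.0:
        raise ValueError("cloaked signal is identical to the clean signal")

    # Run both versions through the same codec and compare the outputs.
    out_clean = codec(clean)
    out_cloaked = codec(cloaked)
    surviving = [b - a for a, b in zip(out_clean, out_cloaked)]

    return sum(s * d for s, d in zip(surviving, delta)) / energy


def passthrough(signal: Signal) -> List[float]:
    """Toy codec that changes nothing (a lossless copy)."""
    return list(signal)


def coarse_quantizer(signal: Signal, step: float = 0.1) -> List[float]:
    """Toy lossy codec: snaps every sample to a coarse grid.

    A stand-in for "a codec that discards small details"; it is not MP3.
    """
    return [round(s / step) * step for s in signal]


if __name__ == "__main__":
    clean = [0.0, 0.5, -0.5, 0.2]
    cloaked = [x + 0.01 for x in clean]  # hypothetical tiny perturbation
    print(retention(clean, cloaked, passthrough))
    print(retention(clean, cloaked, coarse_quantizer))
```

To check a real pipeline, you would replace the toy codecs with a function that runs an actual encode and decode step and returns time-aligned samples. That alignment step is left out here and would need care.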

Does it stop Suno or MusicGen?

No one has shown that it does, and it is worth being precise about why. HarmonyCloak was demonstrated on MuseGAN, SymphonyNet, and MusicLM. It was not tested on Suno and not tested on MusicGen, so any claim that it defeats those specific tools would be unsupported. MusicLM, a modern text-to-music model, is the closest tested analog to today’s commercial products, which at least puts a current-generation generator in the evaluation, but a tested analog is not a tested target. If your concern is specifically Suno or MusicGen, the accurate answer today is that the effect is unknown.

Element	Tested in the paper	Everyday equivalent
MP3 compression	Yes	Standard export and download
Streaming codecs	No	Spotify, YouTube, SoundCloud
MuseGAN, SymphonyNet, MusicLM	Yes	Research generators
Suno, MusicGen	No	Popular named generators
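The same scorecard can be written as a small lookup, which is handy if you want to check a claim against what was actually measured. This is a minimal sketch: the names `Evidence`, `SCORECARD`, and `evidence_for` are hypothetical, and the entries simply restate the table above. Anything not listed is treated as untested, which matches the rule of trusting the cloak only as far as it has been measured.

```python
from enum import Enum


class Evidence(Enum):
    TESTED = "tested in the paper"
    UNTESTED = "not tested in the paper"


# Entries restate the table above, one name per key (lowercased).
SCORECARD = {
    "mp3": Evidence.TESTED,
    "spotify": Evidence.UNTESTED,
    "youtube": Evidence.UNTESTED,
    "soundcloud": Evidence.UNTESTED,
    "musegan": Evidence.TESTED,
    "symphonynet": Evidence.TESTED,
    "musiclm": Evidence.TESTED,
    "suno": Evidence.UNTESTED,
    "musicgen": Evidence.UNTESTED,
}


def evidence_for(target: str) -> Evidence:
    """Return what the paper measured for a codec or generator name.

    Unknown names default to UNTESTED rather than raising, because the
    absence of a result is itself the answer.
    """
    return SCORECARD.get(target.strip().lower(), Evidence.UNTESTED)


if __name__ == "__main__":
    for name in ("MP3", "Suno", "MusicLM", "some future codec"):
        print(f"{name}: {evidence_for(name).value}")
```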

Why is audio the harder case?

Because almost every track reaches a scraper only after lossy transcoding, so a music cloak has to clear a bar its cousins in other media do not always face in the same form. Image cloaks such as Glaze (Shan, Cryan, Wenger, Zheng, Hanocka, Zhao, USENIX Security 2023), poison attacks such as Nightshade (Shan, Ding, Passananti, Wu, Zheng, Zhao, IEEE S&P 2024), and voice cloaks such as AntiFake (Yu, Zhai, Zhang, ACM CCS 2023) all live in the same family of imperceptible perturbations, but a released song is nearly always compressed before anyone trains on it. HarmonyCloak’s psychoacoustic placement is the design choice that answers that specific pressure, which is also why an untested codec is the thing most likely to undo it. And surviving a codec is not the same as surviving an attacker: even where the cloak passes through compression intact, whether it can be actively stripped is the separate question handled by LightShed (Foerster, Behrouzi, Rieger, Jadliwala, Sadeghi, USENIX Security 2025) and De-AntiFake (Fan, Chen, Liu, Zhang, Yu, ICML 2025), covered in “Does music AI-protection actually work?”

The safe reading is to trust HarmonyCloak exactly as far as it has been measured. It survives MP3 by construction, which is more than most audio perturbations can say, and it degrades what research generators learn. It has not been shown to survive streaming codecs, and it has not been shown to touch Suno or MusicGen, so build your plan around MP3-grade protection plus keeping your cleanest masters private, rather than around defeating a named generator. For that routine, see “How to protect my music from AI training,” and for the wider landscape, “Do AI poisoning tools actually work?”

Sources

Meerza, Sun, Liu (2025). HarmonyCloak: Making Music Unlearnable for Generative AI. IEEE S&P 2025.
Shan, Cryan, Wenger, Zheng, Hanocka, Zhao (2023). GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models. USENIX Security 2023.
Shan, Ding, Passananti, Wu, Zheng, Zhao (2024). Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models. IEEE S&P 2024.
Yu, Zhai, Zhang (2023). AntiFake: Using Adversarial Audio to Prevent Unauthorized Speech Synthesis. ACM CCS 2023.
Foerster, Behrouzi, Rieger, Jadliwala, Sadeghi (2025). LightShed: Defeating Perturbation-based Image Copyright Protections. USENIX Security 2025.
Fan, Chen, Liu, Zhang, Yu (2025). De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks. ICML 2025.