Slicing by Onsets in the Spectral Domain

Onset is a catch-all term (or at least catch-a-lot) for some kind of detectable change that marks a transition from one event to another. It's tempting to treat an onset as the same thing as a transient, but that's only really true for quite percussive events. Consider a passage of smoothly sung vowels: we might want to find boundaries between the vowels, but there may well be no transient to help us.

Using FFT processing to look for onsets can be helpful for sounds that are difficult for a time-domain envelope follower to handle. This might apply to sounds without a clearly varying amplitude envelope, or to very dense or polyphonic sounds. On the other hand, there is a trade-off in temporal accuracy, determined by your FFT settings (chiefly the window and hop sizes). These algorithms can only tell you which window something happened in, not where in that window.
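To make this concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not how any particular onset slicer is implemented: the spectral measure is half-wave rectified spectral flux (one common choice among several), the frames are rectangular and non-overlapping, and the sample rate, frame size, and test signal (two equal-amplitude sines joined at sample 280) are made up for the example. All function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed for this toy example)
FRAME_SIZE = 64     # samples per frame; hop == frame size, so no overlap
CHANGE_AT = 280     # sample index where the pitch changes (assumed)


def make_signal(length=512):
    """Equal-amplitude sines: 500 Hz before CHANGE_AT, 1500 Hz after."""
    return [
        math.sin(2 * math.pi * (500 if n < CHANGE_AT else 1500) * n / SAMPLE_RATE)
        for n in range(length)
    ]


def split_frames(signal):
    """Cut the signal into consecutive, non-overlapping frames."""
    last_start = len(signal) - FRAME_SIZE
    return [signal[i:i + FRAME_SIZE] for i in range(0, last_start + 1, FRAME_SIZE)]


def rms(frame):
    """Time-domain level of one frame (what an envelope follower tracks)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitudes(frame):
    """Magnitude spectrum via a naive DFT (slow, but dependency-free)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(frames):
    """Sum of positive bin-by-bin magnitude increases between frames."""
    mags = [magnitudes(f) for f in frames]
    flux = [0.0]  # the first frame has no predecessor to compare against
    for prev, cur in zip(mags, mags[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


frames = split_frames(make_signal())
levels = [rms(f) for f in frames]
flux = spectral_flux(frames)

# Pick the frame with the largest spectral change as the onset.
onset_frame = max(range(len(flux)), key=flux.__getitem__)

# The detector can only report the time span of that frame.
window_start = onset_frame * FRAME_SIZE / SAMPLE_RATE
window_end = (onset_frame + 1) * FRAME_SIZE / SAMPLE_RATE

print("RMS per frame:", [round(v, 3) for v in levels])
print("Flux per frame:", [round(v, 1) for v in flux])
print(f"Onset somewhere in {window_start:.3f}s to {window_end:.3f}s")
```

In this toy signal the level is the same in every frame, so an amplitude-based detector has nothing to react to, whereas the flux peaks in the frame that contains the pitch change. The reported onset is the span of that whole frame, not sample 280 itself: a smaller frame (or hop) would narrow the span, at the cost of coarser frequency resolution.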

Because onset detection works at the fairly short timescale of FFT frames, it is best suited to slicing sounds at a fairly fine-grained level, i.e. between successive events or notes, or even finer. If you need to slice a sound into longer chunks, you might try slicing by novelty instead.