I know it's done on the fly. I would imagine it uses RAM: it transcodes the PCM into what is essentially a file (data in RAM), then plays back the transcoded (encoded) data instead of the PCM.
In GraphEdit, this would look like: capture PCM from device > encode the PCM to a file (or in RAM) > play back the file (or RAM data) to the bitstream device.
The older Creative X-Fi did it this way: capture speakers > transcode PCM to DTS > send to SPDIF.
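
To make the capture > encode > playback chain concrete, here is a rough Python sketch of how I picture it. Everything in it is an assumption for illustration: `capture_pcm`, `encode_frame`, `run_pipeline` and the 512-sample frame size are made-up names and values, the "encoder" is only a placeholder that tags and packs the samples (it is not DTS), and a plain list stands in for the bitstream device.

```python
import struct
from collections import deque

FRAME_SAMPLES = 512  # hypothetical frame size (samples per encoded frame)


def capture_pcm(num_frames):
    """Stand-in for the capture device: yields frames of 16-bit PCM samples."""
    for n in range(num_frames):
        yield [(n * FRAME_SAMPLES + i) % 32768 for i in range(FRAME_SAMPLES)]


def encode_frame(samples):
    """Placeholder encoder (NOT real DTS): a 4-byte tag plus the packed samples."""
    return b"FAKE" + struct.pack("<%dh" % len(samples), *samples)


def run_pipeline(num_frames):
    """Capture PCM, encode each frame into RAM, then hand the encoded frame on."""
    ram_buffer = deque()  # encoded frames live here, never written to disk
    sent = []             # stands in for the bitstream (SPDIF) output
    for pcm in capture_pcm(num_frames):
        ram_buffer.append(encode_frame(pcm))
        # Playback takes the encoded frame from RAM, not the original PCM.
        sent.append(ram_buffer.popleft())
    return sent


frames = run_pipeline(3)
print(len(frames), len(frames[0]))
```

The point is only that the "file" can be a small buffer of encoded frames in RAM that is drained as fast as it fills. A real driver would presumably encode in fixed-size frames and keep a few queued so the output never starves.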