Redirecting audio / creating alternate sound paths in Android

jpalm · Jan 9, 2014 · Viewed 23.1k times

Does anyone have experience (using OpenSL ES, ALSA, etc.) with redirecting audio or creating new sound paths in Android? The end goal is to create a virtual microphone to replace the external microphone, where one can play audio files as if they were speaking into the microphone. Applications accessing the microphone with AudioSource.MIC should use this alternate stream. It doesn't need to work with voice calls; I believe that would be harder to achieve, since call audio is handled entirely within the radio.

Any ideas on where to begin? I've done some research with OpenSL and ALSA, but it looks like I'll need to package new firmware (ROM) in order to define custom audio paths. I'd prefer an application-level solution if a custom ROM can be avoided. The phones are 'rooted' (have su binaries). The target device for this is the Samsung Galaxy S4 Google Edition (GT-i9505G). Specifically I'm looking for audio driver configurations / source code or any references for the i9505G.

Thanks in advance!

edit - I've checked out the CyanogenMod 10.2 source tree, along with the jfltexx drivers and kernel. Here are the contents of kernel/samsung/jf/sound: http://pastebin.com/7vK8THcZ. Is this documented anywhere?

Answer

Michael · Jan 19, 2014

I once implemented the functionality you're after on a phone based on Qualcomm's APQ8064 platform (which seems to be nearly the same platform as the one in your target device). Below is a summary of what I can recall from this, as I no longer have access to the code I wrote, or an environment where I can easily do these kinds of modifications. So if this answer reads like a mess of fragmentary memories, that's because that's exactly what it is.

This info may also apply more-or-less to other Qualcomm platforms (like the MSM8960 or MSM8974), but will most likely be completely useless for platforms from other vendors (NVidia Tegra, Samsung Exynos, TI OMAP, etc).

A brief note: The method I used means that the audio that the recording application gets will have gone through mixing / volume control in the Android multimedia framework and/or the platform's multimedia DSP. So if you're playing something at 75% volume, recording it, and then playing back the recording at 75% volume it might end up sounding pretty quiet. If you want to get unprocessed PCM data (after decoding, but before mixing / volume control) you'll have to look into some other approach, e.g. customizing the AudioFlinger, but that's not something I've tried or can provide info on.


A few locations of interest:

The platform's audio drivers. Particularly the msm-pcm-routing.c file.

The ALSA UCM (Use-Case Manager) settings file. This is just an example UCM settings file. There are many variants of these files depending on the exact platform used, so yours may have a slightly different name (though it should start with snd_soc_msm_), and its contents will probably also differ slightly from the one I linked to.
NOTE for Kitkat and later: The UCM settings files were used on Jellybean (and possibly ICS). My understanding is that these settings have been moved to a file named mixer_paths.xml on Kitkat. The contents are pretty much the same, just in a different format.

The audio HAL code. The ALSA UCM is present in libalsa-intf, and the AudioHardware / AudioPolicyManager / ALSADevice code is present in audio-alsa. Note that this code is for Jellybean, since that's the latest version that I'm familiar with. The directory structure (and possibly some of the files / classes) differs on Kitkat.

If you open up the UCM settings file and search for "HiFiPROXY Rx" you'll find something like this:

SectionVerb
Name "HiFiPROXY Rx"

EnableSequence
    'AFE_PCM_RX Audio Mixer MultiMedia1':1:1
EndSequence

DisableSequence
    'AFE_PCM_RX Audio Mixer MultiMedia1':1:0
EndSequence

# ALSA PCMs
CapturePCM 0
PlaybackPCM 0
EndSection

This defines a verb with the name "HiFiPROXY Rx". A verb is essentially the basis of an audio use-case; there are also modifiers that can be applied on top of verbs for things like simultaneous playback and recording. In the name, the HiFi moniker is used for most non-voice-call verbs, PROXY refers to the audio device used, and Rx means output. The EnableSequence and DisableSequence specify which ALSA control(s) to write to, and what to write to them, when the use-case is enabled or disabled. Finally, the verb lists the ALSA PCM playback / capture devices to use in this use-case. For example, PlaybackPCM 0 means that playback device 0 should be used (the ALSA card is implied to be the one that represents the built-in hardware codec, which is typically card 0). These verbs are selected by the audio HAL based on the use-case (music playback, voice call, recording, ...), which accessories you've got attached, etc.
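To make the structure of a verb a bit more concrete, here is a rough Python sketch that reads a section like the one above into a dictionary. It only assumes the layout seen in the excerpt (Name, EnableSequence / DisableSequence terminated by EndSequence, and the CapturePCM / PlaybackPCM lines); the function name parse_verb is made up for illustration, and real UCM files may contain keywords this sketch ignores. The fields after the control name are kept as raw strings rather than interpreted.

```python
import re

# The verb excerpt from the UCM settings file, used here as sample input.
SAMPLE = """SectionVerb
Name "HiFiPROXY Rx"

EnableSequence
    'AFE_PCM_RX Audio Mixer MultiMedia1':1:1
EndSequence

DisableSequence
    'AFE_PCM_RX Audio Mixer MultiMedia1':1:0
EndSequence

# ALSA PCMs
CapturePCM 0
PlaybackPCM 0
EndSection
"""

# A control line looks like 'Control Name':field:field
CONTROL_RE = re.compile(r"^'([^']+)':(.+)$")


def parse_verb(text):
    """Parse one verb section (hypothetical helper, layout assumed from the excerpt)."""
    verb = {"name": None, "enable": [], "disable": [], "pcm": {}}
    current = None  # list that control lines are currently appended to
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("Name "):
            verb["name"] = line[len("Name "):].strip().strip('"')
        elif line == "EnableSequence":
            current = verb["enable"]
        elif line == "DisableSequence":
            current = verb["disable"]
        elif line == "EndSequence":
            current = None
        elif line.startswith(("CapturePCM", "PlaybackPCM")):
            key, value = line.split()
            verb["pcm"][key] = int(value)
        elif current is not None:
            match = CONTROL_RE.match(line)
            if match:
                # Keep the trailing fields uninterpreted, e.g. ["1", "1"]
                current.append((match.group(1), match.group(2).split(":")))
    return verb


if __name__ == "__main__":
    print(parse_verb(SAMPLE))
```

Running this on the excerpt shows the same control being written in both sequences, with only the last field differing between enable and disable.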


If you look up "AFE_PCM_RX Audio Mixer" in the msm_qdsp6_widgets table in msm-pcm-routing.c you'll see that it refers to a list of mixer controls named afe_pcm_rx_mixer_controls that looks like this:

static const struct snd_kcontrol_new afe_pcm_rx_mixer_controls[] = {
    SOC_SINGLE_EXT("MultiMedia1", MSM_BACKEND_DAI_AFE_PCM_RX,
    MSM_FRONTEND_DAI_MULTIMEDIA1, 1, 0, msm_routing_get_audio_mixer,
    msm_routing_put_audio_mixer),
    SOC_SINGLE_EXT("MultiMedia2", MSM_BACKEND_DAI_AFE_PCM_RX,
    ... and so on...

This lists the front end DAIs that you are allowed to connect to the back end DAI (AFE_PCM_RX). To get an idea of how these relate to one another, see these diagrams.
AFE_PCM_RX and AFE_PCM_TX are a pair of DAIs on some of Qualcomm's platforms that implement a sort of dummy/proxy device. What you do is feed audio into AFE_PCM_RX, which then gets processed by the multimedia DSP (QDSP), and then you can read it back through AFE_PCM_TX. This is used to implement USB and WiFi audio routing, and also A2DP IIRC.

Back to the AFE_PCM_RX Audio Mixer MultiMedia1 line: This says that you're feeding MultiMedia1 into the AFE_PCM_RX Audio Mixer. MultiMedia1 is used for normal playback/recording, and corresponds to pcmC0D0 (you should be able to list the devices on your phone with adb shell cat /proc/asound/devices). There are other front end DAIs, like MultiMedia3 and MultiMedia5 that are used in special cases like low-latency playback and low-power audio playback.
When you feed MultiMedia1 to the AFE_PCM_RX Audio Mixer everything you write to playback device 0 on card 0 will be fed into the AFE_PCM_RX back end DAI. To read it back you could set up a UCM verb that does something like 'MultiMedia1 Mixer AFE_PCM_TX':1:1, and then you'd read from pcmC0D0c (which should be the default ALSA capture device).
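If you want to map the output of /proc/asound/devices to names like pcmC0D0p / pcmC0D0c, something along these lines might help. It's only a sketch: the SAMPLE text is made-up input in the usual format of that file (not taken from an i9505G), and pcm_nodes is a hypothetical helper name. On a real phone you'd feed it the output of adb shell cat /proc/asound/devices instead.

```python
import re

# Made-up sample in the usual /proc/asound/devices layout (not from a real phone).
SAMPLE = """\
  0: [ 0]   : control
 16: [ 0- 0]: digital audio playback
 24: [ 0- 0]: digital audio capture
 33:        : timer
"""

# Matches lines such as " 16: [ 0- 0]: digital audio playback"
LINE_RE = re.compile(
    r"^\s*(\d+):\s*\[\s*(\d+)-\s*(\d+)\]:\s*digital audio (playback|capture)\s*$"
)


def pcm_nodes(devices_text):
    """Return PCM device node names (e.g. pcmC0D0p) found in the given text."""
    nodes = []
    for line in devices_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # control, timer, etc. are not PCM devices
        card = int(match.group(2))
        device = int(match.group(3))
        suffix = "p" if match.group(4) == "playback" else "c"
        nodes.append("pcmC%dD%d%s" % (card, device, suffix))
    return nodes


if __name__ == "__main__":
    for node in pcm_nodes(SAMPLE):
        print(node)
```

In the sample, card 0 / device 0 shows up once for playback and once for capture, which is the pair of nodes that MultiMedia1 is described as corresponding to above.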


A simple test would be to pull the UCM settings file from your phone (should be located somewhere under /system/etc/) and amend the "HiFi" verb's EnableSequence with something like:

'AFE_PCM_RX Audio Mixer MultiMedia1':1:1
'AFE_PCM_RX Audio Mixer MultiMedia3':1:1
'AFE_PCM_RX Audio Mixer MultiMedia5':1:1

(and similarly in the DisableSequence, but with :1:0 at the end of each line).
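Editing the file by hand works fine, but if you end up doing it for several verbs a small script could save some typing. The following is a rough sketch, assuming the verb layout matches the "HiFiPROXY Rx" excerpt earlier; amend_verb and routing_lines are made-up helper names, and the existing control line in SAMPLE is just an illustrative placeholder, not the real contents of the "HiFi" verb.

```python
# Illustrative stand-in for a "HiFi" verb; the existing control line is a placeholder.
SAMPLE = """SectionVerb
Name "HiFi"

EnableSequence
    'SLIMBUS_0_RX Audio Mixer MultiMedia1':1:1
EndSequence

DisableSequence
    'SLIMBUS_0_RX Audio Mixer MultiMedia1':1:0
EndSequence

EndSection
"""


def routing_lines(mixer, front_ends, enable):
    """Build one control line per front end DAI, ending in :1:1 or :1:0."""
    value = 1 if enable else 0
    return ["'%s %s':1:%d" % (mixer, fe, value) for fe in front_ends]


def amend_verb(ucm_text, verb_name, mixer, front_ends):
    """Append routing lines to the Enable/DisableSequence of the named verb."""
    out = []
    in_verb = False   # True while inside the section with the requested name
    sequence = None   # "EnableSequence", "DisableSequence", or None
    for line in ucm_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Name "):
            in_verb = stripped == 'Name "%s"' % verb_name
        elif stripped in ("EnableSequence", "DisableSequence"):
            sequence = stripped
        elif stripped == "EndSequence":
            if in_verb and sequence is not None:
                enable = sequence == "EnableSequence"
                # Insert the new lines just before the EndSequence line
                out.extend("    " + new for new in routing_lines(mixer, front_ends, enable))
            sequence = None
        elif stripped == "EndSection":
            in_verb = False
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(amend_verb(SAMPLE, "HiFi", "AFE_PCM_RX Audio Mixer",
                     ["MultiMedia1", "MultiMedia3", "MultiMedia5"]))
```

The name comparison includes the quotes, so asking for "HiFi" won't accidentally touch "HiFi Low Power"; you'd call the function once per verb you want to change.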

Then go to the "Capture Music" modifier (this is the poorly named modifier for normal recording) and change SLIM_0_TX to AFE_PCM_TX.
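The same kind of script could handle the modifier change. This sketch swaps one back end name for another, but only inside the section with the given name. I'm assuming here that modifiers are written as SectionModifier ... EndSection blocks with a Name line, like the verbs, and that the modifier's control lines mention SLIM_0_TX; the SAMPLE contents and the retarget_section name are invented for illustration, so check them against your own file.

```python
# Invented sample; the real "Capture Music" modifier will likely contain more lines.
SAMPLE = """SectionModifier
Name "Capture Music"

EnableSequence
    'MultiMedia1 Mixer SLIM_0_TX':1:1
EndSequence

DisableSequence
    'MultiMedia1 Mixer SLIM_0_TX':1:0
EndSequence

EndSection

SectionModifier
Name "Other Modifier"

EnableSequence
    'MultiMedia1 Mixer SLIM_0_TX':1:1
EndSequence

EndSection
"""


def retarget_section(ucm_text, section_name, old, new):
    """Replace `old` with `new`, but only within the section named section_name."""
    out = []
    in_section = False
    for line in ucm_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Name "):
            in_section = stripped == 'Name "%s"' % section_name
        elif stripped == "EndSection":
            in_section = False
        out.append(line.replace(old, new) if in_section else line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(retarget_section(SAMPLE, "Capture Music", "SLIM_0_TX", "AFE_PCM_TX"))
```

Other sections that also reference SLIM_0_TX are left alone, which is probably what you want while testing.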

Copy your modified UCM settings file back to the phone (requires root permission), and reboot the phone. Then start some playback (have a wired headset/headphone attached, and disable touch sounds so that the low-latency verb doesn't get selected), and start a recording from AudioSource.MIC. Afterwards, check the recording and see if you were able to record the playback audio. If not, then perhaps the low-power audio verb was selected, and you'll have to modify the "HiFi Low Power" verb in the same way as you did with the "HiFi" verb. It will help if you have all the debug prints enabled in the audio HAL (i.e. uncomment #define LOG_NDEBUG 0 in all the cpp files where you can find it) so that you can see which UCM verbs / modifiers get selected.


The modification I described above gets a bit tedious since you have to cover all the MultiMedia front end DAIs for all relevant verbs and modifiers.
IIRC, I was able to simplify this into just a single line per verb/modifier:

'AFE_PCM_RX Port Mixer SLIM_0_RX':1:1

If you look at the "HiFi", "HiFi Low Power", "HiFi Lowlatency" verbs you'll see that they all use the SLIMBUS_0_RX back end DAI, so I'm taking advantage of that by using the AFE_PCM_RX Port Mixer which lets me set up a connection from a back end DAI to another back end DAI. If you look at the afe_pcm_rx_port_mixer_controls and intercon tables in msm-pcm-routing.c you'll notice that there's no SLIM_0_RX entry for AFE_PCM_RX Port Mixer, so you'd have to add those yourself (it's just a matter of copy-pasting some of the existing lines and changing the names).


Some of the other changes you'd probably have to make:

  • In frameworks/base and frameworks/av (e.g. AudioManager, AudioService, AudioSystem) you'd have to add a new AudioSource constant and make sure that it gets recognized in all the necessary places.

  • In the UCM settings file you'd have to add some new verbs / modifiers to set up the ALSA controls properly when your new AudioSource is used.

  • In the audio HAL you'd have to make some changes so that your new verbs / modifiers get selected when your new AudioSource is used. Note that there's a base class of AudioPolicyManagerALSA called AudioPolicyManagerBase which you also might have to modify (it's located elsewhere in the source tree).