Audio streaming via TCP socket on Android

tritop · Jun 3, 2014 · Viewed 9.4k times

I am streaming mic input from a C server via a socket. I know the stream works, because it does with a C client, and I am getting the right values on my Android client.

I am streaming a 1024-element float array. One float is 4 bytes, so I get an incoming stream of 4096 bytes per frame. I extract the floats from those bytes, and I know they are the ones I sent, so that part should work.
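For comparison, here is a minimal sketch of that byte-to-float step using java.nio.ByteBuffer (my assumption: the server writes IEEE 754 floats in little-endian order, which matches the manual bit shifting in the code further down):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Decode one 4096-byte frame into 1024 floats.
    float[] decodeFrame(byte[] frame) {
        ByteBuffer bb = ByteBuffer.wrap(frame).order(ByteOrder.LITTLE_ENDIAN);
        float[] samples = new float[frame.length / 4];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = bb.getFloat();   // consumes 4 bytes per call
        }
        return samples;
    }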

Now I want to send that stream directly to the phone's speakers using AudioTrack. I tried feeding the bytes I receive into it directly: just noise. I tried casting it back to a byte array: still the same. I tried casting the floats to shorts (because AudioTrack takes bytes or shorts). I could make out something that might have been my mic input (knocking), but very scratchy and extremely laggy. I would understand a lag between the frames, but I can't even get one clear sound. I can, however, clearly play a sine tone that I generate locally and put into that short array. Now I wonder whether there are issues in my code that any of you can see, because I don't see them.

What I am doing is: I put 4 bytes into a byte array and get the float out of them. As soon as I have one frame in my float array (I track that with a boolean - not nice, but it should work), I copy it into my short array and let AudioTrack play it. This double conversion might be slow, but I do it because it is the closest I have gotten to playing the actual input.
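And a minimal sketch of that float-to-short playback step as described above (assumptions: the samples are already normalized to [-1, 1] and audioTrack was created with ENCODING_PCM_16BIT in MODE_STREAM):

    // Convert one frame of normalized floats (-1..1) to 16-bit PCM and play it.
    void playFrame(AudioTrack audioTrack, float[] samples) {
        short[] pcm = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            // clamp first so values slightly outside [-1, 1] cannot overflow
            float s = Math.max(-1f, Math.min(1f, samples[i]));
            pcm[i] = (short) (s * Short.MAX_VALUE);
        }
        audioTrack.write(pcm, 0, pcm.length);   // blocks until the data is queued
    }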

Edit: I checked the endianness by comparing the floats; they have the proper values between -1 and 1 and are the same ones I send. Since I don't change the endianness when converting to float, I don't get why forwarding a 4096-byte array to AudioTrack directly doesn't work either. There might be something wrong with the multithreading, but I don't see what it could be.
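For later readers, a small illustration of why writing the raw float bytes straight into a 16-bit PCM AudioTrack produces noise (the accepted answer below reaches the same conclusion):

    // 0.5f is stored as the IEEE 754 bit pattern 0x3F000000; written to a
    // 16-bit PCM track, those 4 little-endian bytes are read as the samples
    // 0x0000 and 0x3F00, which are unrelated to the original waveform.
    int bits = Float.floatToIntBits(0.5f);
    System.out.println(Integer.toHexString(bits));   // prints "3f000000"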

Edit 2: I discovered a minor problem - I reset j at 1023. But that one missing float should not have been the problem. Other than that, I moved the method that reads the stream from the socket into its own thread instead of calling it in an AsyncTask. That made it work; I can now make out the mic sounds. The quality is still very poor, though - might there be a reason for that in the code? I also have a delay of about 10 seconds, of which only about half a second is caused by the WLAN, so I wonder if the code is at fault. Any further thoughts are appreciated.

Edit 3: I played around with the code and implemented a few of greenapps' ideas from the comments. With the new thread structure I faced the problem of not getting any sound at all. I don't get how that is even possible, so I switched back. Other things I tried to make the threads more lightweight didn't have any effect. I still get a delay and very poor quality (I can identify knocks, but I can't understand voices). I figured something might be wrong with my conversions, so I put the bytes I receive from the socket directly into AudioTrack - nothing but ugly, pulsing static noise. Now I am even more confused, since this exact stream still works with the C client. I will report back if I find a solution, but any help is still welcome.

Edit 4: I should add that I can play mic input from another Android app where I send that input directly as bytes (I exclude the float conversion and feed the bytes I receive straight into AudioTrack in my player code).
It also occurred to me that it could be a problem that the float array streamed by the C server comes from a 64-bit machine while the phone is 32-bit. Could that be a problem somehow, even though I am just streaming floats as 4 bytes? Or, another thought of mine: the underlying number format of the bytes I receive is float. What format does AudioTrack expect? Even if I feed in just bytes - would I need to cast that float to an int and cast that back to bytes or something?
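For reference: the byte[] and short[] write() overloads of AudioTrack both expect PCM in whatever encoding the track was constructed with (ENCODING_PCM_16BIT here), not raw IEEE 754 float bytes. On API 21+ there is also AudioFormat.ENCODING_PCM_FLOAT together with a float[] write() overload; a minimal sketch of that path (assuming a mono 44.1 kHz stream - an alternative, not what the code below uses):

    // Requires API 21+; uses android.media.{AudioTrack, AudioFormat, AudioManager}.
    int bufSize = AudioTrack.getMinBufferSize(44100,
            AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_FLOAT);
    AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
            AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_FLOAT,
            bufSize, AudioTrack.MODE_STREAM);
    track.play();
    // 'frame' is one decoded frame of 1024 floats in [-1, 1]
    track.write(frame, 0, frame.length, AudioTrack.WRITE_BLOCKING);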

new code:

public class PCMSocket {

AudioTrack audioTrack;
boolean doStop = false;
int musicLength = 4096;
byte[] music;
Socket socket;
short[] buffer = new short[4096];
float[] fmusic = new float[1024];
WriteToAudio writeThread;
ReadFromSocket readThread;


public PCMSocket()
{

}

public void start()
{
    doStop = false;
    readThread = new ReadFromSocket();
    readThread.start();
}

public class ReadFromSocket extends Thread
{       
    public void run()
    {
    doStop=true;

    InetSocketAddress address = new InetSocketAddress("xxx.xxx.xxx.x", 8000);

    socket = new Socket();
    int timeout = 6000;   
    try {
        socket.connect(address, timeout);
    } catch (IOException e2) {
        e2.printStackTrace();
    }

     musicLength = 1024;

    InputStream is = null;

    try {
        is = socket.getInputStream();
    } catch (IOException e) {
        e.printStackTrace();
    }

    BufferedInputStream bis = new BufferedInputStream(is);
    DataInputStream dis = new DataInputStream(bis);     

    try{

    int minSize =AudioTrack.getMinBufferSize( 44100, AudioFormat.CHANNEL_CONFIGURATION_STEREO, AudioFormat.ENCODING_PCM_16BIT ); 

    audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
            AudioFormat.CHANNEL_OUT_STEREO, 
            AudioFormat.ENCODING_PCM_16BIT, minSize,
            AudioTrack.MODE_STREAM);
        audioTrack.play();

      } catch (Throwable t)
      {
          t.printStackTrace();
        doStop = true;
      }

    writeThread = new WriteToAudio();
    writeThread.start();

    int i = 0;   
    int j=0;

    try {
        if(dis.available()>0)Log.d("PCMSocket", "receiving");
        music = new byte[4];
        // NOTE: available() only reports bytes already buffered locally, so this
        // loop ends as soon as the network is momentarily idle
        while (dis.available() > 0)
        {
            music[i]=0;
            music[i] = dis.readByte(); 

            if(i==3)
            {
                int asInt = 0;
                // assemble 4 little-endian bytes into an IEEE 754 bit pattern
                asInt = ((music[0] & 0xFF) << 0) 
                        | ((music[1] & 0xFF) << 8) 
                        | ((music[2] & 0xFF) << 16) 
                        | ((music[3] & 0xFF) << 24);
                float asFloat = 0;
                asFloat = Float.intBitsToFloat(asInt);
                fmusic[j]=asFloat;
            }

            i++;
            j++;
            if(i==4)
            {
                music = new byte[4]; 
                i=0;
            }
            if(j==1024)
            {
                j=0;
                if(doStop)doStop=false;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }

    try {
        dis.close();
    } catch (IOException e) {
        e.printStackTrace();
    }  

    }
};


public class WriteToAudio extends Thread
{       
    public void run()
    {
        while(true){
        // busy-waits until the reader clears doStop, then writes the current
        // frame once and sets doStop again
        while(!doStop)
        {           
            try{
                writeSamples(fmusic);

            }catch(Exception e)
            {
                e.printStackTrace();
            }    
            doStop = true;
        }
        }
    }
};


public void writeSamples(float[] samples) 
{   
   fillBuffer( samples );
   audioTrack.write( buffer, 0, samples.length );
}

private void fillBuffer( float[] samples )
{ 
   if( buffer.length < samples.length )
      buffer = new short[samples.length];

   for( int i = 0; i < samples.length; i++ )
   {
      buffer[i] = (short)(samples[i] * Short.MAX_VALUE);
   }
}   


}

old code:

public class PCMSocket {
AudioTrack audioTrack;
WriteToAudio thread;
boolean doStop = false;
int musicLength = 4096;
byte[] music;
Socket socket;
short[] buffer = new short[4096];
float[] fmusic = new float[1024];


public PCMSocket()
{

}

public void start()
{
    doStop = false;
    new GetStream().executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);
}

private class GetStream extends AsyncTask<Void, Void, Void> {

    @Override
    protected Void doInBackground(Void... values) { 
        PCMSocket.this.getSocket();
        return null;

    }

    @Override
    protected void onPreExecute() {
    }



    @Override
    protected void onPostExecute(Void result)
    {
        return;
    }

    @Override
    protected void onProgressUpdate(Void... values) {
    }
}

private void getSocket()
{
    doStop=true;

    InetSocketAddress address = new InetSocketAddress("xxx.xxx.xxx.x", 8000);

    socket = new Socket();
    int timeout = 6000;   
    try {
        socket.connect(address, timeout);
    } catch (IOException e2) {
        e2.printStackTrace();
    }

     musicLength = 1024;

    InputStream is = null;

    try {
        is = socket.getInputStream();
    } catch (IOException e) {
        e.printStackTrace();
    }

    BufferedInputStream bis = new BufferedInputStream(is);
    DataInputStream dis = new DataInputStream(bis);     

    try{

    int minSize =AudioTrack.getMinBufferSize( 44100, AudioFormat.CHANNEL_CONFIGURATION_STEREO, AudioFormat.ENCODING_PCM_16BIT ); 

    audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
            AudioFormat.CHANNEL_OUT_STEREO, 
            AudioFormat.ENCODING_PCM_16BIT, minSize,
            AudioTrack.MODE_STREAM);
        audioTrack.play();

      } catch (Throwable t)
      {
          t.printStackTrace();
        doStop = true;
      }

    thread = new WriteToAudio();
    thread.start();

    int i = 0;   
    int j=0;

    try {
        if(dis.available()>0)Log.d("PCMSocket", "receiving");
        music = new byte[4];
        while (dis.available() > 0)
        {
            music[i]=0;
            music[i] = dis.readByte(); 

            if(i==3)
            {
                int asInt = 0;
                asInt = ((music[0] & 0xFF) << 0) 
                        | ((music[1] & 0xFF) << 8) 
                        | ((music[2] & 0xFF) << 16) 
                        | ((music[3] & 0xFF) << 24);
                float asFloat = 0;
                asFloat = Float.intBitsToFloat(asInt);
                fmusic[j]=asFloat;
            }

            i++;
            j++;
            if(i==4)
            {
                music = new byte[4]; 
                i=0;
            }
            if(j==1023)
            {
                j=0;
                if(doStop)doStop=false;
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }

    try {
        dis.close();
    } catch (IOException e) {
        e.printStackTrace();
    }  

}


public class WriteToAudio extends Thread
{       
    public void run()
    {
        while(true){
        while(!doStop)
        {           
            try{
                writeSamples(fmusic);

            }catch(Exception e)
            {
                e.printStackTrace();
            }    
            doStop = true;
        }
        }
    }
};


public void writeSamples(float[] samples) 
{   
   fillBuffer( samples );
   audioTrack.write( buffer, 0, samples.length );
}

private void fillBuffer( float[] samples )
{ 
   if( buffer.length < samples.length )
      buffer = new short[samples.length*4];

   for( int i = 0; i < samples.length; i++ )
   {
      buffer[i] = (short)(samples[i] * Short.MAX_VALUE);
   }
}   


}

Answer

tritop · Jun 5, 2014

Sooo... I just solved this only hours after I desperately put a bounty on it, but that's worth it.

I decided to start over. For the thread design I took some help from this awesome project; it helped me a lot. Now I use only one thread. It seems like the main point was the casting, but I am not too sure - it may also have been the multithreading. I don't know what kind of bytes AudioTrack's byte[] write() expects, but certainly not float bytes. So I knew I needed to use the short[] version. What I did was:
- put the bytes into a byte[]
- take 4 of them at a time and convert them to a float in a loop
- convert each float to a short

Since I had already done that before, I am not too sure what the problem was. But now it works. I hope this can help someone who goes through the same pain as me. Big thanks to all of you who participated and commented.

Edit: I just thought about the changes and figured that using CHANNEL_CONFIGURATION_STEREO instead of MONO earlier contributed a lot to the stuttering. So you might want to try that first if you encounter this problem. Still, for me it was only part of the solution; changing just that didn't help.

    static final int frequency = 44100;
    static final int channelConfiguration = AudioFormat.CHANNEL_CONFIGURATION_MONO;
    static final int audioEncoding = AudioFormat.ENCODING_PCM_16BIT;
    boolean isPlaying;
    int playBufSize;
    Socket socket;
    AudioTrack audioTrack;

    // (these two statements have to run inside a method, e.g. onCreate)
    playBufSize=AudioTrack.getMinBufferSize(frequency, channelConfiguration, audioEncoding);
    audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, frequency, channelConfiguration, audioEncoding, playBufSize, AudioTrack.MODE_STREAM);

    new Thread() {
        byte[] buffer = new byte[4096];
        public void run() {
            try { 
                socket = new Socket(ip, port); 
            }
            catch (Exception e) {
                e.printStackTrace();
            }
            audioTrack.play();
            isPlaying = true;
            while (isPlaying) {
                int readSize = 0;
                // read() may return fewer than 4096 bytes; readSize is not checked
                // below, so a short read converts stale bytes at the end of buffer
                try { readSize = socket.getInputStream().read(buffer); }
                catch (Exception e) {
                    e.printStackTrace();
                }
                short[] sbuffer = new short[1024];   // 4096 bytes -> 1024 samples
                for(int i = 0; i < buffer.length; i++)
                {

                    int asInt = 0;
                    asInt = ((buffer[i] & 0xFF) << 0) 
                            | ((buffer[i+1] & 0xFF) << 8) 
                            | ((buffer[i+2] & 0xFF) << 16) 
                            | ((buffer[i+3] & 0xFF) << 24);
                    float asFloat = 0;
                    asFloat = Float.intBitsToFloat(asInt);
                    int k = i / 4;   // every 4 input bytes yield one output sample
                    sbuffer[k] = (short)(asFloat * Short.MAX_VALUE);

                    i=i+3;   // together with the loop's i++, advance 4 bytes per sample
                }
                audioTrack.write(sbuffer, 0, sbuffer.length);
            }  
            audioTrack.stop();
            try { socket.close(); }
            catch (Exception e) { e.printStackTrace(); }
        }
    }.start();
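One caveat with the read loop above: InputStream.read() may return fewer than 4096 bytes, and readSize is not checked before converting, so a short read reuses stale bytes at the end of the buffer. A minimal sketch of one way to guarantee whole frames (my own variation, not part of the original answer):

    // Hypothetical variant: block until a whole 4096-byte frame has arrived.
    // (Needs the same IOException handling as the surrounding code.)
    DataInputStream in = new DataInputStream(socket.getInputStream());
    byte[] frame = new byte[4096];
    in.readFully(frame);   // throws EOFException if the stream ends mid-frame
    // ...then run the same byte -> float -> short conversion over 'frame'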