How to capture voice in raw format from a mic in Linux

madper · Mar 20, 2012 · Viewed 7.5k times

I'm writing a speech recognition program with CMU Sphinx. It needs a .raw audio file as input. How can I capture voice from my mic in raw format? I have googled for this, and the advice is that I could read from /dev/dsp, but I can't find that file/device. I'm on Arch Linux with ALSA, Linux version 3.2.9-1-pae.

madper@myhost /dev % ls
agpgart             ptmx      tty23  tty58    vcs28  vcs62   vcsa39
autofs              pts/      tty24  tty59    vcs29  vcs63   vcsa4
block/              random    tty25  tty6     vcs3   vcs7    vcsa40
bsg/                rfkill    tty26  tty60    vcs30  vcs8    vcsa41
btrfs-control       rtc@      tty27  tty61    vcs31  vcs9    vcsa42
bus/                rtc0      tty28  tty62    vcs32  vcsa    vcsa43
char/               sda       tty29  tty63    vcs33  vcsa1   vcsa44
console             sda1      tty3   tty7     vcs34  vcsa10  vcsa45
core@               sda2      tty30  tty8     vcs35  vcsa11  vcsa46
cpu/                sda3      tty31  tty9     vcs36  vcsa12  vcsa47
cpu_dma_latency     sda4      tty32  ttyS0    vcs37  vcsa13  vcsa48
disk/               sda5      tty33  ttyS1    vcs38  vcsa14  vcsa49
dri/                sda6      tty34  ttyS2    vcs39  vcsa15  vcsa5
fb0                 sda7      tty35  ttyS3    vcs4   vcsa16  vcsa50
fd@                 sda8      tty36  uinput   vcs40  vcsa17  vcsa51
freefall            shm/      tty37  urandom  vcs41  vcsa18  vcsa52
full                snapshot  tty38  v4l/     vcs42  vcsa19  vcsa53
fuse                snd/      tty39  vcs      vcs43  vcsa2   vcsa54
hidraw0             stderr@   tty4   vcs1     vcs44  vcsa20  vcsa55
hidraw1             stdin@    tty40  vcs10    vcs45  vcsa21  vcsa56
hpet                stdout@   tty41  vcs11    vcs46  vcsa22  vcsa57
initctl|            tty       tty42  vcs12    vcs47  vcsa23  vcsa58
input/              tty0      tty43  vcs13    vcs48  vcsa24  vcsa59
kmsg                tty1      tty44  vcs14    vcs49  vcsa25  vcsa6
log=                tty10     tty45  vcs15    vcs5   vcsa26  vcsa60
loop-control        tty11     tty46  vcs16    vcs50  vcsa27  vcsa61
mapper/             tty12     tty47  vcs17    vcs51  vcsa28  vcsa62
mcelog              tty13     tty48  vcs18    vcs52  vcsa29  vcsa63
media0              tty14     tty49  vcs19    vcs53  vcsa3   vcsa7
mei                 tty15     tty5   vcs2     vcs54  vcsa30  vcsa8
mem                 tty16     tty50  vcs20    vcs55  vcsa31  vcsa9
net/                tty17     tty51  vcs21    vcs56  vcsa32  vga_arbiter
network_latency     tty18     tty52  vcs22    vcs57  vcsa33  video0
network_throughput  tty19     tty53  vcs23    vcs58  vcsa34  watchdog
null                tty2      tty54  vcs24    vcs59  vcsa35  zero
port                tty20     tty55  vcs25    vcs6   vcsa36
ppp                 tty21     tty56  vcs26    vcs60  vcsa37
psaux               tty22     tty57  vcs27    vcs61  vcsa38

Is there another way to capture voice? Should I use GStreamer? Or can I use Google's API to get the text by uploading an audio file? Any other advice is also welcome. Thank you.

Answer

MD Sayem Ahmed · Mar 20, 2012

Here are some useful links that explain how to capture voice data using ALSA:

  1. Resource 1
  2. Linux Journal
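
Before writing ALSA code by hand, it may be worth trying ALSA's own command-line recorder, arecord (part of alsa-utils), which can write headerless PCM directly. The following is only a minimal sketch in Python that wraps it. The function names arecord_raw_command and record_raw and the output file name are made up for illustration, and the 16 kHz, 16-bit, mono settings are an assumption about what your Sphinx acoustic model expects, so check them against your model.

```python
import subprocess


def arecord_raw_command(out_path, seconds=5, rate=16000, device=None):
    """Build an arecord command line that writes headerless PCM.

    The defaults (16 kHz, 16-bit signed little-endian, mono) are an
    assumption; adjust them to match the acoustic model in use.
    """
    cmd = [
        "arecord",
        "-t", "raw",       # raw file type: samples only, no WAV header
        "-f", "S16_LE",    # signed 16-bit little-endian samples
        "-r", str(rate),   # sample rate in Hz
        "-c", "1",         # one channel (mono)
        "-d", str(seconds),  # stop after this many seconds
    ]
    if device is not None:
        # Optional ALSA capture device, e.g. "hw:0,0" (hypothetical value)
        cmd += ["-D", device]
    cmd.append(out_path)
    return cmd


def record_raw(out_path, **kwargs):
    """Run arecord; raises CalledProcessError if the capture fails."""
    subprocess.run(arecord_raw_command(out_path, **kwargs), check=True)
```

Calling record_raw("speech.raw", seconds=5) should leave a file that can be handed to the recognizer. Running arecord -l in a shell lists the available capture devices if the default one is not your mic.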

Here is a link that will give you some insights about ALSA and its configuration.

This is the official ALSA API Reference.

This may be off topic, but here is a list of recommendations to keep in mind when doing audio programming.

If you would like an alternative to ALSA, I would suggest taking a look at PortAudio.
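
Whichever capture library you end up using, a .raw file is just the sample data with no header. If your tool can only produce WAV, the samples can be pulled out afterwards. Below is a rough sketch using Python's standard wave module; the function name wav_to_raw is hypothetical, and it assumes the WAV file holds uncompressed PCM.

```python
import wave


def wav_to_raw(wav_path, raw_path):
    """Copy the PCM samples of a WAV file into a headerless .raw file.

    Returns (channels, sample_width_in_bytes, sample_rate) so the caller
    can check that the format matches what the recognizer expects.
    """
    with wave.open(wav_path, "rb") as wav:
        params = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return params
```

The returned tuple matters because a raw file carries no format information of its own: the program reading it has to be told the sample rate, width, and channel count separately.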