DOWNLOAD

https://sourceforge.net/projects/sox/files/sox/

OVERVIEW

sox [global-options] [format-options] infile1 [[format-options] infile2] ... [format-options] outfile [effect [effect-options]] ...
play [global-options] [format-options] infile1 [[format-options] infile2] ... [format-options] [effect [effect-options]] ...
rec [global-options] [format-options] outfile [effect [effect-options]] ...

MONO <> STEREO

http://artfab.art.cmu.edu/stereo-mono-conversion-with-sox

Split each channel into its own mono file (left and right):

sox fich.wav fich.g.wav remix 1
sox fich.wav fich.d.wav remix 2

Create a mono file in which both channels (L and R) are mixed:

sox fich.wav fich-out.wav remix 1,2
sox fich.wav outfile.wav remix 1-2

CONVERSION

Convert from the au format to the ~~wav~~ format:

sox recital.au recital.wav

Same conversion, but apply four effects (downmix to mono, change the sample rate, fade in, normalize) and store the result at 16 bits:

sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm

Set the sample rate to 48 kHz:

sox infile.wav -r 48k outfile.wav

Convert from ~~raw~~ (a headerless file, so its format must be given explicitly):

sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav

COMBINING SEVERAL FILES

sox fich1.wav fich2.wav fichier.wav
sox -m fich1.mp3 fich2.wav fichier.flac

SOUND SYNTHESIS

Play an ~~A minor seventh~~ chord with a pipe-organ sound:

play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1
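
According to the SoX manual, a synth frequency prefixed with % is a number of semitones relative to middle A (440 Hz). As a rough check of which notes the chord contains, the sketch below converts such an offset to a frequency, assuming equal temperament. The helper name semitone_to_hz is made up for these notes and is not part of SoX.

```shell
#!/bin/sh
# semitone_to_hz (hypothetical helper, not a SoX command):
# frequency of the note N semitones away from A = 440 Hz,
# assuming equal temperament: f = 440 * 2^(N/12)
semitone_to_hz() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", 440 * exp(n * log(2) / 12) }'
}

# The four offsets used in the chord above
for n in -12 -9 -5 -2; do
    printf '%%%s -> %s Hz\n' "$n" "$(semitone_to_hz "$n")"
done
```

The four values come out close to 220, 262, 330 and 392 Hz, i.e. the notes A, C, E and G.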

RECORDING

rec new-file.wav

Record 30 minutes in stereo:

rec -c 2 radio.aiff trim 0 30:00

Record a stream and split it into separate files wherever there are 2 seconds of silence. Recording does not start until sound is detected, and stops after 10 minutes of silence:

rec -r 44100 -b 16 -e signed-integer -p silence 1 0.50 0.1% 1 10:00 0.1% | \
sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
newfile : restart

PLAYBACK

play existing-file.wav
play -q take1.aiff & rec -M take1.aiff take1-dub.aiff

File Format Types

~~sample rate~~ : samples per second (e.g. 44.1 kHz).

~~sample size~~ : number of bits per sample (e.g. 16-bit).

~~channels~~ : number of audio channels, e.g. one (mono) or two (stereo).
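
Taken together, these three values fix the data rate of uncompressed PCM audio: bit rate = sample rate × sample size × channels. A minimal sketch of that arithmetic follows; the helper name pcm_bitrate is made up for these notes and is not a SoX command.

```shell
#!/bin/sh
# pcm_bitrate (hypothetical helper): bits per second of uncompressed PCM audio
# usage: pcm_bitrate SAMPLE_RATE_HZ BITS_PER_SAMPLE CHANNELS
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

pcm_bitrate 44100 16 2   # CD audio: prints 1411200
pcm_bitrate 8000 8 1     # 8 kHz, 8-bit, mono: prints 64000
```

Compressed formats such as FLAC or MP3 report a lower bit rate than this figure for the same three parameters.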

VOLUME

Lower the level by 6 dB:

sox fich.wav fichout.wav gain -6

Input File Combining

-V shows the volume adjustment applied to each input file.

Using the norm effect on the mix is another alternative.

TRIM

Split the first 60 seconds into two 30-second files and discard the rest:

sox song.wav ringtone%1n.wav trim 0 30 : newfile : trim 0 30

AUDIO OUTPUT

Select the audio output driver:

sox ... -t oss
sox ... -t alsa

Special file names

- as an input file name, reads from standard input; as an output file name, writes to standard output.

~~play --rate 6k *.vox~~ applies the --rate format option to every ~~.vox~~ file matched by the glob.

Pipes

-p (shorthand for ~~-t sox -~~) sends the output of one sox to another sox. Example, playing two sounds one after the other:

play "|sox -n -p synth 2" "|sox -n -p synth 2 tremolo 10" stat

-n is the null file: as input it yields infinite silence, which is useful for producing silence of a given length (with trim) or as the placeholder input for synth.

OPTIONS

--combine concatenate|merge|mix|mix-power|multiply|sequence

-m selects `mix',

-M selects `merge'

-T selects `multiply'

-G Automatically invoke the gain effect to guard against clipping.

~~--help-effect NAME~~ shows usage information for the effect NAME. If ~~NAME~~ is all, shows usage for every effect.

~~--help-format NAME~~ shows information about the format NAME. If ~~NAME~~ is all, shows information about every format.

~~--no-clobber~~ asks for confirmation before overwriting an existing file.

~~--norm[=dB-level]~~ normalises the audio.

sox --norm fich fichout

To normalise to -3dB:

sox --norm=-3 fich fichout

-q quiet mode; the opposite of -S.

~~--replay-gain track|album|off~~ Select whether or not to apply replay-gain adjustment to input files.

-S shows processing progress and a VU meter. On by default for play and rec.

-V sets the verbosity level, from ~~-V0~~ (silent) up to ~~-V4~~.

Input File Options

These options apply only to input files and may precede only input filenames on the command line.

~~-v, --volume FACTOR~~ Intended for use when combining multiple input files, this option adjusts the volume of the file that follows it on the command line by a factor of FACTOR. This allows it to be `balanced' w.r.t. the other input files. This is a linear (amplitude) adjustment, so a number less than 1 decreases the volume and a number greater than 1 increases it. If a negative number is given then in addition to the volume adjustment, the audio signal will be inverted. See also the norm, vol, and gain effects, and see Input File Balancing above.
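
Because -v takes a linear amplitude factor while the gain effect takes decibels, it can help to convert between the two. A minimal sketch of the standard amplitude relation dB = 20·log10(factor); the helper names db_to_factor and factor_to_db are made up for these notes and are not SoX commands.

```shell
#!/bin/sh
# Hypothetical helpers relating a linear amplitude factor (as taken by -v)
# to a level change in dB (as taken by the gain effect).

# dB -> factor: factor = 10^(dB/20)
db_to_factor() {
    awk -v db="$1" 'BEGIN { printf "%.4f\n", exp(db * log(10) / 20) }'
}

# factor -> dB: dB = 20 * log10(factor); the factor must be positive
factor_to_db() {
    awk -v f="$1" 'BEGIN { printf "%.2f\n", 20 * log(f) / log(10) }'
}

db_to_factor -6     # prints 0.5012
factor_to_db 0.5    # prints -6.02
factor_to_db 2      # prints 6.02
```

So halving the amplitude with -v 0.5 is roughly the same level change as gain -6.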

Input & Output File Format Options

~~-b BITS~~ number of bits per sample

sox -b 8 fich-in fich_out

~~-r RATE[k]~~ Gives the sample rate in Hz (or kHz if appended with k) of the file.

sox in.wav -r 48k out.wav
sox in.wav out.wav rate 48k

EFFECTS

gain -n normalises the audio to 0dB FSD; combined with a negative gain-dB, the audio is normalised to a given level below 0dB. For example,

sox infile outfile gain -n

normalises to 0dB, and

sox infile outfile gain -n -3

normalises to -3dB.

-l option invokes a simple limiter :

sox infile outfile gain -l 6

will apply 6dB of gain but never clip.

SILENCE

silence [-l] above-periods [duration threshold[d|%] [below-periods duration threshold[d|%]]]

Removes silence from the beginning, middle, or end of the audio. `Silence' is determined by a specified threshold.

The above-periods value is used to indicate if audio should be trimmed at the beginning of the audio. A value of zero indicates no silence should be trimmed from the beginning. When specifying a non-zero above-periods, it trims audio up until it finds non-silence. Normally, when trimming silence from the beginning of audio the above-periods will be 1 but it can be increased to higher values to trim all audio up to a specific count of non-silence periods. For example, if you had an audio file with two songs that each contained 2 seconds of silence before the song, you could specify an above-period of 2 to strip out both silence periods and the first song.

When above-periods is non-zero, you must also specify a duration and threshold. Duration indicates the amount of time that non-silence must be detected before it stops trimming audio. By increasing the duration, bursts of noise can be treated as silence and trimmed off.

Threshold is used to indicate what sample value you should treat as silence. For digital audio, a value of 0 may be fine but for audio recorded from analog, you may wish to increase the value to account for background noise.

When optionally trimming silence from the end of the audio, you specify a below-periods count. In this case, below-period means to remove all audio after silence is detected. Normally, this will be a value of 1 but it can be increased to skip over periods of silence that are wanted. For example, if you have a song with 2 seconds of silence in the middle and 2 seconds at the end, you could set below-period to a value of 2 to skip over the silence in the middle of the audio.

For below-periods, duration specifies a period of silence that must exist before audio is not copied any more. By specifying a higher duration, silence that is wanted can be left in the audio. For example, if you have a song with an expected 1 second of silence in the middle and 2 seconds of silence at the end, a duration of 2 seconds could be used to skip over the middle silence.

Unfortunately, you must know the length of the silence at the end of your audio file to trim off silence reliably. A work around is to use the silence effect in combination with the reverse effect. By first reversing the audio, you can use the above-periods to reliably trim all audio from what looks like the front of the file. Then reverse the file again to get back to normal.

To remove silence from the middle of a file, specify a below-periods that is negative. This value is then treated as a positive value and is also used to indicate the effect should restart processing as specified by the above-periods, making it suitable for removing periods of silence in the middle of the audio.

The option -l indicates that below-periods duration length of audio should be left intact at the beginning of each period of silence. For example, if you want to remove long pauses between words but do not want to remove the pauses completely.

The period counts are in units of samples. Duration counts may be in the format of hh:mm:ss.frac, or the exact count of samples. Threshold numbers may be suffixed with d to indicate the value is in decibels, or % to indicate a percentage of maximum value of the sample value (0% specifies pure digital silence).
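
To get a feel for these units, the sketch below converts a duration in [[hh:]mm:]ss[.frac] form to a sample count at a given rate, and a percentage threshold to decibels relative to full scale (assuming the percentage refers to amplitude, so dB = 20·log10(p/100)). Both helper names are made up for these notes and are not part of SoX.

```shell
#!/bin/sh
# Hypothetical helpers for reasoning about the parameters of the silence effect.

# [[hh:]mm:]ss[.frac] -> number of samples at a given sample rate (truncated)
duration_to_samples() {
    awk -v t="$1" -v rate="$2" 'BEGIN {
        n = split(t, part, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + part[i]
        printf "%d\n", secs * rate
    }'
}

# threshold in percent of full scale -> dB relative to full scale
percent_to_db() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", 20 * log(p / 100) / log(10) }'
}

duration_to_samples 0.50 44100    # prints 22050
duration_to_samples 10:00 44100   # prints 26460000
percent_to_db 0.1                 # prints -60.00
```

With these numbers, the 0.1% threshold used in the recording example corresponds to about -60 dB, and 0.50 s at 44100 Hz is 22050 samples.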

The following example shows how this effect can be used to start a recording that does not contain the delay at the start which usually occurs between `pressing the record button' and the start of the performance:

rec parameters filename other-effects silence 1 5 2%

DOCUMENTATION

soxi, soxformat, libsox, gnuplot, octave.
http://sox.sourceforge.net/Docs/Documentation
http://sox.sourceforge.net/Docs/Scripts

BITRATE

sox --i fichier.flac
Input File     : 'fichier.flac'
Channels       : 4
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:12.33 = 526579 samples = 860.849 CDDA sectors
File Size      : 3.26M
Bit Rate       : 1.36M
Sample Encoding: 16-bit FLAC
Comment        : 'Comment=Processed by SoX'

sox --i -r <filename>
44100

---

# Trim a fragment of 30 seconds at an offset of 60 seconds
# with the 'trim' effect

sox input.mp3 output.wav trim 60 30

Decode to WAV (from wide variety of formats) with MPlayer

MPlayer is a media player that supports a wide range of multimedia formats. It is typically used for playing video with a GUI, but can also be used (in batch mode without a GUI) to convert the audio to WAV format. MPlayer is available for Linux (package "mplayer"), Windows and Mac OS X.

The invocation is a bit more complex than with the other decoders shown here. For clarity, the command is spread out over several lines here (do not forget to remove the backslashes when you want it on one line):

# Decode the audio channel to PCM (WAV) and ignore the video channels

mplayer -ao pcm:fast:waveheader:file=output.wav \
-vo null -vc null input.mp3

Use additional audio filters (-af) to resample to 22050 Hz and mix down to mono.

mplayer \
-ao pcm:fast:waveheader:file=output.wav \
-af resample=22050,pan=1:0.5:0.5 \
-vo null -vc null \
input.mp3

By default, one expects 16 bits per sample. On some setups however, MPlayer uses 32 bits per sample by default. To avoid this, set the format explicitly with:

-format s16le

Pick the 30 seconds fragment at an offset of 1 minute:

mplayer -ao pcm:fast:waveheader:file=output.wav \
-vo null -vc null -ss 60 -endpos 30 input.mp3

Note: on some platforms I had to add the option -format s16le to make sure MPlayer encoded 16 bit PCM samples instead of 24 bit or even 32 bit, which can cause problems with some audio players/tools.

Transcode with FFmpeg (from and to a wide variety of formats)

FFmpeg is another powerful open source tool for multimedia handling such as conversion and transcoding. Installing is easy on a sufficiently recent Linux distribution: install the "ffmpeg" package (note: on Ubuntu 9.10 aka Karmic Koala, I also had to install "libavcodec-unstripped-52" to make MP3 encoding possible; your mileage may vary). Getting it working on Windows apparently requires you to compile it yourself (or to trust a website that provides binaries). For Mac OS X, I installed the "ffmpeg" package through MacPorts, and there is also one for Fink.

FFmpeg is typically used for video, but audio transcoding works too and is pretty simple:

Minimal example: transcode from MP3 to WMA

ffmpeg -i input.mp3 output.wma

You can get the list of supported formats with:

ffmpeg -formats

Convert WAV to MP3, mix down to mono (use 1 audio channel), set the bit rate to 64 kbps and the sample rate to 22050 Hz

ffmpeg -i input.wav -ac 1 -ab 64000 -ar 22050 output.mp3

Picking the 30 seconds fragment at an offset of 1 minute:

# In seconds

ffmpeg -i input.mp3 -ss 60 -t 30 output.wav

# In HH:MM:SS format

ffmpeg -i input.mp3 -ss 0:01:00 -t 0:00:30 output.wav

Encode as MP3 or re-encode an MP3 file to a different bit rate with Lame

Lame is a well known open source MP3 encoder. Installing on Linux should be easy: just look for the "lame" package. For Mac OS X, you can use the "lame" package of MacPorts or Fink. For Windows you have to compile it yourself, or trust some websites that provide binaries.

You can use it for example to encode from WAV format to MP3 or to re-encode an MP3 to a different bit rate. Some examples:

# Minimal example of converting a wave file to MP3

lame input.wav output.mp3

# Re-encode existing MP3 to 64 kbps MP3

lame -b 64 original.mp3 new.mp3

# More interesting options
# -m m: save as mono
# -m s: save as stereo
# -m j: save as joint stereo (exploits inter-channel correlation
#       more than regular stereo)
# -q 2: quality tweaking: the lower the value, the better the
#       quality, but the slower the algorithm. Default is 5.

# By default, lame uses constant bit rate (CBR) encoding.
# You can also use average bit rate (ABR) encoding,
# e.g. for an average bit rate of 123 kbps:

lame --abr 123 input.wav output.mp3

# or variable (VBR) encoding, e.g. between 32 kbps and 192 kbps:

lame -v -b 32 -B 192 input.wav output.mp3

Encode in Ogg Vorbis format

With the "oggenc" tool you can encode audio in WAV format (or raw or AIFF) to Ogg Vorbis format. On Ubuntu I had to install the "vorbis-tools" package to get "oggenc".

# Minimal example

oggenc audio.wav -o audio.ogg

# Setting the bit rate, downmix to mono and set the sample rate:

oggenc -b 32 --downmix --resample 22050 input.wav -o output.ogg

Getting information about audio files

To get basic information about an audio file (like the number of channels, sample rate, duration, etc), there is the 'soxi' tool, which is part of the sox package:

soxi file.mp3

which returns something like:

Input File : 'file.mp3'
Channels : 2
Sample Rate : 44100
Precision : 16-bit
Duration : 00:03:55.35 = 10378847 samples = 17651.1 CDDA sectors
File Size : 1.88M
Bit Rate : 64.0k
Sample Encoding: MPEG audio (layer I, II or III)

You can easily specify multiple files too.
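
The Duration line in the sample output above can be cross-checked by hand: duration = samples / sample rate, and at 44100 Hz one CDDA sector holds 588 samples (75 sectors per second). A minimal sketch of that arithmetic; the helper names samples_to_duration and cdda_sectors are made up for these notes and are not part of SoX.

```shell
#!/bin/sh
# samples_to_duration (hypothetical helper): sample count and sample rate -> hh:mm:ss.ff
samples_to_duration() {
    awk -v n="$1" -v rate="$2" 'BEGIN {
        secs = n / rate
        h = int(secs / 3600)
        m = int((secs - h * 3600) / 60)
        s = secs - h * 3600 - m * 60
        printf "%02d:%02d:%05.2f\n", h, m, s
    }'
}

# cdda_sectors (hypothetical helper): assumes 44100 Hz audio,
# i.e. 588 samples per CDDA sector
cdda_sectors() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", n / 588 }'
}

samples_to_duration 10378847 44100   # prints 00:03:55.35
cdda_sectors 10378847                # prints 17651.1
```

Both results agree with the Duration line reported for file.mp3.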

When soxi is not available (e.g. it isn't on Ubuntu 8.04) or when soxi does not recognize the file format, there are some alternatives based on FFmpeg and MPlayer.

With FFmpeg, just don't specify an output file, for example:

ffmpeg -i file.mp3

which returns something like:

... [version information] ...
Input #0, mp3, from 'file.mp3':
Duration: 00:03:55.2, start: 0.000000, bitrate: 63 kb/s
Stream #0.0: Audio: mp2, 44100 Hz, stereo, 64 kb/s
Must supply at least one output file

You can supply several files, but you need to put the flag -i in front of each one.

With MPlayer, it's a bit more involved:

mplayer -vo null -ao null -frames 0 -identify file.mp3

which returns something like:

... [version information] ...
Playing file.mp3.
ID_AUDIO_ID=0
Audio file file format detected.
ID_FILENAME=file.mp3
ID_DEMUXER=audio
ID_AUDIO_FORMAT=80
ID_AUDIO_BITRATE=64000
ID_AUDIO_RATE=44100
ID_AUDIO_NCH=0
ID_LENGTH=235.00
========================================================
Forced audio codec: mad
Opening audio decoder: [libmad] libmad mpeg audio decoder
AUDIO: 44100 Hz, 2 ch, s16le, 64.0 kbit/4.54% (ratio: 8000->176400)
ID_AUDIO_BITRATE=64000
ID_AUDIO_RATE=44100
ID_AUDIO_NCH=2
Selected audio codec: [mad] afm: libmad (libMAD MPEG layer 1-2-3)
========================================================
AO: [null] 44100Hz 2ch s16le (2 bytes per sample)
ID_AUDIO_CODEC=mad
Video: no video
Starting playback...
Exiting... (End of file)

Further reading

15 Awesome Examples to Manipulate Audio Files Using Sound eXchange (SoX)

SoX stands for Sound eXchange. SoX is a cross-platform command-line audio utility that works on Linux, Windows and MacOS. It is particularly helpful for the following tasks when dealing with audio and music files:

Converting audio files
Editing audio files
Changing audio attributes
Adding audio effects
Plus many advanced sound manipulation features

In general, audio data is described by following four characteristics:

Rate – The sample rate is in samples per second. For example, 44100/8000
Data size – The precision the data is stored in. For example, 8/16 bits
Data encoding – What encoding the data type uses. For example, u-law,a-law
Channels – How many channels are contained in the audio data. For example, Stereo 2 channels

1. Combine Multiple Audio Files to Single File

With -m, sox mixes two input files together to produce its output. The example below mixes first_part.wav and second_part.wav, leaving the result in whole_part.wav. Older SoX releases also provided a separate soxmix command for this purpose; current releases use sox -m.

sox -m first_part.wav second_part.wav whole_part.wav
soxmix first_part.wav second_part.wav whole_part.wav

2. Extract Part of the Audio File

Trim can trim off unwanted audio from the audio file.

sox old.wav new.wav trim [SECOND TO START] [SECONDS DURATION]

SECOND TO START – Starting point in the audio file.
SECONDS DURATION – Length of audio to keep from that point.

The command below extracts the first 10 seconds of input.wav and stores them in output.wav

sox input.wav output.wav trim 0 10

3. Increase / Decrease the Volume

-v changes the volume.

sox -v 2.0 foo.wav bar.wav # increase the volume

The magnitude of the factor sets the volume, so -0.5 is louder than -0.1; a negative sign additionally inverts the signal.

sox -v -0.5 in.wav out.wav # decrease the volume
sox -v -0.1 in.wav out.wav # decrease it even more

4. Get Audio File Information

stat can provide a lot of statistical information about a given audio file. The -n flag (null output file; very old SoX releases used -e here) tells sox not to generate any output other than the statistical information.

sox in.wav -n stat
Samples read: 3528000
Length (seconds): 40.000000
Scaled by: 2147483647.0
Maximum amplitude: 0.999969
Minimum amplitude: -1.000000
Midline amplitude: -0.000015
Mean norm: 0.217511
Mean amplitude: 0.003408
RMS amplitude: 0.283895
Maximum delta: 1.478455
Minimum delta: 0.000000
Mean delta: 0.115616
RMS delta: 0.161088
Rough frequency: 3982
Volume adjustment: 1.000

5. Play an Audio Song

On systems with OSS, a sound file can be played by writing it to the device special file /dev/dsp. The following command plays the file music.wav; option -t specifies the type of the file /dev/dsp.

sox music.wav -t ossdsp /dev/dsp

You can also use play command to play the audio file as shown below.

play options Filename audio_effects
play -r 8000 -b 16 music.wav

6. Play an Audio Song Backwards

reverse to reverse the sound in a sound file.

sox input.wav output.wav reverse

You can also use play command to hear the song in reverse without modifying the source file as shown below.

play test.wav reverse

7. Record a Voice File

‘play’ and ‘rec’ commands are companion commands for sox . /dev/dsp is the digital sampling and digital recording device. Reading the device activates the A/D converter for sound recording and analysis. /dev/dsp file works for both playing and recording sound samples.

sox -t ossdsp /dev/dsp test.wav

You can also use rec command for recording voice. If SoX is invoked as ‘rec’ the default sound device is used as an input source.

rec -r 8000 -c 1 record_voice.wav

8. Changing the Sampling Rate of a Sound File

To change the sampling rate of a sound file, use option -r followed by the sample rate to use, in Hertz. Use the following example, to change the sampling rate of file ‘old.wav’ to 16000 Hz, and write the output to ‘new.wav’

sox old.wav -r 16000 new.wav

9. Changing the Sampling Size of a Sound File

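A larger sample size gives finer amplitude resolution. Sample size for audio is most often expressed as 8 bits or 16 bits; 8-bit audio is more often used for voice recording. Converting an existing 8-bit file to 16 bits changes the storage format but cannot restore detail that was never recorded.

Current SoX releases take the sample size in bits with -b BITS. Older releases used one flag per size:

-b Sample data size in bytes
-w Sample data size in words
-l Sample data size in long words
-d Sample data size in double long words

The following example converts an 8-bit audio file to a 16-bit audio file.

sox input.wav -b 16 output.wav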

10. Changing the Number of Channels

The following example converts mono audio files to stereo. Use Option -c to specify the number of channels .

sox mono.wav -c 2 stereo.wav

There are methods to convert stereo sound files to mono sound. i.e to get a single channel from stereo file.

Selecting a Particular Channel

This is done with the remix effect, which takes the numbers of the input channels to keep (older SoX releases used the avg effect with -l for left, -r for right, -f for front and -b for back). The following example extracts the left channel:

sox stereo.wav mono.wav remix 1

Average the Channels

sox stereo.wav -c 1 mono.wav

11. Audio Converter – Music File Format Conversion

SoX is useful for converting one audio format to another, i.e. from one encoding (A-law, MP3) to another. SoX recognizes the input and desired output formats from the file name extensions. The command below takes infile.ulaw and creates a GSM-encoded file called outfile.gsm. You can also use sox to convert wav to mp3.

sox infile.ulaw outfile.gsm

If the file name has no extension, the -t option states the file type explicitly.

sox -t ulaw infile -t gsm outfile

12. Generate Different Types of Sounds

The synth effect generates a number of standard waveforms and types of noise. Although this effect generates audio, an input file must still be given; the -n option specifies the null file as input.

sox -n outfile synth len type freq

len – length of audio to synthesize. Format for specifying lengths in time is hh:mm:ss.frac
type – one of sine, square, triangle, sawtooth, trapezium, exp, [white]noise, pinknoise, brownnoise. Default is sine
freq – frequencies at the beginning/end of synthesis in Hz

The following example produces a 3-second, 8000 Hz audio file containing a sine wave swept from 300 to 3300 Hz

sox -r 8000 -n output.au synth 3 sine 300-3300

13. Speed up the Sound in an Audio File

To speed up or slow down the sound of a file, use the speed effect, which changes both the pitch and the duration. A factor above 1.0 raises the speed and shortens the duration. The default factor is 1.0, which makes no change to the audio. A factor of 2.0 doubles the speed, so the duration is halved and the pitch is one octave higher.

sox input.wav output.wav speed factor
sox input.wav output.wav speed 2.0
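
The pitch change that goes with a given speed factor follows from cents = 1200·log2(factor), where 100 cents is one semitone and 1200 cents is one octave. A minimal sketch of that conversion; the helper name speed_to_cents is made up for these notes and is not part of SoX.

```shell
#!/bin/sh
# speed_to_cents (hypothetical helper): pitch change, in cents, caused by a speed factor
# cents = 1200 * log2(factor); 100 cents = 1 semitone, 1200 cents = 1 octave
speed_to_cents() {
    awk -v f="$1" 'BEGIN { printf "%.0f\n", 1200 * log(f) / log(2) }'
}

speed_to_cents 2.0   # prints 1200 (one octave up)
speed_to_cents 0.5   # prints -1200 (one octave down)
```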

14. Multiple Changes to Audio File in Single Command

By default, SoX attempts to write audio data using the same data type, sample rate and channel count as per the input data. If the user wants the output file to be of a different format then user has to specify format options. If an output file format doesn’t support the same data type, sample rate, or channel count as the given input file format, then SoX will automatically select the closest values which it supports.

Converting a wav to raw. The following example changes the sampling rate, sample size and channel count in a single command line.

sox -r 8000 -b 16 -c 1 -t wav source -r 16000 -b 8 -c 2 -t raw destination

15. Convert Raw Audio File to MP3 Music File

A raw file has no header, so its sample rate, sample size, encoding and channel count must be given on the command line. One route to MP3 is to convert raw to wav with SoX first, and then wav to mp3 with lame. In the example below, option -h indicates high quality.

Convert Raw Format to Wav Format:

sox -r 8000 -e signed -b 16 -c 2 audio1.raw audio1.wav

Convert Wav Format to MP3 Format:

lame -h audio1.wav audio1.mp3

Need some help. I am trying to execute the soxmix command from a php call, but it does nothing… can you show me how to do it?

$output = array();
$result = -1;
exec('/usr/bin/soxmix test/exports/audiotest1.wav audio/003-half.wav test/exports/mixedaudio.wav', $output, $result);
var_dump($output, $result);

Your instructions for decreasing the volume are incorrect. Positive numbers less than 1 decrease the volume. You can make those numbers negative, but that will just invert the phase; it has no effect on volume. I would suggest you replace your examples with these:

sox -v 0.5 srcfile.wav test05.wav
sox -v 0.1 srcfile.wav test01.wav

The first command reduces the gain by 50%, the second by 90%.

sox YourInputFilename -n stats

-n means “null output”, in other words, just provide the stats and don’t make a new audio file.

If you cannot convert mp3 files, install this package:

sudo apt-get install libsox-fmt-mp3

trim only parts over a certain level ?

sox -V -S -b32 inputfile.mp3 outputfile.wv norm -3 rate -vMa 28224k rate -vMa 192k

A simple example of upsampling and harmonizing: it allows aliasing so that higher harmonics grow, the aliasing distortion is reduced by the very high intermediate rate, and this technique produces the best sound coloring I have ever heard.

SoX > editing sound files from the command line