DOWNLOAD
https://sourceforge.net/projects/sox/files/sox/
OVERVIEW
sox [global-options] [format-options] infile1 [[format-options] infile2] ... [format-options] outfile [effect [effect-options]] ...
play [global-options] [format-options] infile1 [[format-options] infile2] ... [format-options] [effect [effect-options]] ...
rec [global-options] [format-options] outfile [effect [effect-options]] ...
MONO <> STEREO
http://artfab.art.cmu.edu/stereo-mono-conversion-with-sox
Put each channel into its own mono file (left and right):
sox fich.wav fich.g.wav remix 1
sox fich.wav fich.d.wav remix 2
Create a mono file in which both channels (L and R) are mixed:
sox fich.wav fich-out.wav remix 1,2
sox fich.wav outfile.wav remix 1-2
CONVERSION
Convert from the au format to the wav format:
sox recital.au recital.wav
Same conversion, but apply four effects (mix down to mono, change the sample rate, fade in, normalize) and store the result at 16 bits:
sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm
Set the sample rate to 48 kHz:
sox infile.wav -r 48k outfile.wav
Convert a raw (headerless) file to wav; the format of the raw input has to be given explicitly:
sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav
COMBINING SEVERAL FILES
Concatenate two files one after the other:
sox fich1.wav fich2.wav fichier.wav
Mix two files together with -m:
sox -m fich1.mp3 fich2.wav fichier.flac
SOUND SYNTHESIS
Play an A minor seventh chord with a pipe-organ sound:
play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1
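The %n notation in the synth arguments above gives a pitch as n semitones relative to A at 440 Hz. As a rough check on which notes the chord contains, the frequencies can be worked out with awk. The helper name semitone_to_hz is made up for this sketch and is not part of SoX.

```shell
# Hypothetical helper (not a SoX command): frequency of the note that lies
# n semitones away from A 440 Hz in equal temperament, f = 440 * 2^(n/12).
semitone_to_hz() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", 440 * 2 ^ (n / 12) }'
}

# The four notes of the chord above (A, C, E and G).
for n in -12 -9 -5 -2; do
  echo "%$n = $(semitone_to_hz "$n") Hz"
done
```

The first value printed is 220.00 Hz, the A one octave below 440 Hz.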
RECORDING
rec new-file.wav
Record 30 minutes in stereo:
rec -c 2 radio.aiff trim 0 30:00
Record a stream and split it into separate files at points with 2 seconds of silence. Also, do not start recording until sound is detected, and stop after 10 minutes of silence:
rec -r 44100 -b 16 -s -p silence 1 0.50 0.1% 1 10:00 0.1% | \
  sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
  newfile : restart
PLAYBACK
play existing-file.wav
Record a new track of a multi-track recording while the first take plays back:
play -q take1.aiff & rec -M take1.aiff take1-dub.aiff
File Format Types
sample rate: samples per second (e.g. 44.1 kHz).
sample size: number of bits per sample (e.g. 16-bit).
channels: one (mono) or two (stereo).
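These three figures together determine the bit rate of uncompressed PCM audio: sample rate times sample size times channels. A minimal sketch of that arithmetic follows. The function name pcm_bitrate is an assumption made for illustration, and compressed formats such as MP3 or FLAC do not follow this formula.

```shell
# Hypothetical helper: bit rate of uncompressed PCM audio in bits per second.
# Arguments: sample rate (Hz), sample size (bits), number of channels.
pcm_bitrate() {
  echo $(( $1 * $2 * $3 ))
}

pcm_bitrate 44100 16 2   # CD-quality stereo: prints 1411200
```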
VOLUME
Lower the volume by 6 dB:
sox fich.wav fichout.wav gain -6
Input File Combining
-V shows the volume adjustments applied to the input files.
Using the norm effect on the mix is another alternative.
TRIM
Split the first 60 seconds into two 30-second files and discard the rest:
sox song.wav ringtone%1n.wav trim 0 30 : newfile : trim 0 30
AUDIO OUTPUT
Specify the audio output driver:
sox ... -t oss
sox ... -t alsa
Special filenames
- when used as input, reads standard input; when used as output, writes to standard output.
play --rate 6k *.vox applies the format option (here a 6 kHz sample rate) to every .vox file matched by the shell.
Redirection
-p (shorthand for -t sox -) pipes the output of one sox into another sox. For example, to play two sounds in succession:
play "|sox -n -p synth 2" "|sox -n -p synth 2 tremolo 10" stat
-n stands for a null file (infinite silence when used as input); useful for producing silence of a given length (with trim or synth).
OPTIONS
--combine concatenate|merge|mix|mix-power|multiply|sequence
-m selects `mix',
-M selects `merge'
-T selects `multiply'
-G Automatically invoke the gain effect to guard against clipping.
--help-effect NAME shows information on the effect NAME. If NAME is all, shows usage for all effects.
--help-format NAME shows information on the format NAME. If NAME is all, shows information on all formats.
--no-clobber asks for confirmation before overwriting a file.
--norm[=dB-level] normalizes the audio.
sox --norm fich fichout
To normalize to -3 dB:
sox --norm=-3 fich fichout
-q quiet mode. The opposite of -S.
--replay-gain track|album|off Select whether or not to apply replay-gain adjustment to input files.
-S shows processing progress along with a VU meter. Enabled by default for play and rec.
-V verbosity level, from -V0 (silent) up to -V4.
Input File Options
These options apply only to input files and may precede only input filenames on the command line.
-v, --volume FACTOR Intended for use when combining multiple input files, this option adjusts the volume of the file that follows it on the command line by a factor of FACTOR. This allows it to be `balanced' w.r.t. the other input files. This is a linear (amplitude) adjustment, so a number less than 1 decreases the volume and a number greater than 1 increases it. If a negative number is given then in addition to the volume adjustment, the audio signal will be inverted. See also the norm, vol, and gain effects, and see Input File Balancing above.
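Because -v takes a linear amplitude factor while the gain effect takes decibels, it can help to convert between the two. The sketch below uses the standard amplitude relation dB = 20 * log10(factor). The helper names db_to_factor and factor_to_db are invented for this example and are not SoX commands.

```shell
# Hypothetical helpers (not SoX commands) converting between a gain in dB
# and the linear amplitude factor expected by -v.
db_to_factor() {
  awk -v db="$1" 'BEGIN { printf "%.3f\n", 10 ^ (db / 20) }'
}
factor_to_db() {
  awk -v f="$1" 'BEGIN { printf "%.2f\n", 20 * log(f) / log(10) }'
}

db_to_factor -6     # prints 0.501, so -6 dB is roughly a factor of 0.5
factor_to_db 0.5    # prints -6.02
```

factor_to_db only makes sense for positive factors, since a negative factor additionally inverts the signal.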
Input & Output File Format Options
-b BITS number of bits per sample
sox -b 8 fich-in fich_out
-r RATE[k] Gives the sample rate in Hz (or kHz if appended with k) of the file.
sox in.wav -r 48k out.wav
sox in.wav out.wav rate 48k
EFFECTS
gain with the -n option, the audio is normalised to 0dB; combined with a negative gain value, it is normalised to a given level below 0dB. For example,
sox infile outfile gain -n
normalises to 0dB, and
sox infile outfile gain -n -3
normalises to -3dB.
The -l option invokes a simple limiter:
sox infile outfile gain -l 6
will apply 6dB of gain but never clip.
SILENCE
silence [-l] above-periods [duration threshold[d|%] [below-periods duration threshold[d|%]]]
Removes silence from the beginning, middle, or end of the audio. `Silence' is determined by a specified threshold.
The above-periods value is used to indicate if audio should be trimmed at the beginning of the audio. A value of zero indicates no silence should be trimmed from the beginning. When specifying a non-zero above-periods, it trims audio up until it finds non-silence. Normally, when trimming silence from the beginning of audio the above-periods will be 1 but it can be increased to higher values to trim all audio up to a specific count of non-silence periods. For example, if you had an audio file with two songs that each contained 2 seconds of silence before the song, you could specify an above-period of 2 to strip out both silence periods and the first song.
When above-periods is non-zero, you must also specify a duration and threshold. Duration indicates the amount of time that non-silence must be detected before it stops trimming audio. By increasing the duration, bursts of noise can be treated as silence and trimmed off.
Threshold is used to indicate what sample value you should treat as silence. For digital audio, a value of 0 may be fine but for audio recorded from analog, you may wish to increase the value to account for background noise.
When optionally trimming silence from the end of the audio, you specify a below-periods count. In this case, below-period means to remove all audio after silence is detected. Normally, this will be a value of 1 but it can be increased to skip over periods of silence that are wanted. For example, if you have a song with 2 seconds of silence in the middle and 2 seconds at the end, you could set below-period to a value of 2 to skip over the silence in the middle of the audio.
For below-periods, duration specifies a period of silence that must exist before audio is not copied any more. By specifying a higher duration, silence that is wanted can be left in the audio. For example, if you have a song with an expected 1 second of silence in the middle and 2 seconds of silence at the end, a duration of 2 seconds could be used to skip over the middle silence.
Unfortunately, you must know the length of the silence at the end of your audio file to trim off silence reliably. A workaround is to use the silence effect in combination with the reverse effect. By first reversing the audio, you can use the above-periods to reliably trim all audio from what looks like the front of the file. Then reverse the file again to get back to normal.
To remove silence from the middle of a file, specify a below-periods that is negative. This value is then treated as a positive value and is also used to indicate the effect should restart processing as specified by the above-periods, making it suitable for removing periods of silence in the middle of the audio.
The option -l indicates that below-periods duration length of audio should be left intact at the beginning of each period of silence. For example, if you want to remove long pauses between words but do not want to remove the pauses completely.
The period counts are in units of samples. Duration counts may be in the format of hh:mm:ss.frac, or the exact count of samples. Threshold numbers may be suffixed with d to indicate the value is in decibels, or % to indicate a percentage of maximum value of the sample value (0% specifies pure digital silence).
The following example shows how this effect can be used to start a recording that does not contain the delay at the start which usually occurs between `pressing the record button' and the start of the performance:
rec parameters filename other-effects silence 1 5 2%
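The reverse workaround described above for trailing silence can be written as a single effects chain. This is only a sketch: the function name trim_trailing_silence, the file names, the 0.1 second duration and the 1% threshold are all assumptions to adapt to your own material.

```shell
# Sketch only: strip trailing silence by reversing the audio, trimming what
# is now leading silence, and reversing back. The 0.1 s duration and the 1%
# threshold are assumed values, not recommendations.
trim_trailing_silence() {
  sox "$1" "$2" reverse silence 1 0.1 1% reverse
}

# Example call, run only if sox is installed and the hypothetical input exists.
if command -v sox >/dev/null 2>&1 && [ -f padded.wav ]; then
  trim_trailing_silence padded.wav trimmed.wav
fi
```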
DOCUMENTATION
soxi, soxformat, libsox, gnuplot, octave.
http://sox.sourceforge.net/Docs/Documentation
http://sox.sourceforge.net/Docs/Scripts
BITRATE
sox --i fichier.flac
Input File : 'fichier.flac'
Channels : 4
Sample Rate : 44100
Precision : 16-bit
Duration : 00:00:12.33 = 526579 samples = 860.849 CDDA sectors
File Size : 3.26M
Bit Rate : 1.36M
Sample Encoding: 16-bit FLAC
Comment : 'Comment=Processed by SoX'
sox --i -r <filename>
44100
---
# Trim a fragment of 30 seconds at an offset of 60 seconds
# with the 'trim' effect
sox input.mp3 output.wav trim 60 30
Decode to WAV (from wide variety of formats) with MPlayer
MPlayer is a media player that supports a wide range of multimedia formats. It is typically used for playing video with a GUI, but can also be used (in batch mode without a GUI) to convert the audio to WAV format. MPlayer is available for Linux (package "mplayer"), Windows and Mac OS X.
The invocation is a bit more complex than with the other decoders shown here. For clarity, the command is spread out over several lines (remember to remove the backslashes if you put it on one line):
# Decode the audio channel to PCM (WAV) and ignore the video channels
mplayer -ao pcm:fast:waveheader:file=output.wav \
  -vo null -vc null input.mp3
Use additional audio filters (-af) to resample to 22050 Hz and mix down to mono.
mplayer \
  -ao pcm:fast:waveheader:file=output.wav \
  -af resample=22050,pan=1:0.5:0.5 \
  -vo null -vc null \
  input.mp3
By default, one expects 16 bits per sample. On some setups however, MPlayer uses 32 bits per sample by default. To avoid this, set the format explicitly with:
-format s16le
Pick the 30 seconds fragment at an offset of 1 minute:
mplayer -ao pcm:fast:waveheader:file=output.wav \
  -vo null -vc null -ss 60 -endpos 30 input.mp3
Note: on some platforms I had to add the option -format s16le to make sure MPlayer encoded 16 bit PCM samples instead of 24 bit or even 32 bit, which can cause problems with some audio players/tools.
Transcode with FFmpeg (from and to a wide variety of formats)
FFmpeg is another powerful open source tool for multimedia handling such as conversion and transcoding. Installation is easy on a sufficiently recent Linux distribution: install the "ffmpeg" package (note: on Ubuntu 9.10, aka Karmic Koala, I also had to install "libavcodec-unstripped-52" to make MP3 encoding possible; your mileage may vary). Getting it working on Windows apparently requires you to compile it yourself (or to trust a website that provides binaries). For Mac OS X, I installed the "ffmpeg" package through MacPorts, and there is also one for Fink.
FFmpeg is typically used for video, but audio transcoding works too and is pretty simple:
Minimal example: transcode from MP3 to WMA
ffmpeg -i input.mp3 output.wma
You can get the list of supported formats with:
ffmpeg -formats
Convert WAV to MP3, mix down to mono (use 1 audio channel), set the bit rate to 64 kbps and the sample rate to 22050 Hz:
ffmpeg -i input.wav -ac 1 -ab 64000 -ar 22050 output.mp3
Picking the 30 seconds fragment at an offset of 1 minute:
# In seconds
ffmpeg -i input.mp3 -ss 60 -t 30 output.wav
# In HH:MM:SS format
ffmpeg -i input.mp3 -ss 0:01:00 -t 0:00:30 output.wav
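The two notations above express the same positions. As an illustration of how HH:MM:SS maps to plain seconds, here is a small conversion sketch. The helper name hms_to_seconds is made up for this example and is not part of FFmpeg or SoX.

```shell
# Hypothetical helper: convert [[HH:]MM:]SS[.frac] into seconds by treating
# each colon-separated field as a base-60 digit.
hms_to_seconds() {
  echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }'
}

hms_to_seconds 0:01:00   # prints 60
hms_to_seconds 0:00:30   # prints 30
```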
Encode as MP3 or re-encode an MP3 file to a different bit rate with Lame
Lame is a well known open source MP3 encoder. Installing on Linux should be easy: just look for the "lame" package. For Mac OS X, you can use the "lame" package of MacPorts or Fink. For Windows you have to compile it yourself, or trust some websites that provide binaries.
You can use it for example to encode from WAV format to MP3 or to re-encode an MP3 to a different bit rate. Some examples:
# Minimal example of converting a wave file to MP3
lame input.wav output.mp3
# Re-encode existing MP3 to 64 kbps MP3
lame -b 64 original.mp3 new.mp3
# More interesting options
# -m m: save as mono
# -m s: save as stereo
# -m j: save as joint stereo (exploits inter-channel correlation
# more than regular stereo)
# -q 2: quality tweaking: the lower the value, the better the
# quality, but the slower the algorithm. Default is 5.
# By default, lame uses constant bit rate (CBR) encoding.
# You can also use average bit rate (ABR) encoding,
# e.g. for an average bit rate of 123 kbps:
lame --abr 123 input.wav output.mp3
# or variable (VBR) encoding, e.g. between 32 kbps and 192 kbps:
lame -v -b 32 -B 192 input.wav output.mp3
Encode in Ogg Vorbis format
With the "oggenc" tool you can encode audio in WAV format (or raw or AIFF) to Ogg Vorbis format. On Ubuntu I had to install the "vorbis-tools" package to get "oggenc".
# Minimal example
oggenc audio.wav -o audio.ogg
# Setting the bit rate, downmix to mono and set the sample rate:
oggenc -b 32 --downmix --resample 22050 input.wav -o output.ogg
Getting information about audio files
To get basic information about an audio file (like the number of channels, sample rate, duration, etc), there is the 'soxi' tool, which is part of the sox package:
soxi file.mp3
which returns something like:
Input File : 'file.mp3'
Channels : 2
Sample Rate : 44100
Precision : 16-bit
Duration : 00:03:55.35 = 10378847 samples = 17651.1 CDDA sectors
File Size : 1.88M
Bit Rate : 64.0k
Sample Encoding: MPEG audio (layer I, II or III)
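The three values on the Duration line are the same quantity in different units: samples divided by the sample rate gives seconds, and one CDDA sector holds 1/75 of a second. A short sketch of that arithmetic, using the figures from the output above, is shown here. The helper names are assumptions for illustration.

```shell
# Hypothetical helpers: relate the numbers on soxi's Duration line.
# Duration in seconds from a sample count and a sample rate.
samples_to_seconds() {
  awk -v n="$1" -v r="$2" 'BEGIN { printf "%.2f\n", n / r }'
}
# CDDA sectors: one sector holds 1/75 of a second of audio.
samples_to_sectors() {
  awk -v n="$1" -v r="$2" 'BEGIN { printf "%.1f\n", n / r * 75 }'
}

samples_to_seconds 10378847 44100   # prints 235.35, i.e. 00:03:55.35
samples_to_sectors 10378847 44100   # prints 17651.1
```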
You can easily specify multiple files too.
When soxi is not available (e.g. it isn't on Ubuntu 8.04) or when soxi does not recognize the file format, there are some alternatives based on FFmpeg and MPlayer.
With FFmpeg, just don't specify an output file, for example:
ffmpeg -i file.mp3
which returns something like:
... [version information] ...
Input #0, mp3, from 'file.mp3':
Duration: 00:03:55.2, start: 0.000000, bitrate: 63 kb/s
Stream #0.0: Audio: mp2, 44100 Hz, stereo, 64 kb/s
Must supply at least one output file
You can supply several files, but you need to put the flag -i in front of each one.
With MPlayer, it's a bit more involved:
mplayer -vo null -ao null -frames 0 -identify file.mp3
which returns something like:
... [version information] ...
Playing file.mp3.
ID_AUDIO_ID=0
Audio file file format detected.
ID_FILENAME=file.mp3
ID_DEMUXER=audio
ID_AUDIO_FORMAT=80
ID_AUDIO_BITRATE=64000
ID_AUDIO_RATE=44100
ID_AUDIO_NCH=0
ID_LENGTH=235.00
========================================================
Forced audio codec: mad
Opening audio decoder: [libmad] libmad mpeg audio decoder
AUDIO: 44100 Hz, 2 ch, s16le, 64.0 kbit/4.54% (ratio: 8000->176400)
ID_AUDIO_BITRATE=64000
ID_AUDIO_RATE=44100
ID_AUDIO_NCH=2
Selected audio codec: [mad] afm: libmad (libMAD MPEG layer 1-2-3)
========================================================
AO: [null] 44100Hz 2ch s16le (2 bytes per sample)
ID_AUDIO_CODEC=mad
Video: no video
Starting playback...
Exiting... (End of file)
15 Awesome Examples to Manipulate Audio Files Using Sound eXchange (SoX)
This article is part of the on-going Software for Geeks series. SoX stands for Sound eXchange. SoX is a cross-platform command line audio utility tool that works on Linux, Windows and MacOS. It is very helpful in the following areas while dealing with audio and music files.
- Audio File Converter
- Editing audio files
- Changing audio attributes
- Adding audio effects
- Plus a lot of advanced sound manipulation features
In general, audio data is described by following four characteristics:
- Rate – The sample rate is in samples per second. For example, 44100/8000
- Data size – The precision the data is stored in. For example, 8/16 bits
- Data encoding – What encoding the data type uses. For example, u-law,a-law
- Channels – How many channels are contained in the audio data. For example, Stereo 2 channels
1. Combine Multiple Audio Files to Single File
With the -m option, sox mixes two input files together to produce its output. The example below mixes first_part.wav and second_part.wav, leaving the result in whole_part.wav. Older SoX releases also provide a soxmix command for this purpose.
sox -m first_part.wav second_part.wav whole_part.wav
soxmix first_part.wav second_part.wav whole_part.wav
2. Extract Part of the Audio File
Trim can trim off unwanted audio from the audio file.
sox old.wav new.wav trim [SECOND TO START] [SECONDS DURATION]
- SECOND TO START – Starting point in the audio file.
- SECONDS DURATION – Duration of audio to keep from that point.
The command below extracts the first 10 seconds of input.wav and stores them in output.wav
sox input.wav output.wav trim 0 10
3. Increase / Decrease the Volume
-v changes the volume.
sox -v 2.0 foo.wav bar.wav # raise the volume
(-0.5) will be louder than (-0.1); the minus sign only inverts the signal, so the magnitude is what sets the volume.
sox -v -0.5 in.wav out.wav # lower the volume
sox -v -0.1 in.wav out.wav # lower it even more
4. Get Audio File Information
The stat effect provides a lot of statistical information about a given audio file. The -n flag in place of the output file (written -e in very old SoX releases) tells sox not to generate any output other than the statistical information.
sox in.wav -n stat
Samples read: 3528000
Length (seconds): 40.000000
Scaled by: 2147483647.0
Maximum amplitude: 0.999969
Minimum amplitude: -1.000000
Midline amplitude: -0.000015
Mean norm: 0.217511
Mean amplitude: 0.003408
RMS amplitude: 0.283895
Maximum delta: 1.478455
Minimum delta: 0.000000
Mean delta: 0.115616
RMS delta: 0.161088
Rough frequency: 3982
Volume adjustment: 1.000
5. Play an Audio Song
Playing a sound file is accomplished by copying the file to the device special file /dev/dsp. The following command plays the file music.wav; the -t option specifies the type of the file /dev/dsp.
sox music.wav -t ossdsp /dev/dsp
You can also use play command to play the audio file as shown below.
play options Filename audio_effects
play -r 8000 -w music.wav
6. Play an Audio Song Backwards
Use the reverse effect to reverse the sound in a sound file.
sox input.wav output.wav reverse
You can also use play command to hear the song in reverse without modifying the source file as shown below.
play test.wav reverse
7. Record a Voice File
‘play’ and ‘rec’ commands are companion commands for sox . /dev/dsp is the digital sampling and digital recording device. Reading the device activates the A/D converter for sound recording and analysis. /dev/dsp file works for both playing and recording sound samples.
sox -t ossdsp /dev/dsp test.wav
You can also use rec command for recording voice. If SoX is invoked as ‘rec’ the default sound device is used as an input source.
rec -r 8000 -c 1 record_voice.wav
8. Changing the Sampling Rate of a Sound File
To change the sampling rate of a sound file, use option -r followed by the sample rate to use, in Hertz. Use the following example, to change the sampling rate of file ‘old.wav’ to 16000 Hz, and write the output to ‘new.wav’
sox old.wav -r 16000 new.wav
9. Changing the Sampling Size of a Sound File
If we increase the sampling size , we will get better quality. Sample Size for audio is most often expressed as 8 bits or 16 bits. 8bit audio is more often used for voice recording.
- -b Sample data size in bytes
- -w Sample data size in words
- -l Sample data size in long words
- -d Sample data size in double long words
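The following example converts an 8-bit audio file to a 16-bit audio file. These letter flags belong to older SoX releases; with the -b BITS form, the equivalent would be sox input.wav -b 16 output.wav.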
sox -b input.wav -w output.wav
10. Changing the Number of Channels
The following example converts mono audio files to stereo. Use Option -c to specify the number of channels .
sox mono.wav -c 2 stereo.wav
There are methods to convert stereo sound files to mono sound. i.e to get a single channel from stereo file.
Selecting a Particular Channel
This is done by using the avg effect with an option indicating which channel to use. The options are -l for left, -r for right, -f for front, and -b for back. (avg belongs to older SoX releases; the remix effect does the same job in current ones.) The following example extracts the left channel:
sox stereo.wav -c 1 mono.wav avg -l
Average the Channels
sox stereo.wav -c 1 mono.wav avg
11. Audio Converter – Music File Format Conversion
SoX can convert one audio format to another, i.e. from one encoding (A-law, MP3) to another. SoX recognizes the input and desired output formats by parsing the file name extensions. The example below takes infile.ulaw and creates a GSM-encoded file called outfile.gsm. You can also use sox to convert wav to mp3.
sox infile.ulaw outfile.gsm
If the file name has no extension, the -t option states the format explicitly. Option -t specifies the file type.
sox -t ulaw infile -t gsm outfile
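Since SoX infers formats from the extensions, converting a whole directory comes down to deriving an output name for each input. A minimal sketch follows, assuming a set of .au files in the current directory. The helper name wav_name and the choice of .au and .wav are illustrative, not taken from the article.

```shell
# Hypothetical helper: derive the output name by swapping the extension.
wav_name() {
  printf '%s.wav\n' "${1%.*}"
}

# Convert every .au file in the current directory; SoX picks the formats
# from the extensions. The loop body is skipped when no file matches or
# when sox is not installed.
for f in *.au; do
  [ -e "$f" ] || continue
  command -v sox >/dev/null 2>&1 || continue
  sox "$f" "$(wav_name "$f")"
done
```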
12. Generate Different Types of Sounds
The synth effect generates a number of standard waveforms and types of noise. Although this effect generates audio, an input file must still be given; the -n option specifies the null file as input.
sox -n synth len type freq
- len – length of audio to synthesize. Format for specifying lengths in time is hh:mm:ss.frac
- type – one of sine, square, triangle, sawtooth, trapezium, exp, [white]noise, pinknoise, brownnoise. Default is sine
- freq – frequencies at the beginning/end of synthesis in Hz
The following example produces a 3-second audio file at a sample rate of 8000 Hz, containing a sine wave swept from 300 to 3300 Hz
sox -r 8000 -n output.au synth 3 sine 300-3300
13. Speed up the Sound in an Audio File
To speed up or slow down the sound of a file, use the speed effect, which modifies both the pitch and the duration of the file. The default factor is 1.0, which makes no change to the audio. A factor of 2.0 doubles the speed, so the duration is cut in half and the pitch is one octave higher.
sox input.wav output.wav speed factor
sox input.wav output.wav speed 2.0
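The effect of a given factor can be estimated before running the command: the duration is divided by the factor, and the pitch moves by 12 * log2(factor) semitones. The sketch below only does that arithmetic. The helper names are made up for illustration and are not SoX commands.

```shell
# Hypothetical helpers (not SoX commands) for the arithmetic behind speed.
# New duration in seconds after applying a speed factor.
speed_duration() {
  awk -v d="$1" -v f="$2" 'BEGIN { printf "%.2f\n", d / f }'
}
# Pitch change in semitones: 12 * log2(factor).
speed_semitones() {
  awk -v f="$1" 'BEGIN { printf "%.2f\n", 12 * log(f) / log(2) }'
}

speed_duration 40 2.0    # a 40-second file becomes 20.00 seconds long
speed_semitones 2.0      # prints 12.00, i.e. twelve semitones up
```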
14. Multiple Changes to Audio File in Single Command
By default, SoX attempts to write audio data using the same data type, sample rate and channel count as per the input data. If the user wants the output file to be of a different format then user has to specify format options. If an output file format doesn’t support the same data type, sample rate, or channel count as the given input file format, then SoX will automatically select the closest values which it supports.
Converting a wav to raw. The following example converts the sampling rate, sample size and channel count in a single command line.
sox -r 8000 -w -c 1 -t wav source -r 16000 -b -c 2 -t raw destination
15. Convert Raw Audio File to MP3 Music File
There is no way to directly convert a raw file to mp3, because mp3 will require compression information from the raw file. First we need to convert raw to wav, and then convert wav to mp3. In the example below, option -h indicates high quality.
Convert Raw Format to Wav Format:
sox -w -c 2 -r 8000 audio1.raw audio1.wav
Convert Wav Format to MP3 Format:
lame -h audio1.wav audio1.mp3
Need some help. I am trying to execute the soxmix command from a PHP call, but it does nothing… can you show me how to do it?
$output = array();
$result = -1;
exec('/usr/bin/soxmix test/exports/audiotest1.wav audio/003-half.wav test/exports/mixedaudio.wav', $output, $result);
var_dump($output, $result);
Your instructions for decreasing the volume are incorrect. Positive numbers less than 1 decrease the volume. You can make those numbers negative, but that will just invert the phase; it has no effect on volume. I would suggest you replace your examples with these:
sox -v 0.5 srcfile.wav test05.wav
sox -v 0.1 srcfile.wav test01.wav
The first command reduces the gain by 50%, the second by 90%.
--
sox YourInputFilename -n stats
-n means “null output”, in other words, just provide the stats and don’t make a new audio file.
If you cannot convert mp3 files, install this package:
sudo apt-get install libsox-fmt-mp3
trim only parts over a certain level ?
sox -V -S -b32 inputfile.mp3 outputfile.wv norm -3 rate -vMa 28224k rate -vMa 192k
A simple example of upsampling and harmonizing: aliasing is allowed so that higher harmonics grow, the aliasing distortion is reduced by the very high intermediate rate, and this technique produces the best sound coloring I have ever heard.