Reputation: 1053
I want to know how to get samples out of a .wav file in order to perform windowed join of two .wav files.
Can any one please tell how to do this?
Upvotes: 5
Views: 14864
Reputation: 4063
After reading the samples (for example with the wave module, more details here) you may want to have the values scales between -1 and 1 (this is the convention for audio signals).
In this case, you can add:
# scale to -1.0 -- 1.0
max_nb_bit = float(2**(nb_bits-1))
samples = signal_int / (max_nb_bit + 1.0)
with nb_bits
the bit depth and signal_int
the integers values.
Upvotes: 0
Reputation: 48436
Here's a function to read samples from a wave file (tested with mono & stereo):
def read_samples(wave_file, nb_frames):
frame_data = wave_file.readframes(nb_frames)
if frame_data:
sample_width = wave_file.getsampwidth()
nb_samples = len(frame_data) // sample_width
format = {1:"%db", 2:"<%dh", 4:"<%dl"}[sample_width] % nb_samples
return struct.unpack(format, frame_data)
else:
return ()
And here's the full script that does windowed mixing or concatenating of multiple .wav
files. All input files need to have the same params (# of channels and sample width).
import argparse
import itertools
import struct
import sys
import wave
def _struct_format(sample_width, nb_samples):
return {1:"%db", 2:"<%dh", 4:"<%dl"}[sample_width] % nb_samples
def _mix_samples(samples):
return sum(samples)//len(samples)
def read_samples(wave_file, nb_frames):
frame_data = wave_file.readframes(nb_frames)
if frame_data:
sample_width = wave_file.getsampwidth()
nb_samples = len(frame_data) // sample_width
format = _struct_format(sample_width, nb_samples)
return struct.unpack(format, frame_data)
else:
return ()
def write_samples(wave_file, samples, sample_width):
format = _struct_format(sample_width, len(samples))
frame_data = struct.pack(format, *samples)
wave_file.writeframes(frame_data)
def compatible_input_wave_files(input_wave_files):
nchannels, sampwidth, framerate, nframes, comptype, compname = input_wave_files[0].getparams()
for input_wave_file in input_wave_files[1:]:
nc,sw,fr,nf,ct,cn = input_wave_file.getparams()
if (nc,sw,fr,ct,cn) != (nchannels, sampwidth, framerate, comptype, compname):
return False
return True
def mix_wave_files(output_wave_file, input_wave_files, buffer_size):
output_wave_file.setparams(input_wave_files[0].getparams())
sampwidth = input_wave_files[0].getsampwidth()
max_nb_frames = max([input_wave_file.getnframes() for input_wave_file in input_wave_files])
for frame_window in xrange(max_nb_frames // buffer_size + 1):
all_samples = [read_samples(wave_file, buffer_size) for wave_file in input_wave_files]
mixed_samples = [_mix_samples(samples) for samples in itertools.izip_longest(*all_samples, fillvalue=0)]
write_samples(output_wave_file, mixed_samples, sampwidth)
def concatenate_wave_files(output_wave_file, input_wave_files, buffer_size):
output_wave_file.setparams(input_wave_files[0].getparams())
sampwidth = input_wave_files[0].getsampwidth()
for input_wave_file in input_wave_files:
nb_frames = input_wave_file.getnframes()
for frame_window in xrange(nb_frames // buffer_size + 1):
samples = read_samples(input_wave_file, buffer_size)
if samples:
write_samples(output_wave_file, samples, sampwidth)
def argument_parser():
parser = argparse.ArgumentParser(description='Mix or concatenate multiple .wav files')
parser.add_argument('command', choices = ("mix", "concat"), help='command')
parser.add_argument('output_file', help='ouput .wav file')
parser.add_argument('input_files', metavar="input_file", help='input .wav files', nargs="+")
parser.add_argument('--buffer_size', type=int, help='nb of frames to read per iteration', default=1000)
return parser
if __name__ == '__main__':
args = argument_parser().parse_args()
input_wave_files = [wave.open(name,"rb") for name in args.input_files]
if not compatible_input_wave_files(input_wave_files):
print "ERROR: mixed wave files must have the same params."
sys.exit(2)
output_wave_file = wave.open(args.output_file, "wb")
if args.command == "mix":
mix_wave_files(output_wave_file, input_wave_files, args.buffer_size)
elif args.command == "concat":
concatenate_wave_files(output_wave_file, input_wave_files, args.buffer_size)
output_wave_file.close()
for input_wave_file in input_wave_files:
input_wave_file.close()
Upvotes: 2
Reputation: 881497
The wave module of the standard library is the key: after of course import wave
at the top of your code, wave.open('the.wav', 'r')
returns a "wave read" object from which you can read frames with the .readframes
method, which returns a string of bytes which are the samples... in whatever format the wave file has them (you can determine the two parameters relevant to decomposing frames into samples with the .getnchannels
method for the number of channels, and .getsampwidth
for the number of bytes per sample).
The best way to turn the string of bytes into a sequence of numeric values is with the array
module, and a type of (respectively) 'B'
, 'H'
, 'L'
for 1, 2, 4 bytes per sample (on a 32-bit build of Python; you can use the itemsize
value of your array object to double-check this). If you have different sample widths than array
can provide you, you'll need to slice up the byte string (padding each little slice appropriately with bytes worth 0) and use the struct module instead (but that's clunkier and slower, so use array
instead if you can).
Upvotes: 13
Reputation: 41306
You can use the wave
module. First you should read the metadata, such us sample size or the number of channels. Using the readframes()
method, you can read samples, but only as a byte string. Based on the sample format, you have to convert them to samples using struct.unpack()
.
Alternatively, if you want the samples as an array of floating-point numbers, you can use SciPy's io.wavfile
module.
Upvotes: 2