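Acquire and Analyze Audio Data from a Mobile Device
This example shows how to use the fast Fourier transform (FFT) to analyze microphone audio data acquired with MATLAB Mobile. You need a mobile device with Sensor Access turned on. To turn on Sensor Access, in MATLAB Mobile go to Sensors > More and enable Sensor Access.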
Connect to the Device
Create a connection to the mobile device and enable the microphone.
mobileDevObject = mobiledev;
mobileDevObject.MicrophoneEnabled = 1;
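Record Background Noise
Start logging to record background noise, then read the two seconds of recorded background audio. The recorded data is a matrix of doubles whose size is the number of samples by the number of channels. Find the maximum absolute value to set the threshold that separates background noise from detected speech.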
mobileDevObject.logging = 1;
disp('Sampling background noise...')
pause(2)
mobileDevObject.logging = 0;
audioData = readAudio(mobileDevObject);
disp('Maximum sound of the background noise: ')
threshold = max(abs(audioData), [], "all")
Sampling background noise...
Maximum sound of the background noise: 
threshold = 0.0122
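To see how the threshold calculation behaves on a samples-by-channels matrix without a device, here is a minimal sketch that uses synthetic data. The variable fakeBackground and its values are made up for illustration; only the max(abs(...), [], "all") call comes from the example.

```matlab
% Synthetic stand-in for two channels of background noise (hypothetical values).
fakeBackground = [ 0.004 -0.010;
                  -0.012  0.003;
                   0.007 -0.002];

% Largest absolute sample across all rows and all channels.
threshold = max(abs(fakeBackground), [], "all");

% Level that a later window must exceed to count as speech in this example.
speechLevel = 1.5 * threshold;
fprintf("threshold = %.4f, speech level = %.4f\n", threshold, speechLevel);
```

The "all" option collapses both dimensions at once, so the result is a scalar even for multichannel recordings.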
Record Speech
Start recording speech.
disp('Speak into the device microphone for a few seconds. For example, say: ')
mobileDevObject.logging = 1;
tic
disp('"Testing MATLAB Mobile Audio"')
startTime = 0;
totalAudio = [];
Speak into the device microphone for a few seconds. For example, say: 
"Testing MATLAB Mobile Audio"
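Detect Speech to Trigger Data Acquisition
Try to detect speech for up to 5 seconds. Pause every 200 ms and read the buffer. If the maximum value in the current window is more than 1.5 times the threshold, discard the background audio collected so far and start collecting the speech data. If no speech is detected, the script processes the audio collected during those 5 seconds instead.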
while toc < 5 && startTime == 0
    pause(.2)
    audioData = readAudio(mobileDevObject);
    % Compare the window maximum over all channels, matching how
    % threshold was computed.
    if max(abs(audioData), [], "all") > threshold * 1.5
        startTime = toc
        totalAudio = audioData;
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end
startTime = 1.4202
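The trigger logic can be tried without a device. The following sketch replaces each 200 ms buffer read with an entry in a cell array; chunks, triggerChunk, and the numeric values are hypothetical and exist only for this illustration.

```matlab
% Hypothetical stand-ins: each cell plays the role of one 200 ms buffer read.
threshold = 0.01;
chunks = {0.005*ones(4,1), 0.008*ones(4,1), 0.05*ones(4,1), 0.04*ones(4,1)};

totalAudio = [];
triggerChunk = 0;                    % 0 means no speech detected yet
for k = 1:numel(chunks)
    audioData = chunks{k};
    if max(abs(audioData), [], "all") > 1.5 * threshold
        triggerChunk = k;            % speech starts in this chunk
        totalAudio = audioData;      % drop the background collected so far
        break
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end
fprintf("Speech detected in chunk %d\n", triggerChunk);
```

With these values the first two chunks stay below 1.5 times the threshold, so they are accumulated and then discarded when the third chunk triggers.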
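Acquire Audio Data
Pause every 200 ms and read the buffer. Keep acquiring audio until the speech ends or the 10-second timeout (measured from the start of the recording) is reached. If no speech is detected in two consecutive 200 ms windows (400 ms), stop the acquisition.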
if startTime ~= 0
    numberOfIntervalsStopped = 0;
    while numberOfIntervalsStopped < 2 && toc < 10
        pause(.2)
        audioData = readAudio(mobileDevObject);
        if max(abs(audioData), [], "all") < threshold * 1.5
            % Quiet window: count it toward the stop condition.
            numberOfIntervalsStopped = numberOfIntervalsStopped + 1;
        else
            % Speech resumed: reset the counter.
            numberOfIntervalsStopped = 0;
        end
        totalAudio = vertcat(totalAudio,audioData);
    end
end
mobileDevObject.logging = 0;
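The stop condition depends on consecutive quiet windows, not on the total number of quiet windows. The sketch below illustrates this with a made-up sequence of window peak levels; levels, quietCount, and chunksRead are hypothetical names that stand in for the buffer reads and counter used in the example.

```matlab
threshold = 0.01;
% Hypothetical peak level of each 200 ms window:
% loud, loud, quiet, loud, quiet, quiet, loud
levels = [0.05 0.04 0.005 0.06 0.004 0.003 0.05];

quietCount = 0;     % consecutive quiet windows seen so far
chunksRead = 0;
while quietCount < 2 && chunksRead < numel(levels)
    chunksRead = chunksRead + 1;
    if levels(chunksRead) < 1.5 * threshold
        quietCount = quietCount + 1;
    else
        quietCount = 0;   % a loud window resets the counter
    end
end
fprintf("Stopped after %d windows\n", chunksRead);
```

The single quiet window in position 3 does not end the loop because the next window is loud again; the loop ends only after windows 5 and 6 are both quiet.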
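Preprocess the Audio Data
Only one channel of data is needed. n is the number of samples in leftAudio and is used for plotting and processing the data. Get the microphone sample rate, which is needed later to determine the frequency scale.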
endTime = toc;
leftAudio = totalAudio(:,1);
n = numel(leftAudio);
if n == 0
    disp(' ')
    disp('No audio data recorded. Try to run the script again.')
    clear mobileDevObject
    return
end
sampleRate = mobileDevObject.Microphone.SampleRate;
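Plot the Audio Data in the Time Domain
Use the elapsed time to determine the timestamps for the tick marks on the plot. Convert each timestamp to its corresponding sample index to find its position on the x-axis, and display these positions with the xticks function. Use the original ticks array, in seconds, as the tick labels.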
figure(1);
plot(leftAudio)
title('Sound wave');
timeElapsed = endTime - startTime
ticks = 0:floor(timeElapsed);
sampleTicks = ticks * n/timeElapsed;
xticks(sampleTicks)
xticklabels(ticks)
xlabel('Time(s)')
ylabel('Amplitude')
timeElapsed = 8.7632
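As a worked example of the tick conversion, the following sketch maps whole seconds to sample positions for a made-up recording. The values of n and timeElapsed are hypothetical and chosen so the arithmetic is easy to follow.

```matlab
n = 40000;             % hypothetical number of samples
timeElapsed = 2.5;     % hypothetical recording length in seconds

% One tick per whole second that fits in the recording.
ticks = 0:floor(timeElapsed);

% Each second spans n/timeElapsed samples, so scale the ticks by that factor.
sampleTicks = ticks * n / timeElapsed;

disp([ticks; sampleTicks])
```

Here each second covers 16000 samples, so the ticks for 0, 1, and 2 seconds fall at samples 0, 16000, and 32000. Passing sampleTicks to xticks and ticks to xticklabels then labels the sample axis in seconds.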
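Process the Audio Data in the Frequency Domain
Use the fft function to convert the original time-domain data into frequency-domain amplitudes.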
fftData = fft(leftAudio);
% Signal length is equal to the number of samples.
signalLength = n;
% Normalize the FFT data by dividing by signalLength.
fftNormal = abs(fftData/signalLength);
% The second half of the FFT data is a reflection of the first half
% and is not relevant in this case, so remove those values.
fftNormal = fftNormal(1:floor(signalLength/2)+1);
% Multiply the final values by 2 to account for removed values.
fftNormal(2:end-1) = 2*fftNormal(2:end-1);
% freqs is the x-axis scale of the graph.
freqs = sampleRate*(0:(signalLength/2))/signalLength;
% Convert factor from index to frequency.
scale = sampleRate/signalLength;
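To check the normalization and folding steps, it can help to run them on a signal whose spectrum is known in advance. The sketch below applies the same steps to a synthetic 125 Hz sine wave; the sample rate, duration, and amplitude are assumptions chosen so that the tone falls exactly on one FFT bin.

```matlab
sampleRate = 8000;                  % hypothetical sample rate in Hz
n = 8000;                           % one second of samples
t = (0:n-1)' / sampleRate;
x = 0.5 * sin(2*pi*125*t);          % 125 Hz tone with amplitude 0.5

spectrum = abs(fft(x) / n);                   % normalized two-sided magnitude
spectrum = spectrum(1:floor(n/2)+1);          % keep 0 Hz through Nyquist
spectrum(2:end-1) = 2 * spectrum(2:end-1);    % fold in the mirrored half
freqs = sampleRate * (0:floor(n/2)) / n;      % frequency of each kept bin

[peakAmp, peakIdx] = max(spectrum);
fprintf("Peak %.3f at %g Hz\n", peakAmp, freqs(peakIdx));
```

After normalization and doubling, the peak of the single-sided spectrum recovers the sine amplitude of about 0.5 at 125 Hz. For real speech the energy spreads over many bins, so the peak values are smaller and less sharply defined.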
Plot the Frequency-Domain Audio Data from 0 to 1000 Hz
cutoff = 1000/scale;
figure(2);
plot(freqs(1:floor(cutoff)),fftNormal(1:floor(cutoff)))
title("Frequency Domain Graph")
xlabel("Frequency (Hz)")
ylabel("Amplitude")
ax = gca;
ax.XAxis.Exponent = 0;
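The cutoff is the number of frequency bins that lie below 1000 Hz. As a small illustration with made-up values, the sketch below computes that count and selects the matching part of the frequency axis. The min guard is an addition for this sketch, not part of the example; it keeps the index in range if the sample rate is too low for the spectrum to reach 1000 Hz.

```matlab
sampleRate = 8000;                  % hypothetical sample rate in Hz
signalLength = 4000;                % hypothetical number of samples
scale = sampleRate / signalLength;  % 2 Hz per bin
freqs = sampleRate * (0:floor(signalLength/2)) / signalLength;

cutoff = floor(1000 / scale);            % number of bins below 1000 Hz
lastIdx = min(cutoff, numel(freqs));     % guard against short spectra
shownFreqs = freqs(1:lastIdx);

fprintf("Showing %d bins, up to %g Hz\n", lastIdx, shownFreqs(end));
```

With 2 Hz per bin, 500 bins are shown and the last one is at 998 Hz, because the first bin is 0 Hz.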
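Final Frequency Analysis and Cleanup
Print the dominant frequency, which corresponds to the index of the largest amplitude in the FFT result. Convert that index to Hz using the computed scale.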
[~, mInd] = max(fftNormal);
% Index 1 is the 0 Hz bin, so subtract 1 before converting to Hz.
fprintf("Dominant frequency: %d Hz\n",floor((mInd - 1) * scale));
if startTime == 0
    disp(' ')
    disp('The voice of the speech is too low compared to the background noise, analysis might not be precise. Try to run the script again and speak louder.');
end
clear mobileDevObject
Dominant frequency: 125 Hz
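When converting a spectrum index to a frequency, note that MATLAB indices start at 1 while the first FFT bin is 0 Hz, so index k corresponds to (k - 1) * scale. The following sketch shows the difference with made-up values; the index mInd is chosen by hand rather than taken from a real recording.

```matlab
sampleRate = 8000;                  % hypothetical sample rate in Hz
signalLength = 16000;               % hypothetical number of samples
scale = sampleRate / signalLength;  % 0.5 Hz per bin
freqs = sampleRate * (0:floor(signalLength/2)) / signalLength;

mInd = 251;                         % hypothetical index of the largest amplitude

dominantHz = (mInd - 1) * scale;    % frequency of bin 251
naiveHz = mInd * scale;             % one bin too high

fprintf("Bin %d is %g Hz (freqs lookup: %g Hz)\n", mInd, dominantHz, freqs(mInd));
```

The difference is one bin width, which is small for long recordings but grows as the recording gets shorter. Looking up freqs(mInd) gives the same result as (mInd - 1) * scale and avoids the offset altogether.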