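Acquire and Analyze Audio Data from a Mobile Device

This example shows how to use an FFT (fast Fourier transform) to analyze microphone audio data acquired with MATLAB Mobile. You need a mobile device with Sensor Access turned on. To turn it on, go to Sensors > More in MATLAB Mobile and enable Sensor Access.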

Connect to the Device

Create a connection to the mobile device and enable the microphone.

mobileDevObject = mobiledev;
mobileDevObject.MicrophoneEnabled = 1;

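Record Background Noise

Turn on the microphone to start recording background noise, then read the two seconds of recorded background audio. The recorded data is a matrix of doubles of size NumSamples-by-NumChannels. Take the maximum absolute value as the threshold that separates background noise from detected speech.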

mobileDevObject.logging = 1;
disp('Sampling background noise...')
pause(2)
mobileDevObject.logging = 0;
audioData = readAudio(mobileDevObject);
disp('Maximum sound of the background noise: ')
threshold = max(abs(audioData), [], "all")
Sampling background noise...
Maximum sound of the background noise: 

threshold =

    0.0122
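
To see how the threshold reduction behaves without a device, the following minimal sketch applies the same max(abs(...), [], "all") call to synthetic two-channel noise. The names noiseData, noiseThreshold, and perChannelMax, as well as the noise amplitude, are illustrative assumptions and do not come from MATLAB Mobile.

```matlab
% Synthetic stand-in for readAudio output: a samples-by-channels matrix.
rng(0);                                    % reproducible noise
noiseData = 0.01 * (2*rand(2000, 2) - 1);  % values in [-0.01, 0.01]

% Largest absolute sample across all rows and all channels (a scalar).
noiseThreshold = max(abs(noiseData), [], "all");

% For comparison, max without "all" returns one value per channel.
perChannelMax = max(abs(noiseData));

fprintf("Threshold: %.4f\n", noiseThreshold);
```

The "all" option matters for multichannel data: without it, max works column by column and returns a row vector instead of a single threshold.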

Record Speech

Start recording speech.

disp('Speak into the device microphone for a few seconds. For example, say: ')
mobileDevObject.logging = 1;
tic
disp('"Testing MATLAB Mobile Audio"')
startTime = 0;
totalAudio = [];
Speak into the device microphone for a few seconds. For example, say: 
"Testing MATLAB Mobile Audio"

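Detect Speech to Trigger Acquisition

Try to detect speech for up to 5 seconds. Pause for 200 ms at a time and read the buffer. If the maximum of a window exceeds 1.5 times the threshold, discard the background audio collected so far and start acquiring the speech data. If no speech is detected, the script processes the audio collected during those 5 seconds.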

while toc < 5 && startTime == 0
    pause(.2)
    audioData = readAudio(mobileDevObject);
    if max(abs(audioData), [], "all") > threshold * 1.5
        startTime = toc
        totalAudio = audioData;
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end
startTime =

    1.4202
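
The trigger logic can be tried offline. The sketch below is a simplified illustration, not the example's own code: it replaces successive readAudio calls with a hypothetical cell array named chunks, and the threshold and amplitudes are made-up values.

```matlab
% Hypothetical stand-in for successive readAudio calls.
threshold = 0.01;                  % assumed background maximum
quiet = 0.005 * ones(100, 1);      % below 1.5 * threshold
loud  = 0.05  * ones(100, 1);      % above 1.5 * threshold
chunks = {quiet, quiet, loud, loud};

totalAudio = [];
triggerIndex = 0;                  % 0 means no speech detected yet
for k = 1:numel(chunks)
    audioData = chunks{k};
    if max(abs(audioData), [], "all") > 1.5 * threshold
        triggerIndex = k;          % speech starts in this chunk
        totalAudio = audioData;    % discard earlier background audio
        break
    else
        totalAudio = vertcat(totalAudio, audioData);
    end
end

fprintf("Triggered on chunk %d\n", triggerIndex);
```

With these inputs the third chunk is the first to exceed the limit, so totalAudio holds only that chunk when the loop exits.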

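Acquire Audio Data

Pause for 200 ms at a time and read the buffer. Keep acquiring audio until the speech ends or the timeout is reached. If no speech is detected for 400 ms (two consecutive intervals), stop the acquisition.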

if startTime ~= 0
    numberOfIntervalsStopped = 0;
    while numberOfIntervalsStopped < 2 && toc < 10
        pause(.2)
        audioData = readAudio(mobileDevObject);
        if max(abs(audioData), [], "all") < threshold * 1.5
            numberOfIntervalsStopped = numberOfIntervalsStopped + 1;
        else
            numberOfIntervalsStopped = 0;
        end
        totalAudio = vertcat(totalAudio,audioData);
    end
end
mobileDevObject.logging = 0;
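
The stop condition relies on a counter of consecutive quiet intervals that resets whenever speech is heard again. The following sketch illustrates that counter on synthetic data; the chunks cell array and the amplitudes are hypothetical, and the loop index k stands in for the elapsed-time check.

```matlab
% Hypothetical sequence of 200 ms reads: speech with one short gap, then silence.
threshold = 0.01;
quiet = 0.005 * ones(100, 1);
loud  = 0.05  * ones(100, 1);
chunks = {loud, quiet, loud, quiet, quiet, loud};

totalAudio = [];
numberOfIntervalsStopped = 0;
k = 0;
while numberOfIntervalsStopped < 2 && k < numel(chunks)
    k = k + 1;
    audioData = chunks{k};
    if max(abs(audioData), [], "all") < 1.5 * threshold
        % Quiet interval: count it toward the stop condition.
        numberOfIntervalsStopped = numberOfIntervalsStopped + 1;
    else
        % Speech resumed: start counting again.
        numberOfIntervalsStopped = 0;
    end
    totalAudio = vertcat(totalAudio, audioData);
end

fprintf("Stopped after %d chunks\n", k);
```

The single quiet chunk in position 2 does not end the recording because the next chunk resets the counter. Only the two quiet chunks in a row (positions 4 and 5) do, so the final loud chunk is never read.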

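Preprocess the Audio Data

Only one channel of data is needed. n is the number of samples in leftAudio and is used for plotting and processing. Get the microphone sample rate, which is used later to determine the frequency scale.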

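endTime = toc;
% Check for an empty recording before selecting a channel; indexing
% column 1 of an empty array can raise an error.
if isempty(totalAudio)
    disp(' ')
    disp('No audio data recorded. Try to run the script again.')
    clear mobileDevObject
    return
end
leftAudio = totalAudio(:,1);
n = numel(leftAudio);
sampleRate = mobileDevObject.Microphone.SampleRate;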

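Plot Audio Data in the Time Domain

Use the elapsed time to determine the timestamps for the tick marks on the plot. Convert the timestamps to their corresponding sample indices to find their positions on the x-axis. Display these positions with the xticks function, and use the original ticks array as the labels.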

figure(1);
plot(leftAudio)
title('Sound wave');
timeElapsed = endTime - startTime
ticks = 0:floor(timeElapsed);
sampleTicks = ticks * n/timeElapsed;
xticks(sampleTicks)
xticklabels(ticks)
xlabel('Time (s)')
ylabel('Amplitude')
timeElapsed =

    8.7632
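
The tick conversion is plain proportional scaling. As a worked sketch with hypothetical numbers (35000 samples over 3.5 seconds, chosen only to keep the arithmetic round):

```matlab
% Hypothetical values: 35000 samples recorded over 3.5 seconds.
n = 35000;
timeElapsed = 3.5;

% One tick per whole second that fits in the recording.
ticks = 0:floor(timeElapsed);            % [0 1 2 3]

% n/timeElapsed is samples per second, so second k sits at sample k*n/timeElapsed.
sampleTicks = ticks * n / timeElapsed;   % [0 10000 20000 30000]

disp(sampleTicks)
```

Passing sampleTicks to xticks and ticks to xticklabels then labels sample positions 0, 10000, 20000, and 30000 as 0, 1, 2, and 3 seconds.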

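Process Audio Data in the Frequency Domain

Use the fft function to convert the original time-domain amplitude data to the frequency domain.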

fftData = fft(leftAudio);
% Signal length is equal to the number of samples.
signalLength = n;
% Normalize the FFT data by dividing by signalLength.
fftNormal = abs(fftData/signalLength);
% The second half of the FFT data is a reflection of the first half
% and is not relevant in this case, so remove those values.
fftNormal = fftNormal(1:floor(signalLength/2)+1);
% Multiply the final values by 2 to account for removed values.
fftNormal(2:end-1) = 2*fftNormal(2:end-1);
% freqs is the x-axis scale of the graph.
freqs = sampleRate*(0:(signalLength/2))/signalLength;
% Conversion factor from index to frequency (Hz per FFT bin).
scale = sampleRate/signalLength;
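
As a sanity check of the single-sided spectrum steps, the sketch below runs the same normalization on a synthetic sine wave instead of recorded audio. The sample rate, signal length, and the 125 Hz test tone are assumptions chosen so that the tone falls exactly on an FFT bin.

```matlab
% Synthetic input: one second of a 125 Hz tone with amplitude 0.5.
sampleRate = 8000;                          % assumed sample rate in Hz
signalLength = 8000;                        % number of samples
t = (0:signalLength-1)' / sampleRate;       % time vector as a column
toneHz = 125;                               % hypothetical test tone
leftAudio = 0.5 * sin(2*pi*toneHz*t);

% Same processing steps as the example.
fftData = fft(leftAudio);
fftNormal = abs(fftData / signalLength);
fftNormal = fftNormal(1:floor(signalLength/2)+1);
fftNormal(2:end-1) = 2 * fftNormal(2:end-1);
freqs = sampleRate * (0:(signalLength/2)) / signalLength;

% Locate the largest bin and look up its frequency.
[peakAmplitude, peakIndex] = max(fftNormal);
peakFrequency = freqs(peakIndex);           % index 1 corresponds to 0 Hz
fprintf("Peak: %.1f Hz, amplitude %.2f\n", peakFrequency, peakAmplitude);
```

With a 1 Hz bin spacing, the peak should land at index 126, which corresponds to 125 Hz, and the doubled single-sided amplitude should recover the original 0.5.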

Plot Frequency-Domain Audio Data in the 0-1000 Hz Range

cutoff = 1000/scale;
figure(2);
plot(freqs(1:floor(cutoff)),fftNormal(1:floor(cutoff)))
title("Frequency Domain Graph")
xlabel("Frequency (Hz)")
ylabel("Amplitude")
ax = gca;
ax.XAxis.Exponent = 0;

Final Frequency Analysis and Cleanup

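Print the dominant frequency, which is the frequency of the bin with the largest amplitude in the FFT output. Because MATLAB indices start at 1 and the first bin is 0 Hz, look up the frequency in freqs rather than multiplying the index by the scale.

[mVal, mInd] = max(fftNormal);
fprintf("Dominant frequency: %d Hz\n",floor(freqs(mInd)));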
if startTime == 0
    disp(' ')
    disp('The speech is too quiet compared to the background noise, so the analysis might not be precise. Try running the script again and speak louder.');
end
clear mobileDevObject
Dominant frequency: 125 Hz