Main Content

本页翻译不是最新的。点击此处可查看最新英文版本。

使用查找表实现定点平方根

此示例说明如何使用查找表实现定点平方根。查找表可为嵌入式设备生成高效代码。

设置

为了确保此示例不会更改您的预设项或设置,以下代码存储原始状态。

originalFormat = get(0,'format'); format long g
originalWarningState = warning('off','fixed:fi:underflow');
originalFiprefState = get(fipref); reset(fipref)

您将在示例末尾处还原此状态。

平方根实现

下面概要介绍了平方根算法。

  1. 将字节中的位数 B 声明为常量。在此示例中,B = 8

  2. 使用示例Normalize Data for Lookup Tables中所述的函数 fi_normalize_unsigned_8_bit_byte() 对输入 u > 0 进行正规化,使得 u = x*2^n0.5 <= x < 2n 为偶数。

  3. 提取 x 的高 B 位。设 x_B 表示 x 的高 B 位。

  4. 生成查找表 SQRTLUT,使整数 i = x_B- 2^(B-2) + 1 用作 SQRTLUT 的索引,从而可以通过查找索引 sqrt(x_B) = SQRTLUT(i) 来计算 sqrt(x_B)

  5. 使用差值 r = x - x_B(解释为小数),在 SQRTLUT(i) 和表 SQRTLUT(i+1) 中的下一个值之间进行线性插值。该差值 r 是通过提取 x 的低 w - B 位创建的,其中 w 表示 x 的字长。它使用函数 reinterpretcast() 解释为小数。

  6. 最后,使用查找表和线性插值计算输出:

sqrt(u) = sqrt(x*2^n)

= sqrt(x)*2^(n/2)

= (SQRTLUT(i) + r*(SQRTLUT(i+1) - SQRTLUT(i)))*2^(n/2)

此示例末尾定义的函数 fi_sqrtlookup_8_bit_byte() 实现此算法。

示例

使用 fi_sqrtlookup_8_bit_byte() 通过查找表计算定点平方根。将定点查找表结果与使用 sqrt 和双精度计算的平方根进行比较。

u = fi(linspace(0,128,1000),0,16,12);
y = fi_sqrtlookup_8_bit_byte(u);
y_expected = sqrt(double(u));

绘制结果。

clf
subplot(211)
plot(u,y,u,y_expected)
legend('Output','Expected output','Location','Best')

subplot(212)
plot(u,double(y)-y_expected,'r')
legend('Error')

Figure contains 2 axes objects. Axes object 1 contains 2 objects of type line. These objects represent Output, Expected output. Axes object 2 contains an object of type line. This object represents Error.

figure(gcf)

清理

还原原始状态。

set(0,'format',originalFormat);
warning(originalWarningState);
fipref(originalFiprefState);

sqrt_lookup_table 函数定义

函数 sqrt_lookup_table 加载平方根值的查找表。您可以通过运行以下代码来创建该表:

sqrt_table = sqrt((2^(B-2):2^(B))/2^(B-1));

function SQRTLUT = sqrt_lookup_table()
    B = 8;  % Number of bits in a byte
    % sqrt_table = sqrt((2^(B-2):2^(B))/2^(B-1))
    sqrt_table = [0.707106781186548   0.712609640686961   0.718070330817254   0.723489806424389 ...
                  0.728868986855663   0.734208757779421   0.739509972887452   0.744773455488312 ...
                  0.750000000000000   0.755190373349661   0.760345316287277   0.765465544619743 ...
                  0.770551750371122   0.775604602874429   0.780624749799800   0.785612818123533 ...
                  0.790569415042095   0.795495128834866   0.800390529679106   0.805256170420320 ...
                  0.810092587300983   0.814900300650331   0.819679815537750   0.824431622392057 ...
                  0.829156197588850   0.833854004007896   0.838525491562421   0.843171097702003 ...
                  0.847791247890659   0.852386356061616   0.856956825050130   0.861503047005639 ...
                  0.866025403784439   0.870524267324007   0.875000000000000   0.879452954966893 ...
                  0.883883476483184   0.888291900221993   0.892678553567856   0.897043755900458 ...
                  0.901387818865997   0.905711046636840   0.910013736160065   0.914296177395487 ...
                  0.918558653543692   0.922801441264588   0.927024810886958   0.931229026609459 ...
                  0.935414346693485   0.939581023648307   0.943729304408844   0.947859430506444 ...
                  0.951971638232989   0.956066158798647   0.960143218483576   0.964203038783845 ...
                  0.968245836551854   0.972271824131503   0.976281209488332   0.980274196334883 ...
                  0.984250984251476   0.988211768802619   0.992156741649222   0.996086090656827 ...
                  1.000000000000000   1.003898650263063   1.007782218537319   1.011650878514915 ...
                  1.015504800579495   1.019344151893756   1.023169096484056   1.026979795322186 ...
                  1.030776406404415   1.034559084827928   1.038327982864759   1.042083250033317 ...
                  1.045825033167594   1.049553476484167   1.053268721647045   1.056970907830485 ...
                  1.060660171779821   1.064336647870400   1.068000468164691   1.071651762467640 ...
                  1.075290658380328   1.078917281352004   1.082531754730548   1.086134199811423 ...
                  1.089724735885168   1.093303480283494   1.096870548424015   1.100426053853688 ...
                  1.103970108290981   1.107502821666834   1.111024302164449   1.114534656257938 ...
                  1.118033988749895   1.121522402807898   1.125000000000000   1.128466880329237 ...
                  1.131923142267177   1.135368882786559   1.138804197393037   1.142229180156067 ...
                  1.145643923738960   1.149048519428140   1.152443057161611   1.155827625556683 ...
                  1.159202311936963   1.162567202358642   1.165922381636102   1.169267933366857 ...
                  1.172603939955857   1.175930482639174   1.179247641507075   1.182555495526531 ...
                  1.185854122563142   1.189143599402528   1.192424001771182   1.195695404356812 ...
                  1.198957880828180   1.202211503854459   1.205456345124119   1.208692475363357 ...
                  1.211919964354082   1.215138880951474   1.218349293101120   1.221551267855754 ...
                  1.224744871391589   1.227930169024281   1.231107225224513   1.234276103633219 ...
                  1.237436867076458   1.240589577579950   1.243734296383275   1.246871083953750 ...
                  1.250000000000000   1.253121103485214   1.256234452640111   1.259340104975618 ...
                  1.262438117295260   1.265528545707287   1.268611445636527   1.271686871835988 ...
                  1.274754878398196   1.277815518766305   1.280868845744950   1.283914911510884 ...
                  1.286953767623375   1.289985465034393   1.293010054098575   1.296027584582983 ...
                  1.299038105676658   1.302041665999979   1.305038313613819   1.308028096028522 ...
                  1.311011060212689   1.313987252601790   1.316956719106592   1.319919505121430 ...
                  1.322875655532295   1.325825214724777   1.328768226591831   1.331704734541407 ...
                  1.334634781503914   1.337558409939543   1.340475661845451   1.343386578762792 ...
                  1.346291201783626   1.349189571557681   1.352081728298996   1.354967711792425 ...
                  1.357847561400027   1.360721316067327   1.363589014329464   1.366450694317215 ...
                  1.369306393762915   1.372156150006259   1.375000000000000   1.377837980315538 ...
                  1.380670127148408   1.383496476323666   1.386317063301177   1.389131923180804 ...
                  1.391941090707505   1.394744600276337   1.397542485937369   1.400334781400505 ...
                  1.403121520040228   1.405902734900249   1.408678458698081   1.411448723829527 ...
                  1.414213562373095];
    % Cast to fixed point with the most accurate rounding method
    WL = 4*B;  % Word length
    FL = 2*B;  % Fraction length
    SQRTLUT = fi(sqrt_table,1,WL,FL,'RoundingMethod','Nearest');
    % Set fimath for the most efficient math operations
    F = fimath('OverflowAction','Wrap',...
               'RoundingMethod','Floor',...
               'SumMode','KeepLSB',...
               'SumWordLength',WL,...
               'ProductMode','KeepLSB',...
               'ProductWordLength',WL);
    SQRTLUT = setfimath(SQRTLUT,F);
end

fi_sqrtlookup_8_bit_byte() 函数定义

function y = fi_sqrtlookup_8_bit_byte(u)
    % Load the lookup table
    SQRTLUT = sqrt_lookup_table();
    % Remove fimath from the input to insulate this function from math
    % settings declared outside this function.
    u = removefimath(u);
    % Declare the output
    y = coder.nullcopy(fi(zeros(size(u)),numerictype(SQRTLUT),fimath(SQRTLUT)));
    B = 8; % Number of bits in a byte
    w = u.WordLength;
    for k = 1:numel(u)
        assert(u(k)>=0,'Input must be non-negative.');
        if u(k)==0
            y(k)=0;
        else
            % Normalize the input such that u = x*2^n and 0.5 <= x < 2
            [x,n] = fi_normalize_unsigned_8_bit_byte(u(k));
            isodd = storedInteger(bitand(fi(1,1,8,0),fi(n)));
            x = bitsra(x,isodd);
            n = n + isodd;
            % Extract the high byte of x
            high_byte = storedInteger(bitsliceget(x,w,w-B+1));
            % Convert the high byte into an index for SQRTLUT
            i =  high_byte - 2^(B-2) + 1;
            % The upper byte was used for the index into SQRTLUT.
            % The remainder, r, interpreted as a fraction, is used to
            % linearly interpolate between points.
            T_unsigned_fraction = numerictype(0,w-B,w-B);
            r = reinterpretcast(bitsliceget(x,w-B,1),T_unsigned_fraction);
            y(k) = bitshift((SQRTLUT(i) + r*(SQRTLUT(i+1) - SQRTLUT(i))),...
                            bitsra(n,1));
        end
    end
    % Remove fimath from the output to insulate the caller from math settings
    % declared inside this function.
    y = removefimath(y);
end