Getting started with STRAIGHT in command mode


1


URL: http://indiana.edu/~acoustic/s522/getti … 0_006b.pdf
Size: 1.9 MB (1,908,838 bytes)

...
Using default parameters
This section introduces the simplest analysis and re-synthesis procedure.
Reading speech from a file
The first step is to read a file consisting of a speech signal. A file vaiueo2d.wav is used to illustrate the following steps. The file is located in the STRAIGHT directory. The following command reads the speech signal into the variable x and sets the sampling frequency (Hz) to fs.

[x,fs]=wavread('vaiueo2d.wav');

Extracting excitation source parameters
The next step is to extract source parameters. The source parameters consist of a fundamental frequency f0raw and periodicity indices ap at each frequency band. The following command extracts them.

[f0raw,ap]=exstraightsource(x,fs);

The fundamental frequency information is used to guide the following spectral information extraction.
Extracting smoothed time-frequency representation (STRAIGHT spectrum)
A time-frequency representation n3sgram, that is, an F0-adaptively smoothed spectrogram, is extracted using the following command.

n3sgram=exstraightspec(x,f0raw,fs);

The algorithm for eliminating interferences caused by periodic excitation, based on an extended pitch-synchronous analysis and an optimum F0-adaptive smoothing based on spline theory, is the core of STRAIGHT (and is a stateless algorithm). From here on, this time-frequency representation is called the STRAIGHT spectrogram, and a time slice of it is called a STRAIGHT spectrum.
Re-synthesizing speech
Re-synthesizing speech without modification only requires the following command.

sy = exstraightsynth(f0raw,n3sgram,ap,fs);

The synthesized speech signal is stored in the variable sy. The following Matlab command plays it through the audio output.

soundsc(sy,fs);

Script for Matlab 7.0: synthes.m
Path: D:\HTS-demo_CMU_2\data\synthes.m

% Any wav file can be used (here with a sampling rate of 48000 Hz)

path(path,'C:\usr\local\STRAIGHTtrial\Resources\STRAIGHTV40pcode\STRAIGHTV40pcode');

fprintf(1,'Synthesizing ./gen/cmu_us_arctic_slt_a0021.wav\n');

[x,fs]=wavread('./wav/cmu_us_arctic_slt_a0021.wav');   % Reading speech from a file
[f0raw,ap]=exstraightsource(x,fs);     % Extracting excitation source parameters
n3sgram=exstraightspec(x,f0raw,fs);    % Extracting smoothed time-frequency representation (STRAIGHT spectrum)

% Re-synthesizing speech
% Speech re-synthesis without modification is just calling the following command.
sy = exstraightsynth(f0raw,n3sgram,ap,fs);

soundsc(sy,fs); % play the result
return; % stop the script here (a bare break outside a loop is rejected by newer MATLAB releases)
quit; % exit Matlab (only reached if the return above is removed)


Script for Matlab 7.0: synthes2.m
Path: D:\HTS-demo_CMU_2\data\synthes2.m

path(path,'C:\usr\local\STRAIGHTtrial\Resources\STRAIGHTV40pcode\STRAIGHTV40pcode');
prm.spectralUpdateInterval = 5.000000;
prm.F0searchUpperBound=280;
prm.F0searchLowerBound=110;

fprintf(1,'Synthesizing ./gen/cmu_us_arctic_slt_a0021.wav\n');

[x,fs]=wavread('./wav/cmu_us_arctic_slt_a0021.wav');   %Reading speech from a file
[f0raw,ap]=exstraightsource(x,fs,prm);     % Extracting excitation source parameters
n3sgram=exstraightspec(x,f0raw,fs,prm);    % Extracting smoothed time-frequency representation (STRAIGHT spectrum)
    % Re-synthesizing speech
    % Speech re-synthesis without modification is just calling the following command.
sy = exstraightsynth(f0raw,n3sgram,ap,fs,prm);

soundsc(sy,fs); % sound

% [sy] = exstraightsynth(f0raw,n3sgram,ap,48000,prm); % this call also works
wavwrite( sy/max(abs(sy)), 48000, './gen/cmu_us_arctic_slt_a0021.wav');
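
The last line of synthes2.m divides sy by its peak before writing, so the loudest sample lands exactly at full scale. For anyone who wants to do the same step outside Matlab, here is a rough Python sketch using only the standard library. The function names peak_normalize and write_wav16 are my own, and the 440 Hz test tone is just a stand-in for sy; nothing here comes from STRAIGHT itself.

```python
import io
import math
import struct
import wave


def peak_normalize(samples):
    """Scale samples so the largest absolute value becomes 1.0 (like sy/max(abs(sy)))."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty signal: nothing to scale
    return [s / peak for s in samples]


def write_wav16(target, samples, fs):
    """Write mono samples in [-1, 1] as 16-bit PCM to a path or a file-like object."""
    pcm = b"".join(
        struct.pack("<h", int(round(max(-1.0, min(1.0, s)) * 32767)))
        for s in samples
    )
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)   # 2 bytes = 16 bits per sample
        w.setframerate(fs)
        w.writeframes(pcm)


# Stand-in for sy: 10 ms of a 440 Hz tone with an arbitrary amplitude of 3.0.
fs = 48000
tone = [3.0 * math.sin(2 * math.pi * 440 * n / fs) for n in range(480)]
normalized = peak_normalize(tone)

buffer = io.BytesIO()          # an in-memory file; a path string works as well
write_wav16(buffer, normalized, fs)
```

Clamping to [-1, 1] before scaling to 16 bits is a precaution in case the function is called without normalizing first.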




----------------
The ap, sp and f0 files in text format must be converted to little-endian float format.

./gen/_txtap_2_ap.bat
x2x +af < ../ap/cmu_us_arctic_slt_a0021.ap > cmu_us_arctic_slt_a0021.ap

./gen/_txtsp_2_sp.bat
x2x +af < ../sp/cmu_us_arctic_slt_a0021.sp > cmu_us_arctic_slt_a0021.sp
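
To make it clearer what these batch files do: as I understand it, x2x +af reads whitespace-separated ASCII numbers and writes them out as 4-byte floats in the machine's byte order, which is little-endian on a PC. Below is a minimal Python sketch of the same conversion. The helper names are hypothetical and it is not a replacement for x2x, just an illustration of the file format.

```python
import struct


def ascii_to_float_le(text):
    """Convert whitespace-separated ASCII numbers to packed little-endian float32 bytes."""
    values = [float(token) for token in text.split()]
    return struct.pack("<%df" % len(values), *values)


def float_le_to_list(data):
    """Inverse direction: unpack little-endian float32 bytes into Python floats."""
    count = len(data) // 4
    return list(struct.unpack("<%df" % count, data[:count * 4]))


# Text in the style written by Matlab's save -ascii (exponent notation).
sample_text = "  1.0000000e+002  2.5000000e+000\n -3.0000000e+000\n"
binary = ascii_to_float_le(sample_text)
restored = float_le_to_list(binary)
```

Every value takes exactly 4 bytes in the output, so the binary file size is 4 times the number of values in the text file.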

Script for Matlab 7.0: synthes3.m
Path: D:\HTS-demo_CMU_2\data\synthes3.m
path(path,'C:\usr\local\STRAIGHTtrial\Resources\STRAIGHTV40pcode\STRAIGHTV40pcode');
prm.spectralUpdateInterval = 5.000000;
prm.F0searchUpperBound=280;
prm.F0searchLowerBound=110;
fs = 48000;   % sampling rate of the source wav; set by hand because wavread is commented out below

fprintf(1,'Synthesizing ./gen/cmu_us_arctic_slt_a0021.wav\n');

% [x,fs]=wavread('./wav/cmu_us_arctic_slt_a0021.wav');   % Reading speech from a file
% [f0raw,ap]=exstraightsource(x,fs,prm);     % Extracting excitation source parameters
% n3sgram=exstraightspec(x,f0raw,fs,prm);    % Extracting smoothed time-frequency representation (STRAIGHT spectrum)
% save './gen/cmu_us_arctic_slt_a0021_ascii.sp' n3sgram -ascii;
% save './gen/cmu_us_arctic_slt_a0021_ascii.f0' f0raw -ascii;
% save './gen/cmu_us_arctic_slt_a0021_ascii.ap' ap -ascii;

% The text-format files must first be converted to little-endian float format

fid1 = fopen('./gen/cmu_us_arctic_slt_a0021.sp','r','ieee-le');
n3sgram = fread(fid1,[1025, 500],'float32');
fclose(fid1);

fid2 = fopen('./gen/cmu_us_arctic_slt_a0021.ap','r','ieee-le');
ap = fread(fid2,[1025, 500],'float32');   % read from fid2 (fid1 is already closed at this point)
fclose(fid2);

fid3 = fopen('./gen/cmu_us_arctic_slt_a0021.f0','r','ieee-le');
f0raw = fread(fid3,[1, 500],'float32');
fclose(fid3);

%break;

% Re-synthesizing speech
% Speech re-synthesis without modification is just calling the following command.
sy = exstraightsynth(f0raw,n3sgram,ap,fs,prm);

soundsc(sy,fs); % play the result

% [sy] = exstraightsynth(f0raw,n3sgram,ap,48000,prm);
wavwrite( sy/max(abs(sy)), 48000, './gen/cmu_us_arctic_slt_a0021.wav');
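
One thing worth keeping in mind with fread(fid,[1025, 500],'float32'): Matlab fills the matrix column by column, so the first 1025 values in the file become frame 1, the next 1025 become frame 2, and so on. The file therefore has to be stored frame after frame. Here is a small Python sketch of that layout. The function name is made up for the illustration, and unlike fread it refuses a file of the wrong size instead of silently returning a smaller matrix.

```python
import struct


def read_float_le_matrix(data, rows, cols):
    """Unpack little-endian float32 bytes into matrix[row][col], filled column by column."""
    expected = rows * cols * 4
    if len(data) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(data)))
    flat = struct.unpack("<%df" % (rows * cols), data)
    # Column-major order: value number c*rows + r lands in row r, column c.
    return [[flat[c * rows + r] for c in range(cols)] for r in range(rows)]


# Tiny stand-in for a [1025, 500] file: 2 "frequency bins" by 3 "frames".
raw = struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
matrix = read_float_le_matrix(raw, rows=2, cols=3)
# matrix[0] is bin 0 across the three frames, matrix[1] is bin 1.
```

If the text file was written row by row from a [1025, 500] matrix, the order of values would differ from this layout, so it is worth checking one frame by eye after loading.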

Edited by Inprj21 (2015-03-05 05:24:49)


2

500 is the number of values in the file cmu_us_arctic_slt_a0021.f0.
In the STRAIGHTV40pcode\STRAIGHTV40pcode version the sp and ap arrays must have the same dimensions. In this case both are [1025,500].
The files cmu_us_arctic_slt_a0021.sp and cmu_us_arctic_slt_a0021.ap in little-endian float format are about 2 MB each (1025 x 500 values x 4 bytes = 2,050,000 bytes).
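
These numbers can be cross-checked with a bit of arithmetic. The sketch below assumes that the 1025 rows come from a 2048-point FFT (half the FFT length plus one, which matches the FFTLEN=2048 setting and the -l 2048 option used elsewhere in this thread) and that every value is a 4-byte float. The function names are mine.

```python
def spectrum_bins(fft_len):
    """Number of non-redundant spectrum bins for a real signal: fft_len/2 + 1."""
    return fft_len // 2 + 1


def expected_bytes(bins, frames, bytes_per_value=4):
    """Size of a raw float32 file holding bins x frames values."""
    return bins * frames * bytes_per_value


def frames_in_file(size_bytes, bins, bytes_per_value=4):
    """Number of frames implied by a file size; fails if the size is not a whole number of frames."""
    frames, remainder = divmod(size_bytes, bins * bytes_per_value)
    if remainder:
        raise ValueError("file size is not a whole number of frames")
    return frames


bins = spectrum_bins(2048)              # 1025
sp_size = expected_bytes(bins, 500)     # 2050000 bytes for sp or ap
f0_size = expected_bytes(1, 500)        # 2000 bytes for the f0 file
```

Running frames_in_file on the real file size is a quick way to find the second dimension to pass to fread when the utterance is not 500 frames long.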

Converting the files back:

f0 --> +af --> lf0, i.e. lf0 is log f0 in float format. ...
lf0 is the ASCII f0 file converted to binary format.

Converting mgc (mel-cepstral coefficients) back to a spectrum:
MGC2SP -a 0.42 -g 0 -m 34 -l 2048 -o 2 ../mgc/cmu_us_arctic_slt_a0021.mgc > cmu_us_arctic_slt_a0021.sp
pause
The output is the same sp file. The mgc was created with a window size of 2048.
If the mgc is created with a window size of 1024, the conversion gives a similar result:
MGC2SP -a 0.42 -g 0 -m 34 -l 1024 -o 2 ../mgc/cmu_us_arctic_slt_a0021.mgc > cmu_us_arctic_slt_a0021.sp
pause

band ap (bap) -> ap
The output is an ap file half the size of the original.
Either the format changes or part of the information is lost.
As a result, synthesis with this ap file is impossible, because the wav synthesis script fails with an error.


3

It turned out that the wav synthesis script failed because the bcp utility from Cygwin was being called instead of the SPTK utility of the same name.

# setting
SPEAKER=slt #@SPEAKER@
DATASET="cmu_us_arctic" #@DATASET@

# awk and perl
AWK=AWK
PERL='E:\MATLAB701\sys\perl\win32\bin\perl.exe' # PERL

SORT1='c:/usr/local/wbin/sort'

#echo ${PERL}

# SPTK commands
X2X=X2X
MGCEP=MGCEP
LPC2LSP=LPC2LSP
BCP='c:/usr/local/sptk/bin/bcp.exe'
AVERAGE=AVERAGE
MERGE=MERGE
VSTAT=VSTAT
NAN=NAN
MINMAX=MINMAX

# MATLAB and STRAIGHT
MATLAB=E:/MATLAB701/bin/win32/MATLAB.exe
STRAIGHT='C:/usr/local/STRAIGHTtrial/Resources/STRAIGHTV40pcode/STRAIGHTV40pcode'

# dumpfeats to extract utterance information
DUMPFEATS='C:/festival/bin/festival.exe --script c:/usr/local/festival/examples/dumpfeats.sh'

# SOX to convert raw audio to RIFF wav
SOX=SOX/sox
SOXOPTION="b 16" #SOXOPTION

# speech analysis conditions
SAMPFREQ2=48000 #@SAMPFREQ@   # Sampling frequency (48kHz)
SAMPFREQ=48000
FRAMESHIFT=240 #@FRAMESHIFT@ # Frame shift in point (80=16000*0.005, 240 = 48000 * 0.005)
FREQWARP=0.42 #@FREQWARP@   # frequency warping factor
GAMMA=0 #@GAMMA@      # pole/zero weight for mel-generalized cepstral (MGC) analysis
MGCORDER=34 #@MGCORDER@   # order of MGC analysis
LNGAIN=1 #@LNGAIN@     # use logarithmic gain rather than linear gain
LOWERF0=110 #@LOWERF0@    # lower limit for f0 extraction (Hz)
UPPERF0=280 #@UPPERF0@    # upper limit for f0 extraction (Hz)

FRAMELEN=400
FFTLEN=2048; #512

# windows for calculating delta features
MGCWIN=win/mgc.win
LF0WIN=win/lf0.win
BAPWIN=win/bap.win
NMGCWIN=3 #@NMGCWIN@
NLF0WIN=3 #@NLF0WIN@
NBAPWIN=3 #@NBAPWIN@

                ap='ap/cmu_us_arctic_slt_a0021.ap';
                base=`basename ${ap} .ap`; \
                ####${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   2 -s   0 -e    1 -S 0 >____part01_fr_ap.bin

    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   2 -s   0 -e    1 -S 0 | ${AVERAGE} -l   2 > bap01; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   3 -s   2 -e    4 -S 0 | ${AVERAGE} -l   3 > bap02; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   4 -s   5 -e    8 -S 0 | ${AVERAGE} -l   4 > bap03; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   4 -s   9 -e   12 -S 0 | ${AVERAGE} -l   4 > bap04; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   4 -s  13 -e   16 -S 0 | ${AVERAGE} -l   4 > bap05; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   5 -s  17 -e   21 -S 0 | ${AVERAGE} -l   5 > bap06; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   5 -s  22 -e   26 -S 0 | ${AVERAGE} -l   5 > bap07; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   6 -s  27 -e   32 -S 0 | ${AVERAGE} -l   6 > bap08; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   6 -s  33 -e   38 -S 0 | ${AVERAGE} -l   6 > bap09; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   7 -s  39 -e   45 -S 0 | ${AVERAGE} -l   7 > bap10; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   8 -s  46 -e   53 -S 0 | ${AVERAGE} -l   8 > bap11; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L   9 -s  54 -e   62 -S 0 | ${AVERAGE} -l   9 > bap12; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  10 -s  63 -e   72 -S 0 | ${AVERAGE} -l  10 > bap13; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  12 -s  73 -e   84 -S 0 | ${AVERAGE} -l  12 > bap14; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  14 -s  85 -e   98 -S 0 | ${AVERAGE} -l  14 > bap15; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  16 -s  99 -e  114 -S 0 | ${AVERAGE} -l  16 > bap16; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  19 -s 115 -e  133 -S 0 | ${AVERAGE} -l  19 > bap17; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  24 -s 134 -e  157 -S 0 | ${AVERAGE} -l  24 > bap18; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  29 -s 158 -e  186 -S 0 | ${AVERAGE} -l  29 > bap19; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  37 -s 187 -e  223 -S 0 | ${AVERAGE} -l  37 > bap20; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  49 -s 224 -e  272 -S 0 | ${AVERAGE} -l  49 > bap21; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  68 -s 273 -e  340 -S 0 | ${AVERAGE} -l  68 > bap22; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L  99 -s 341 -e  439 -S 0 | ${AVERAGE} -l  99 > bap23; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L 160 -s 440 -e  599 -S 0 | ${AVERAGE} -l 160 > bap24; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L 300 -s 600 -e  899 -S 0 | ${AVERAGE} -l 300 > bap25; \
    ${X2X} +af ${ap} | ${BCP} +f -n 1024 -L 125 -s 900 -e 1024 -S 0 | ${AVERAGE} -l 125 > bap26; \
    ${MERGE} -s  0 -l  1 -L 1 bap01 bap02 | \
    ${MERGE} -s  2 -l  2 -L 1 bap03 | \
    ${MERGE} -s  3 -l  3 -L 1 bap04 | \
    ${MERGE} -s  4 -l  4 -L 1 bap05 | \
    ${MERGE} -s  5 -l  5 -L 1 bap06 | \
    ${MERGE} -s  6 -l  6 -L 1 bap07 | \
    ${MERGE} -s  7 -l  7 -L 1 bap08 | \
    ${MERGE} -s  8 -l  8 -L 1 bap09 | \
    ${MERGE} -s  9 -l  9 -L 1 bap10 | \
    ${MERGE} -s 10 -l 10 -L 1 bap11 | \
    ${MERGE} -s 11 -l 11 -L 1 bap12 | \
    ${MERGE} -s 12 -l 12 -L 1 bap13 | \
    ${MERGE} -s 13 -l 13 -L 1 bap14 | \
    ${MERGE} -s 14 -l 14 -L 1 bap15 | \
    ${MERGE} -s 15 -l 15 -L 1 bap16 | \
    ${MERGE} -s 16 -l 16 -L 1 bap17 | \
    ${MERGE} -s 17 -l 17 -L 1 bap18 | \
    ${MERGE} -s 18 -l 18 -L 1 bap19 | \
    ${MERGE} -s 19 -l 19 -L 1 bap20 | \
    ${MERGE} -s 20 -l 20 -L 1 bap21 | \
    ${MERGE} -s 21 -l 21 -L 1 bap22 | \
    ${MERGE} -s 22 -l 22 -L 1 bap23 | \
    ${MERGE} -s 23 -l 23 -L 1 bap24 | \
    ${MERGE} -s 24 -l 24 -L 1 bap25 | \
    ${MERGE} -s 25 -l 25 -L 1 bap26 > bap/${base}.bap; \
    if [ -n "`${NAN} bap/${base}.bap`" ]; then \
        echo " Failed to extract aperiodicity coefficients ${ap}"; \
        rm -f bap/${base}.bap; \
    fi; \

    exit;
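
As I read the commands above, each line cuts one range of bins (-s to -e) out of every 1025-value ap frame with bcp and averages it with average, and merge then glues the 26 averages into one bap frame. Here is a Python sketch of the same reduction for a single frame. The band widths are copied from the -L values in the script; the function name and the test frame are hypothetical.

```python
# Band widths taken from the -L values of the 26 bcp calls (bins 0..1024).
BAND_WIDTHS = [2, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 10, 12, 14, 16,
               19, 24, 29, 37, 49, 68, 99, 160, 300, 125]


def ap_frame_to_bap(frame):
    """Average one 1025-bin aperiodicity frame down to 26 band values."""
    if len(frame) != sum(BAND_WIDTHS):
        raise ValueError("expected %d bins, got %d" % (sum(BAND_WIDTHS), len(frame)))
    bap = []
    start = 0
    for width in BAND_WIDTHS:
        band = frame[start:start + width]
        bap.append(sum(band) / width)   # plain arithmetic mean over the band
        start += width
    return bap


# Test frame: every bin holds the index of the band it belongs to.
frame = []
for index, width in enumerate(BAND_WIDTHS):
    frame.extend([float(index)] * width)
bap = ap_frame_to_bap(frame)
```

The widths add up to 1025, so every bin of the frame falls into exactly one band, with narrow bands at low frequencies and wide ones at the top.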




        DFS=DFS;
        INTERPOLATE=INTERPOLATE;
        ap='ap/cmu_us_arctic_slt_a0021.ap';
        base=`basename ${ap} .ap`;
        bap=cmu_us_arctic_slt_a0021.bap;

    ${BCP} +f -l 26 -L 1 -s  0 -e  0 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   2 | ${DFS} -a 1 -1 > ${base}.ap01;
    ${BCP} +f -l 26 -L 1 -s  1 -e  1 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   3 | ${DFS} -a 1 -1 > ${base}.ap02;
    ${BCP} +f -l 26 -L 1 -s  2 -e  2 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   4 | ${DFS} -a 1 -1 > ${base}.ap03;
    ${BCP} +f -l 26 -L 1 -s  3 -e  3 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   4 | ${DFS} -a 1 -1 > ${base}.ap04;
    ${BCP} +f -l 26 -L 1 -s  4 -e  4 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   4 | ${DFS} -a 1 -1 > ${base}.ap05;
    ${BCP} +f -l 26 -L 1 -s  5 -e  5 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   5 | ${DFS} -a 1 -1 > ${base}.ap06;
    ${BCP} +f -l 26 -L 1 -s  6 -e  6 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   5 | ${DFS} -a 1 -1 > ${base}.ap07;
    ${BCP} +f -l 26 -L 1 -s  7 -e  7 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   6 | ${DFS} -a 1 -1 > ${base}.ap08;
    ${BCP} +f -l 26 -L 1 -s  8 -e  8 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   6 | ${DFS} -a 1 -1 > ${base}.ap09;
    ${BCP} +f -l 26 -L 1 -s  9 -e  9 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   7 | ${DFS} -a 1 -1 > ${base}.ap10;
    ${BCP} +f -l 26 -L 1 -s 10 -e 10 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   8 | ${DFS} -a 1 -1 > ${base}.ap11;
    ${BCP} +f -l 26 -L 1 -s 11 -e 11 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p   9 | ${DFS} -a 1 -1 > ${base}.ap12;
    ${BCP} +f -l 26 -L 1 -s 12 -e 12 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  10 | ${DFS} -a 1 -1 > ${base}.ap13;
    ${BCP} +f -l 26 -L 1 -s 13 -e 13 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  12 | ${DFS} -a 1 -1 > ${base}.ap14;
    ${BCP} +f -l 26 -L 1 -s 14 -e 14 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  14 | ${DFS} -a 1 -1 > ${base}.ap15;
    ${BCP} +f -l 26 -L 1 -s 15 -e 15 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  16 | ${DFS} -a 1 -1 > ${base}.ap16;
    ${BCP} +f -l 26 -L 1 -s 16 -e 16 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  19 | ${DFS} -a 1 -1 > ${base}.ap17;
    ${BCP} +f -l 26 -L 1 -s 17 -e 17 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  24 | ${DFS} -a 1 -1 > ${base}.ap18;
    ${BCP} +f -l 26 -L 1 -s 18 -e 18 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  29 | ${DFS} -a 1 -1 > ${base}.ap19;
    ${BCP} +f -l 26 -L 1 -s 19 -e 19 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  37 | ${DFS} -a 1 -1 > ${base}.ap20;
    ${BCP} +f -l 26 -L 1 -s 20 -e 20 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  49 | ${DFS} -a 1 -1 > ${base}.ap21;
    ${BCP} +f -l 26 -L 1 -s 21 -e 21 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  68 | ${DFS} -a 1 -1 > ${base}.ap22;
    ${BCP} +f -l 26 -L 1 -s 22 -e 22 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p  99 | ${DFS} -a 1 -1 > ${base}.ap23;
    ${BCP} +f -l 26 -L 1 -s 23 -e 23 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p 160 | ${DFS} -a 1 -1 > ${base}.ap24;
    ${BCP} +f -l 26 -L 1 -s 24 -e 24 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p 300 | ${DFS} -a 1 -1 > ${base}.ap25;
    ${BCP} +f -l 26 -L 1 -s 25 -e 25 -S 0 ${bap} | ${DFS} -b 1 -1 | ${INTERPOLATE} -p 125 | ${DFS} -a 1 -1 > ${base}.ap26;
    cat      ${base}.ap01  | \
    ${MERGE} -s   2 -l   2 -L   3 ${base}.ap02 | \
    ${MERGE} -s   5 -l   5 -L   4 ${base}.ap03 | \
    ${MERGE} -s   9 -l   9 -L   4 ${base}.ap04 | \
    ${MERGE} -s  13 -l  13 -L   4 ${base}.ap05 | \
    ${MERGE} -s  17 -l  17 -L   5 ${base}.ap06 | \
    ${MERGE} -s  22 -l  22 -L   5 ${base}.ap07 | \
    ${MERGE} -s  27 -l  27 -L   6 ${base}.ap08 | \
    ${MERGE} -s  33 -l  33 -L   6 ${base}.ap09 | \
    ${MERGE} -s  39 -l  39 -L   7 ${base}.ap10 | \
    ${MERGE} -s  46 -l  46 -L   8 ${base}.ap11 | \
    ${MERGE} -s  54 -l  54 -L   9 ${base}.ap12 | \
    ${MERGE} -s  63 -l  63 -L  10 ${base}.ap13 | \
    ${MERGE} -s  73 -l  73 -L  12 ${base}.ap14 | \
    ${MERGE} -s  85 -l  85 -L  14 ${base}.ap15 | \
    ${MERGE} -s  99 -l  99 -L  16 ${base}.ap16 | \
    ${MERGE} -s 115 -l 115 -L  19 ${base}.ap17 | \
    ${MERGE} -s 134 -l 134 -L  24 ${base}.ap18 | \
    ${MERGE} -s 158 -l 158 -L  29 ${base}.ap19 | \
    ${MERGE} -s 187 -l 187 -L  37 ${base}.ap20 | \
    ${MERGE} -s 224 -l 224 -L  49 ${base}.ap21 | \
    ${MERGE} -s 273 -l 273 -L  68 ${base}.ap22 | \
    ${MERGE} -s 341 -l 341 -L  99 ${base}.ap23 | \
    ${MERGE} -s 440 -l 440 -L 160 ${base}.ap24 | \
    ${MERGE} -s 600 -l 600 -L 300 ${base}.ap25 | \
    ${MERGE} -s 900 -l 900 -L 125 ${base}.ap26 > ${base}.ap;
           exit;
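
The dfs / interpolate / dfs chain looks odd at first. My reading of the SPTK options, which should be checked against the manual, is this: dfs -b 1 -1 takes the difference between consecutive values, interpolate -p N puts N-1 zeros after each value, and dfs -a 1 -1 is a running sum. Together they simply repeat every band value N times, so each of the 26 band values is spread back over its range of bins. The Python sketch below models that chain and the resulting expansion of one bap frame; all function names are mine.

```python
BAND_WIDTHS = [2, 3, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 10, 12, 14, 16,
               19, 24, 29, 37, 49, 68, 99, 160, 300, 125]


def first_difference(x):
    """Model of dfs -b 1 -1: y[n] = x[n] - x[n-1], with x[-1] taken as 0."""
    out, prev = [], 0.0
    for value in x:
        out.append(value - prev)
        prev = value
    return out


def zero_insert(x, period):
    """Model of interpolate -p period: each value followed by period-1 zeros."""
    out = []
    for value in x:
        out.append(value)
        out.extend([0.0] * (period - 1))
    return out


def running_sum(x):
    """Model of dfs -a 1 -1: y[n] = x[n] + y[n-1]."""
    out, acc = [], 0.0
    for value in x:
        acc += value
        out.append(acc)
    return out


def hold(x, period):
    """The whole chain: every input value repeated period times."""
    return running_sum(zero_insert(first_difference(x), period))


def bap_frame_to_ap(bap):
    """Expand one 26-value bap frame back to 1025 bins by repeating each band value."""
    ap = []
    for value, width in zip(bap, BAND_WIDTHS):
        ap.extend([value] * width)
    return ap


held = hold([1.0, 3.0, 2.0], 3)
ap_frame = bap_frame_to_ap([float(i) for i in range(26)])
```

Since the averaging step throws away the detail inside each band, the round trip gives a staircase version of the original ap, not the original itself.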

This is the "there and back" (round-trip) principle in practice.


4

The trained voice produced a flat line at the output of hts_engine.

One can spend some time reading and studying the log.

============================================================================
Start synthesizing waveforms using hts_engine at Mon Mar  9 23:24:46  2015
============================================================================

Synthesizing a speech waveform from ./data/labels/gen/alice01.lab using hts_engine...
.\voices\qst1\hts_engine.exe -td ./voices/qst1/ver001/tree-dur.inf -tf ./voices/qst1/ver001/tree-lf0.inf -tm ./voices/qst1/ver001/tree-mgc.inf  -md ./voices/qst1/ver001/dur.pdf -mf ./voices/qst1/ver001/lf0.pdf -mm ./voices/qst1/ver001/mgc.pdf  -dm ./voices/qst1/ver001/mgc.win1 -dm ./voices/qst1/ver001/mgc.win2 -dm ./voices/qst1/ver001/mgc.win3 -df ./voices/qst1/ver001/lf0.win1 -df ./voices/qst1/ver001/lf0.win2 -df ./voices/qst1/ver001/lf0.win3 -s 48000 -p 240 -a 0.42 -g 0 -l -b 0  -or ./gen/qst1/ver001/hts_engine/alice01.raw -ot ./gen/qst1/ver001/hts_engine/alice01.trace ./data/labels/gen/alice01.lab
done.

...
There are other trained voices, for example alan.

hts_engine.exe
the alan directory
alice01.lab, a label file from the labels\gen directory

Now run the batch file or type on the command line:
hts_engine.exe -td ver001/tree-dur.inf -tf ver001/tree-lf0.inf -tm ver001/tree-mgc.inf  -md ver001/dur.pdf -mf ver001/lf0.pdf -mm ver001/mgc.pdf  -dm ver001/mgc.win1 -dm ver001/mgc.win2 -dm ver001/mgc.win3 -df ver001/lf0.win1 -df ver001/lf0.win2 -df ver001/lf0.win3 -s 48000 -p 240 -a 0.42 -g 0 -l -b 0  -ow ver001_alice01.wav -ot ver001_alice01.trace alice01.lab

You can hear human speech, well, almost human. )

If the parameters are changed, alan's voice becomes normal.
hts_engine.exe -td ver001/tree-dur.inf -tf ver001/tree-lf0.inf -tm ver001/tree-mgc.inf  -md ver001/dur.pdf -mf ver001/lf0.pdf -mm ver001/mgc.pdf  -dm ver001/mgc.win1 -dm ver001/mgc.win2 -dm ver001/mgc.win3 -df ver001/lf0.win1 -df ver001/lf0.win2 -df ver001/lf0.win3 -s 16000 -p 80 -a 0.42 -g 0 -l -b 0  -ow ver001_alice01.wav -ot ver001_alice01.trace alice01.lab

The trace file contains the log of the hts_engine.exe run.

Command format for running hts_engine.exe:
hts_engine.exe   <voice files> <parameters>  infile.lab
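
The two alan commands above differ only in -s (sampling rate) and -p (frame period in samples). Both pairs, 48000/240 and 16000/80, work out to the same 5 ms frame shift, the value noted in the FRAMESHIFT comment of the settings script. A small Python sketch of the arithmetic follows, with function names of my own. That the alan voice sounds right at 16000/80 because it was trained at 16 kHz is only my guess; the thread does not say so.

```python
def frame_period(sampling_rate_hz, frame_shift_s=0.005):
    """Frame period in samples for a given sampling rate and frame shift in seconds."""
    return int(round(sampling_rate_hz * frame_shift_s))


def frame_shift_ms(sampling_rate_hz, period_samples):
    """Frame shift in milliseconds implied by a sampling rate and a period in samples."""
    return 1000.0 * period_samples / sampling_rate_hz


period_48k = frame_period(48000)            # value for -p when -s is 48000
period_16k = frame_period(16000)            # value for -p when -s is 16000
mismatch = frame_shift_ms(16000, 240)       # mixing -s 16000 with -p 240
```

If -s and -p are taken from different setups, the frame shift no longer matches the one used in training; here the mixed pair gives 15 ms instead of 5 ms.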

Edited by Inprj21 (2015-03-10 08:34:43)
