Functions for Concatenating Units
- Table of Contents
- us_unit_concat
- us_get_copy_wave
- us_unit_raw_concat
- us_energy_normalise
us_unit_concat
void us_unit_concat ( EST_Utterance &utt, float window_factor, const EST_String &window_name, bool no_waveform=false) Iterate through the Unit relation and create the SourceCoef relation, which contains a series of windowed frames of speech and a track of pitch-synchronous coefficients.
SourceCoef contains a single item with two features, coefs and frame.
The value of coefs is a track with all the concatenated pitchmarks and coefficients from the units.
us_unit_concat is where the pitch-synchronous windowing of the frames in each Unit is performed; the result is stored as the value of frame.
Require: Unit
Provide: SourceCoef
Parameters
utt utterance
window_factor This specifies how large the analysis window is in relation to the local pitch period. A value of 1.0 is often used as this means each frame approximately extends from the previous pitch mark to the next.
window_name This specifies the type of window used. "hanning" is standard but any window type available from the signal processing library can be used.
no_waveform If this is set to true, only the coefficients are copied into SourceCoef; no waveform analysis is performed.
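The pitch-synchronous windowing controlled by window_factor can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: the helper name window_frame, the use of plain std::vector in place of EST_Wave and EST_Track, and the choice to derive the period from the larger of the two neighbouring pitchmark distances are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): cut one Hanning-windowed
// frame centred on pitchmark i. Pitchmarks are given as sample indices and
// i must be a valid index into pm.
std::vector<float> window_frame(const std::vector<float> &sig,
                                const std::vector<long> &pm,
                                std::size_t i, float window_factor)
{
    const double pi = 3.14159265358979323846;

    // Local pitch period: assumed here to be the larger of the distances
    // to the previous and next pitchmarks.
    long left   = (i > 0) ? pm[i] - pm[i - 1] : 0;
    long right  = (i + 1 < pm.size()) ? pm[i + 1] - pm[i] : 0;
    long period = std::max(left, right);

    // With window_factor == 1.0 the frame reaches roughly from the
    // previous pitchmark to the next one.
    long half = static_cast<long>(window_factor * period);
    if (half < 1) half = 1;

    std::vector<float> frame(2 * half + 1, 0.0f);
    for (long n = 0; n <= 2 * half; ++n) {
        long s = pm[i] - half + n;
        if (s < 0 || s >= static_cast<long>(sig.size()))
            continue;  // samples outside the signal stay at zero
        // Hanning weight: 0 at both edges, 1 at the pitchmark itself.
        double w = 0.5 - 0.5 * std::cos(2.0 * pi * n / (2.0 * half));
        frame[n] = static_cast<float>(w * sig[s]);
    }
    return frame;
}
```

Calling window_frame for each pitchmark in a unit yields the series of windowed frames that the description above says is stored under frame.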
us_get_copy_wave
void us_get_copy_wave ( EST_Utterance &utt, EST_Wave &source_sig, EST_Track &source_pm, EST_Relation &source_seg, float window_factor, const EST_String &window_name) This function provides the setup for copy resynthesis. In copy resynthesis, a natural waveform is used as the source speech for synthesis rather than diphones or other concatenated units. This is often useful for testing a prosody module or for altering the pitch or duration of a natural waveform for an experiment. (As such, this function should really be thought of as a very simple unit selection module.)
In addition to the speech waveform itself, the function requires a set of pitchmarks in the standard form, and a set of labels which mark segment boundaries. The Segment relation must already exist in the utterance prior to calling this function.
First, the function creates a Unit relation with a single item containing the waveform and the pitchmarks. Next, it adds a source_end feature to each item in the Segment relation. It does this by calculating a mapping between the Segment relation and the input labels. This mapping is performed by dynamic programming, as the two sets of labels often don't match exactly.
The final result, therefore, is a Unit relation and a Segment relation with source_end features. As this is exactly the same as the output of the standard concatenative synthesis modules, from here on the utterance can be processed as if the units were from a genuine synthesizer.
Copy synthesis itself can be performed by ....
Require: Segment
Provide: Unit
Parameters
utt utterance
source_sig waveform
source_pm pitchmarks belonging to waveform
source_seg set of items with end times referring to points in the waveform
window_factor This specifies how large the analysis window is in relation to the local pitch period. A value of 1.0 is often used as this means each frame approximately extends from the previous pitch mark to the next.
window_name This specifies the type of window used. "hanning" is standard but any window type available from the signal processing library can be used.
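The dynamic-programming mapping between the Segment relation and the input labels, described above for us_get_copy_wave, can be illustrated with a plain edit-distance alignment. The sketch below is an assumption about the general technique, not the cost function or code the library actually uses; align_labels and the use of std::string label names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: for each target label (e.g. an item in Segment),
// return the index of the source label it aligns with, or -1 if the
// target label has no counterpart in the source.
std::vector<int> align_labels(const std::vector<std::string> &target,
                              const std::vector<std::string> &source)
{
    const std::size_t n = target.size(), m = source.size();

    // cost[i][j] = minimum edits to align the first i target labels
    // with the first j source labels (unit cost per edit).
    std::vector<std::vector<int>> cost(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) cost[i][0] = static_cast<int>(i);
    for (std::size_t j = 1; j <= m; ++j) cost[0][j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= n; ++i)
        for (std::size_t j = 1; j <= m; ++j) {
            int sub = cost[i - 1][j - 1] + (target[i - 1] == source[j - 1] ? 0 : 1);
            cost[i][j] = std::min({sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1});
        }

    // Trace back from the end, preferring a match or substitution.
    std::vector<int> map(n, -1);
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        int sub = cost[i - 1][j - 1] + (target[i - 1] == source[j - 1] ? 0 : 1);
        if (cost[i][j] == sub) {
            map[i - 1] = static_cast<int>(j - 1);
            --i; --j;
        } else if (cost[i][j] == cost[i - 1][j] + 1) {
            --i;  // target label with no source counterpart
        } else {
            --j;  // extra source label, skipped
        }
    }
    return map;
}
```

Given such a mapping, each Segment item could take its source_end value from the end time of the source label it is aligned with. How unmatched items are handled is not stated in the text above, so the sketch simply marks them with -1.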
us_unit_raw_concat
void us_unit_raw_concat ( EST_Utterance &utt) This function produces a waveform from the Unit relation without prosodic modification. In effect, this function simply concatenates the waveform parts of the units in the Unit relation. An overlap-add operation is performed at unit boundaries so that waveform discontinuities don't occur.
us_energy_normalise
void us_energy_normalise ( EST_Relation &unit) Items in the Unit relation can take an optional flag energy_factor, which scales the amplitude of the unit waveform. This is useful because units often have different energy levels due to different recording circumstances. An energy_factor of 1.0 leaves the waveform unchanged.
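The effect of energy_factor on a single unit can be shown with a small standalone sketch. The helper apply_energy_factor and the use of std::vector for the samples are hypothetical; the default argument of 1.0 stands in for a unit on which the optional flag is not set.

```cpp
#include <vector>

// Hypothetical helper: scale a unit's samples by its energy_factor.
// The default of 1.0 models a unit without the optional flag and
// leaves the waveform unchanged.
std::vector<float> apply_energy_factor(const std::vector<float> &samples,
                                       float energy_factor = 1.0f)
{
    std::vector<float> out;
    out.reserve(samples.size());
    for (float s : samples)
        out.push_back(s * energy_factor);
    return out;
}
```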
... off the f0 contour at the current time, calculating the local pitch period, and placing a pitchmark one pitch period later. The process is then repeated by reading the F0 value at this new point and so on.
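The iterative pitchmark placement described above can be sketched as a simple loop. The function name place_pitchmarks, the representation of the F0 contour as a callable returning Hz, the start time of 0, and the decision to stop at the first non-positive F0 value are all assumptions; the surviving text does not say how unvoiced regions or the starting point are handled.

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical helper: place pitchmarks (in seconds) by repeatedly reading
// F0 at the current time and stepping forward by one local pitch period.
std::vector<double> place_pitchmarks(const std::function<double(double)> &f0,
                                     double end_time)
{
    std::vector<double> pm;
    double t = 0.0;  // assumed starting point
    while (true) {
        double hz = f0(t);
        // Assumption: stop at the first unusable F0 value (unvoiced or invalid).
        if (!(hz > 0.0) || !std::isfinite(hz))
            break;
        t += 1.0 / hz;  // local pitch period is the reciprocal of F0
        if (t > end_time)
            break;
        pm.push_back(t);  // new pitchmark; F0 is read here on the next pass
    }
    return pm;
}
```

For a constant contour of 100 Hz, this places pitchmarks 10 ms apart until end_time is reached.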