Laboratory Assignment
MSc Introductory Module (Part I)
Assignment Instructions
You should work in groups of three (although you may also work in a group of two
or individually). EACH GROUP should record the outcomes of its work in ONE lab report and keep
any required program code and files.
The lab-assignment report can be written in LaTeX or MS Word. The front page of the report should
contain the module title and the IDs of all persons working in the group, as well as the effort of each person
(the standard for a person would be 100%, i.e., full effort). The report should be written using a font size of
11pt. Matlab code listings should be included in appendices of the report and should use a 9pt font size.
Submission of the assignment (as well as all the required files) is through Canvas – under the ‘MSc
Introductory Module for Computer Engineering’, find the assignment ‘Lab-Assignment Submission’. The
report should be submitted in .pdf file format. Make sure you attach all the files required and make your
submission.
Each group should make a single submission.
The usual penalty of 5% per day will apply to all late submissions.
Plagiarism
Plagiarism will not be tolerated. It is the act of a Student claiming as their own, intentionally or by
omission, work which was not done by that Student. Plagiarism also includes a Student deliberately
claiming to have done work submitted by the Student for assessment which was never undertaken by that
Student, including self-plagiarism and other breaches. Sanctions for plagiarism include the Student
failing the Programme of study.
1 Introduction
This assignment will test your knowledge and understanding of the following topics: programming in
Matlab (loops, vectors, matrices, functions, reading/writing to files), discrete Fourier transform, digital
filtering, modelling of data using Gaussian probability density function (PDF). This will be done through
analysis of a given set of audio data.
2 The Data
The data you will use consists of audio data and accompanying text data (referred to as labels). The data
is part of the TIMIT dataset [1], which has been used widely for research in speech processing. You will use
70 files (audio and text), which are arranged in sub-folders.
The audio data contain recordings of speech, sampled at 8 kHz sampling frequency. The files are in
Microsoft ‘.wav’ format, which can be read in Matlab using the function ‘audioread’.
Each audio wav-file is accompanied by a text file, referred to as a label file (.lab file). Each label
file contains three columns: the third column lists the phonemes (i.e., types of speech sounds) that
are spoken in the corresponding wav-file; the first and second columns give the start and end times of each
phoneme, respectively. The times are in units of 100 ns (e.g., 1409375 corresponds to 140.9375 ms). An example of
a label file is given in Figure 1.
Figure 1: An example of a label file (part of the file for audio ‘MCPM0/SA1.wav’).
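Reading a label file and converting its times to milliseconds could be sketched as follows. This is only an illustration: it assumes the three columns are separated by whitespace, and the file contents, the temporary file name and the variable names are made up for the example.

```matlab
% Write a tiny, made-up label file so that the sketch is self-contained.
labFile = [tempname, '.lab'];
fid = fopen(labFile, 'w');
fprintf(fid, '0 1409375 sil\n1409375 2100000 s\n2100000 3300000 aa\n');
fclose(fid);

% Read the three columns: start time, end time, phoneme label.
fid = fopen(labFile, 'r');
cols = textscan(fid, '%f %f %s');
fclose(fid);
delete(labFile);

unitsPerMs  = 1e4;                    % 1 ms = 10,000 units of 100 ns
timePhStart = cols{1} / unitsPerMs;   % start times in ms
timePhEnd   = cols{2} / unitsPerMs;   % end times in ms
phonemes    = cols{3};                % cell array of label strings

% Row indices of the entries labelled exactly 's'.
idxS = find(strcmp(phonemes, 's'));
```

Using `strcmp` rather than a substring search avoids matching other labels that merely contain the letter 's'.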
You are also given a text file ‘listData.txt’ that contains the list of all the filenames (without the file
extension) of the data.
3 Assignment task
Your Matlab program should be a text-based, menu-driven program that offers the following options, which the
user selects by pressing the specified letter (‘a’, ‘b’, ‘c’, ‘d’, or ‘e’):
(a) Perform FIR filtering
(b) Extract signal segments
(c) Calculate DFT and energy for low/high frequency regions
(d) Modelling of energy values using Gaussian PDFs
(e) Exit the program
The definition of all of these options is given in the following subsections.
[1] J. Garofolo, “TIMIT: acoustic-phonetic continuous speech corpus,” Linguistic Data Consortium, 1993.
3.1 Option (a): Perform FIR filtering
This option should read the text file ‘listData.txt’ and then in a loop load one by one the original wav-files
from folder ‘wavOrig’, perform filtering on each file and store the output signals into wav-files with the
same name but into a new folder named ‘wavFilt’. The newly created wav-files should be of exactly the
same length as the original wav-files.
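The file loop described above might look like the sketch below. It is not a reference solution: it assumes that each line of ‘listData.txt’ holds a relative path such as `SPK0/SA1` (a made-up name), it builds a small temporary data set so that it can run on its own, and it leaves the filtering step as a placeholder.

```matlab
% Self-contained setup: a made-up data folder with one short wav-file.
rootDir = tempname;
mkdir(fullfile(rootDir, 'wavOrig', 'SPK0'));
fs = 8000;
tone = 0.1 * sin(2*pi*440*(0:799)'/fs);
audiowrite(fullfile(rootDir, 'wavOrig', 'SPK0', 'SA1.wav'), tone, fs);
fid = fopen(fullfile(rootDir, 'listData.txt'), 'w');
fprintf(fid, 'SPK0/SA1\n');
fclose(fid);

% Read the list of file names (one per line, without extension).
fid = fopen(fullfile(rootDir, 'listData.txt'), 'r');
names = textscan(fid, '%s');
fclose(fid);
names = names{1};

for i = 1:numel(names)
    inFile  = fullfile(rootDir, 'wavOrig', [names{i}, '.wav']);
    outFile = fullfile(rootDir, 'wavFilt', [names{i}, '.wav']);
    [x, fsIn] = audioread(inFile);

    y = x;   % placeholder: the FIR filtering of x would go here

    % Create the output sub-folder if it does not exist yet.
    outDir = fileparts(outFile);
    if ~exist(outDir, 'dir')
        mkdir(outDir);
    end
    audiowrite(outFile, y, fsIn);   % y has the same length as x
end
```

Note that `audiowrite` clips samples outside [-1, 1] when writing 16-bit files, so it may be worth checking the range of the filtered signal before saving it.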
Your program should implement FIR filtering of the audio signal through the relation of the output
y(n) and input x(n) sample values. The filter is defined by its impulse response h(n) given in Eq. 1.
\[
h(n) = \{h(0), h(1), \ldots, h(6)\} = \{-0.1,\ 0.3,\ 0.5,\ 0.5,\ 0.5,\ 0.3,\ -0.1\}. \qquad (1)
\]
You should assume that values of samples in the input signal x(n) for time n ≤ 0 are zero. Your program
should also produce a figure of the magnitude frequency characteristic of the filter.
You are NOT ALLOWED to use any Matlab ready-made functions to perform the filtering and obtain
the filter frequency characteristic (such as, ‘filter’, ‘conv’, ‘freqz’).
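One way to implement the input-output relation directly, without `filter` or `conv`, is sketched below. The input signal is made up, the variable names are illustrative, and samples with index n ≤ 0 (in Matlab's 1-based indexing) are treated as zero, as stated above.

```matlab
h = [-0.1, 0.3, 0.5, 0.5, 0.5, 0.3, -0.1];   % impulse response from Eq. 1
x = (1:10)';                                  % made-up input signal (a ramp)

N = numel(x);
M = numel(h);
y = zeros(N, 1);                 % output has the same length as the input

for n = 1:N
    acc = 0;
    for k = 1:M
        idx = n - k + 1;         % input sample paired with coefficient h(k)
        if idx >= 1              % samples before the start of x are zero
            acc = acc + h(k) * x(idx);
        end
    end
    y(n) = acc;
end
```

Because the output is computed only for n = 1..N, it has exactly the same length as the input. The coefficients are symmetric, so the filter has linear phase with a delay of three samples.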
Deliverables:
• Include in the report: Figure of the magnitude frequency characteristic of the filter.
• Attach to your submission: A zip file of the folder ‘wavFilt’, containing all your created wav-files.
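The magnitude frequency characteristic can be obtained without `freqz` by evaluating the sum H(ω) = Σ h(k) e^{-jωk} on a grid of frequencies. The sketch below does this for the filter of Eq. 1; the grid size of 512 points is an arbitrary assumption.

```matlab
h  = [-0.1, 0.3, 0.5, 0.5, 0.5, 0.3, -0.1];   % impulse response from Eq. 1
fs = 8000;                                     % sampling frequency in Hz
nPoints = 512;                                 % assumed grid resolution

freqHz = (0:nPoints-1) * (fs/2) / (nPoints-1); % 0 Hz up to fs/2 inclusive
omega  = 2*pi*freqHz/fs;                       % normalised frequency in rad/sample

% Accumulate H(omega) = sum over k of h(k) * exp(-j*omega*k), k = 0..6.
H = zeros(1, nPoints);
for k = 0:numel(h)-1
    H = H + h(k+1) * exp(-1j * omega * k);
end
magH = abs(H);
```

Passing `freqHz` and `magH` to `plot`, with labelled axes, gives the kind of figure requested in the deliverables. The gain is 1.9 at 0 Hz and 0.3 at 4 kHz, so the filter has a low-pass character.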
3.2 Option (b): Extraction of signal segment
This option should extract from the audio wav-files a specified part of the signal corresponding to the
phoneme ‘s’ and phoneme ‘aa’ and store the extracted signals into matrices.
For each wav-file, you will need to read the corresponding label file and find all the occurrences of
the phonemes ‘s’ and ‘aa’. For a given occurrence of the phoneme, you should extract from the wav-file a
segment of 20 ms of the signal around the centre of the phoneme, i.e., the start time (timeSegStart) and
end time (timeSegEnd) of the segment to be extracted should be calculated (in ms) as:
\[
\begin{aligned}
\mathit{timeSegStart} &= \mathit{timePhStart} + (\mathit{timePhEnd} - \mathit{timePhStart})/2 - 10 \\
\mathit{timeSegEnd} &= \mathit{timePhStart} + (\mathit{timePhEnd} - \mathit{timePhStart})/2 + 10,
\end{aligned}
\qquad (2)
\]
where timePhStart and timePhEnd are, respectively, the start and end times of the phoneme as found
in the label file (but converted to ms).
The signal segments of phoneme ‘s’ and phoneme ‘aa’ should be extracted from all the files,
separately for the ‘wavOrig’ data and the ‘wavFilt’ data, and stored in arrays named segOrig_phS,
segFilt_phS, segOrig_phAA and segFilt_phAA. Each of these arrays should be of size num ×
nSamples, where num is the number of occurrences of the particular phoneme (‘s’ or ‘aa’) and nSamples
is the number of samples in a segment of signal, i.e., each row in the arrays corresponds to a phoneme
occurrence and each column to the sample index. After processing all files, store the arrays in a mat-file
called ‘segAllData.mat’. Note that the number of occurrences of these phonemes varies from file to file (and
sometimes there are none).
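Extracting one 20 ms segment centred on a phoneme and appending it as a row of an array could be sketched as follows. The signal and phoneme times are made up, and the conversion from time to sample index (rounding, then adding 1 for Matlab's 1-based indexing) is an assumption, since the rounding rule is not prescribed here.

```matlab
fs = 8000;                               % sampling frequency in Hz
segLenMs = 20;                           % segment length in ms
nSamples = round(segLenMs/1000 * fs);    % 160 samples at 8 kHz

x = sin(2*pi*440*(0:fs-1)'/fs);          % made-up 1-second signal
timePhStart = 140.9375;                  % made-up phoneme start (ms)
timePhEnd   = 210.0;                     % made-up phoneme end (ms)

% Segment of 20 ms centred on the middle of the phoneme.
timeCentre   = timePhStart + (timePhEnd - timePhStart)/2;
timeSegStart = timeCentre - segLenMs/2;
timeSegEnd   = timeCentre + segLenMs/2;

% Assumed time-to-sample conversion; taking exactly nSamples samples
% guarantees that every row has the same length.
firstSample = round(timeSegStart/1000 * fs) + 1;
lastSample  = firstSample + nSamples - 1;

segOrig_phS = zeros(0, nSamples);        % starts empty, grows by one row per occurrence
if firstSample >= 1 && lastSample <= numel(x)
    segOrig_phS(end+1, :) = x(firstSample:lastSample).';
end
```

Fixing the number of samples per segment, rather than computing a separate end index from timeSegEnd, avoids off-by-one differences between rows caused by rounding.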
Deliverables:
• Include in the report: Figures of the extracted signal segment (waveform) of the first occurrence of
the phoneme ‘s’ and phoneme ‘aa’ in the files: ‘wavOrig/MDPK0/SA1.wav’ and
‘wavFilt/MDPK0/SA1.wav’.
• Attach to your submission: The mat-file ‘segAllData.mat’, containing the arrays: segOrig_phS,
segFilt_phS, segOrig_phAA and segFilt_phAA.
3.3 Option (c): Calculate DFT and energy for low/high frequency regions
This option should compute the DFT of the signal segments, calculate the average energy in the
low-frequency region (0-2 kHz) and in the high-frequency region (2-4 kHz), and then convert these values to decibels
(dB). These calculations should be applied to each extracted signal segment.
For each phoneme (‘s’ and ‘aa’) and data condition (‘orig’ and ‘filt’), it should process all the extracted
signal segments previously stored in the arrays (segOrig_phS, segFilt_phS, segOrig_phAA or
segFilt_phAA) and store the two calculated energy values (in dB) in 2D arrays, named accordingly:
enLFandHF_orig_phS, enLFandHF_filt_phS, enLFandHF_orig_phAA, and enLFandHF_filt_phAA.
Each of these arrays should be of size num × 2, where num is the number of occurrences of the particular
phoneme (‘s’ or ‘aa’): each row in the arrays corresponds to a phoneme occurrence, and the two columns
hold the low-frequency and high-frequency energy values, respectively.
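For a single segment, the band energies could be computed along the lines of the sketch below. It assumes that the built-in `fft` is acceptable for this option, that "average energy" means the mean of the squared DFT magnitudes over the bins in each band, and that the 2 kHz bin is assigned to the high-frequency region. None of these choices is prescribed above, and the test segment is made up.

```matlab
fs = 8000;                                   % sampling frequency in Hz
nSamples = 160;                              % 20 ms segment at 8 kHz
n = 0:nSamples-1;

% Made-up segment: strong 500 Hz tone plus weak 3 kHz tone.
seg = sin(2*pi*500*n/fs) + 0.1*sin(2*pi*3000*n/fs);

X = fft(seg);
nHalf = floor(nSamples/2) + 1;               % bins from 0 Hz up to fs/2
freqHz = (0:nHalf-1) * fs / nSamples;        % centre frequency of each bin
powerSpec = abs(X(1:nHalf)).^2;              % squared magnitude per bin

isLow  = freqHz < 2000;                      % assumed: 0 Hz up to, not including, 2 kHz
isHigh = freqHz >= 2000;                     % assumed: 2 kHz up to 4 kHz

% eps guards against log10(0) for an all-zero band.
enLF = 10*log10(mean(powerSpec(isLow))  + eps);
enHF = 10*log10(mean(powerSpec(isHigh)) + eps);

enRow = [enLF, enHF];                        % one row of a num-by-2 energy array
```

Applying this to every row of a segment array and stacking the resulting `enRow` vectors gives an array of size num × 2.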
Deliverables:
Include in the report:
• Figures of the magnitude spectrum for the first occurrence of the phoneme ‘s’ and ‘aa’ in the files:
‘wavOrig/MDPK0/SA1.wav’ and ‘wavFilt/MDPK0/SA1.wav’.
• Histograms of the low-frequency and high-frequency energy values for the phonemes ‘s’ and ‘aa’
in each condition (‘orig’ and ‘filt’), i.e., histograms of the data in the variables: enLFandHF_orig_phS,
enLFandHF_filt_phS, enLFandHF_orig_phAA, and enLFandHF_filt_phAA.
3.4 Option (d): Modelling of energy values using Gaussian PDFs
This option should perform modelling of the low-frequency and high-frequency energy values separately
for each phoneme (‘s’ and ‘aa’) and for each condition (‘orig’ and ‘filt’) using Gaussian PDFs.
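Fitting a Gaussian PDF to one set of energy values amounts to estimating its mean and variance. The sketch below uses made-up energy values and the maximum-likelihood variance estimate (normalised by N); using the unbiased estimate (normalised by N-1) would be an equally reasonable choice. The PDF is written out explicitly so that no toolbox function is needed.

```matlab
energyDb = [21.5; 23.0; 22.1; 24.3; 20.9; 22.8];   % made-up energy values in dB

mu     = mean(energyDb);        % estimated mean
sigma2 = var(energyDb, 1);      % ML variance estimate (divides by N, not N-1)

% Gaussian PDF with the estimated parameters.
gaussPdf = @(v) exp(-(v - mu).^2 / (2*sigma2)) / sqrt(2*pi*sigma2);

% Evaluate the PDF on a grid, e.g. for overlaying on a normalised histogram.
vGrid   = linspace(mu - 4*sqrt(sigma2), mu + 4*sqrt(sigma2), 2001);
pdfVals = gaussPdf(vGrid);

% Sanity check: the PDF should integrate to approximately 1.
areaUnderPdf = trapz(vGrid, pdfVals);
```

Overlaying `pdfVals` on a histogram of the data that has been normalised to unit area is one way to judge visually how well the Gaussian assumption holds.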
Deliverables:
Include in the report (for each phoneme and each condition):
• A table with the values of the parameters of the Gaussian PDFs modelling each data set.
• Discuss the appropriateness of modelling using Gaussian PDFs.
3.5 Option (e): Exit the program
This option should exit the program.
4 Report and Marking criteria
Attach to your submission: the report, the files requested in each of the tasks, and a .zip file containing all your
m-files.
Marking will be according to the following criteria:
• Correctness of operation and completeness of part (a) [ 15 points ]
• Correctness of operation and completeness of part (b) [ 20 points ]
• Correctness of operation and completeness of part (c) [ 20 points ]
• Correctness of operation and completeness of part (d) [ 15 points ]
• Matlab programming – demonstration of suitable use of programming concepts and code efficiency
[ 15 points ]
• Quality of report – formatting, English, figures with labels, etc [ 15 points ]
END