Laboratory #4
Data Analysis and Pattern Recognition

1. Pattern Recognition Toolbox

1.1 Introduction

The experiments in Lab 4 use a MATLAB program by Michael Heinz called the Pattern Recognition and Feature Extraction Toolbox. (The manual provides a systematic description of all of its features.) The purpose of this part of Lab 4 is to familiarize you with the program and to review pattern classification concepts. Be aware of the following facts about the program:
  1. It is designed for 2-category classification.

  2. It is designed for 2-dimensional (x-y) input data. In the d-by-2 data matrix, each of the 2 columns is a feature and each of the d rows is a different data point.

  3. It reads data from files, not from MATLAB variables. Whenever you want to create your own data, you must use data as the variable name and save it to a file with the MATLAB command save <filename> data

  4. It provides several tools, but you can use only one tool at a time.
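
As a rough illustration of facts 2 and 3, the sketch below builds a two-column matrix named data and saves it to a file. The cluster centers, spreads, point count, and the file name mydata are all made up for the example. This section does not say how the program distinguishes the two categories, so the sketch only stacks the points into one matrix.

```matlab
% Sketch only: build a two-column matrix named "data" and save it to a file.
% Centers, spreads, point count, and file name are illustrative assumptions.
n = 50;                                               % points per cluster (assumed)
clusterA = randn(n, 2) * 0.5 + repmat([1 1], n, 1);   % points scattered around (1, 1)
clusterB = randn(n, 2) * 0.5 + repmat([3 2], n, 1);   % points scattered around (3, 2)
data = [clusterA; clusterB];                          % rows = data points, columns = features
save mydata data                                      % writes mydata.mat holding the variable "data"
```

Typing size(data) afterwards should report 100 rows and 2 columns.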

1.2 Getting Started

0. Make a local copy of the cs436/Lab4 directory in your own file area.

1. Launch MATLAB.

2. Change to your directory (e.g., at the MATLAB prompt, type cd my-dir).

3. Type lab4

4. Look at Command-Bar Menus. You should see:
Graph -> Lab Data ----- Purpose: Plot 1-d data from 1 or 2 files

Analysis -> Envelope ----- Purpose: Extract waveform features (not for Lab 4)
Analysis -> Covariance ----- Purpose: Do statistical analysis
Analysis -> DFT ----- Purpose: Extract spectral features (not for Lab 4)

Cluster -> k-Means ----- Purpose: Apply k-means procedure
Cluster -> Nearest Neighbor ----- Purpose: Apply nearest-neighbor procedure

Link -> Real Time Links ----- Purpose: Acquire real-time data (not for Lab 4)
Exit -> Close HCI Lab ----- Purpose: The clean way to quit

1.3 Exploring the Graph tool

1. At the MATLAB prompt, generate and save a sine wave by typing
data = sin(0:0.1:300)';
save temp data;
2. Reselect the HCI Lab window

3. Choose Graph -> Lab Data
(A window should appear along with a new "Options" menu.)

4. Choose Options -> Plot Data -> Single File
A popup menu will ask for the file name; type temp and select "Continue"
A popup menu will show the number of data points and will ask for the range of points; just select "Continue"

5. Repeat step 4, but this time set the range to 1 to 300.
6. Repeat step 4, but this time choose Options -> Plot Data -> Two Files; type temp as the name of each file and choose your own data ranges -- but keep them in the legal range!
7. Choose Options -> Exit
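
If you want to check the saved file without the Graph tool, the steps above can be approximated at the MATLAB prompt. This is only a sketch using standard MATLAB commands, not the toolbox's own plotting code; the variable names numPoints and range are introduced here for illustration.

```matlab
data = sin(0:0.1:300)';       % same signal as in step 1, stored as a column vector
save temp data;               % writes temp.mat
clear data;                   % remove the variable so the load below is a real test
load temp;                    % restores the variable "data" from temp.mat
numPoints = length(data);     % number of samples available for plotting
range = 1:300;                % the range used in step 5
plot(range, data(range));     % plot only the selected samples
xlabel('Sample index');
ylabel('data');
```

The vector 0:0.1:300 has 3001 entries, so any range you enter in the tool should stay between 1 and 3001.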

1.4 Exploring the Analysis -> Covariance tool

1. Reselect the HCI Lab window

2. Choose Analysis -> Covariance

3. Choose Options -> Covariance Example
(You should see a plot with red and blue data points, asterisks at the means; note the different scales for the two axes.)

4. If the box at the bottom right does NOT say "Click for Euclidean Distances", click on "Toggle Distance Type" at the top center so that it does.

5. Position the cursor at a point half-way between the means and click. (You should see roughly equal distances; press any key to dismiss the message.)
6. Position the cursor at the top-most red point in Cluster 1 and click.
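7. On the upper-left pulldown (saying "Hide Contours"), choose "Euclidean Contours"; then select "Cluster 1". You should see ellipses. (Contours of constant Euclidean distance are circles, but they are drawn as ellipses when the two axes have different scales.)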
8. On the upper-left pulldown (saying "Euclidean Contours"), choose "Hide Contours".

9. On the upper-right pulldown (saying "Hide Separator"), choose "Euclidean Separator" (You should see the decision boundary based on Euclidean distance.)
10. Repeat 9, but choose "Mahalanobis Separator"
11. Click on "Toggle Distance Type" to get Mahalanobis distances. Then, by positioning the cursor and clicking, check the distances at the same points as in steps 5 and 6.
12. On the upper-left pulldown, choose "Mahalanobis Contours"; then select "Cluster 1".
13. Under "Options" choose "Exit"
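
To see why the two distance types can disagree, here is a minimal sketch that computes both by hand. The Euclidean distance from a point x to a cluster mean mu is the length of x - mu; the Mahalanobis distance is sqrt((x - mu) * inv(Sigma) * (x - mu)'), where Sigma is the cluster's covariance matrix. The mean, covariance, and test points below are hypothetical values chosen for illustration; they are not taken from the Covariance Example.

```matlab
% Hypothetical cluster, elongated along x (values chosen only for illustration).
mu    = [0 0];                 % cluster mean (row vector)
Sigma = [4 0; 0 0.25];         % covariance: std 2 along x, std 0.5 along y
p1 = [2 0];                    % test point along the long axis of the cluster
p2 = [0 2];                    % test point along the short axis of the cluster

% Euclidean distance ignores the shape of the cluster.
dE1 = norm(p1 - mu);
dE2 = norm(p2 - mu);

% Mahalanobis distance rescales each direction by the cluster's covariance.
% (v / Sigma) is MATLAB's way of computing v * inv(Sigma).
dM1 = sqrt((p1 - mu) / Sigma * (p1 - mu)');
dM2 = sqrt((p2 - mu) / Sigma * (p2 - mu)');
```

Both points are at Euclidean distance 2 from the mean, but their Mahalanobis distances are 1 and 4. With real data stored in a matrix X (one point per row), you would estimate the parameters with mu = mean(X) and Sigma = cov(X).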

1.5 Exploring the Cluster -> K-means and Cluster -> Nearest Neighbor tools

1. Choose Cluster -> K-means

2. Choose Options -> K-means Example 2
3. Using the popup menu, select 2 clusters and "Continue"; watch as the K-means procedure clusters the data
4. Repeat steps 2 and 3 using 4 clusters
5. Choose Options -> K-means Example 1
6. Select 2 clusters and "Continue"; watch as the K-means procedure clusters the data
7. Repeat steps 5 and 6 using 4 clusters
8. Choose Options -> Exit

9. Choose Cluster -> Nearest Neighbor

10. Choose Options -> Nearest Neighbors Example 1

11. Using the popup menu, select t=1.0 and "Continue"; watch as the Nearest-Neighbor procedure clusters the data
12. Repeat steps 10 and 11 using t=0.6
13. Choose Options -> Nearest Neighbors Example 2

14. Try all 4 values of t
15. Choose Options -> Exit, and then choose Exit -> Close HCI Lab.
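
For reference, the K-means procedure you watched in steps 1-8 can be sketched in a few lines of plain MATLAB. This is a generic textbook version and may differ in detail from the toolbox's implementation (for example, in how the initial centers are chosen). The synthetic data and all variable names are assumptions made for the example.

```matlab
% Minimal k-means sketch on synthetic 2-D data (all values are illustrative).
X = [randn(40, 2) * 0.3; randn(40, 2) * 0.3 + 3];   % two blobs, near (0,0) and (3,3)
k = 2;                                % number of clusters requested
n = size(X, 1);
idx = randperm(n);
centers = X(idx(1:k), :);             % start from k randomly chosen data points
labels = zeros(n, 1);

for iter = 1:100
    % Assignment step: squared Euclidean distance from every point to every center.
    dist2 = zeros(n, k);
    for j = 1:k
        diffs = X - repmat(centers(j, :), n, 1);
        dist2(:, j) = sum(diffs .^ 2, 2);
    end
    [~, newLabels] = min(dist2, [], 2);

    if isequal(newLabels, labels)
        break;                        % assignments stopped changing: converged
    end
    labels = newLabels;

    % Update step: move each center to the mean of its assigned points.
    for j = 1:k
        members = X(labels == j, :);
        if ~isempty(members)
            centers(j, :) = mean(members, 1);
        end
    end
end
```

Changing k to 4 mimics steps 4 and 7. Because the starting centers are random, repeated runs can end with different clusterings.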
