Schedule for STAT 663 / CSI 773

The tentative schedule for Fall 2009

The instructor reserves the right to improve and adjust the ambitious schedule.

To save assignment files in your chosen directory, right-click the link and choose "Save Link As".

Key: CL = Cleveland's Visualizing Data; E&H = Everitt and Hothorn's A Handbook of Statistical Analyses Using R; V&R = Venables and Ripley's Modern Applied Statistics with S

Week
Topics
Readings, References and Notes
Assignments

1- Sept 1

Getting Started
Syllabus: Texts
Assignments
Quizzes and MidTerm Exams
Redesign Project
Final Project
Background: Instructor description
Student descriptions
Logistics: Class Web Site
Student Web Pages
Topics:

Data exploration versus probabilistic inference

R Introduction
  Tufte's The Visual Display of Quantitative Information
Sample of Graphics CCmaps
Linked micromaps
CrystalVision

CL: p1-20

Note: E&H is now suggested, not required
E&H: Chapter 1

Create a web site
Example web page
Email the link to dcarr@gmu.edu

R Installation and Resources

Tufte's ask E.T. page

a1_R commands
a1_Graphics building blocks
a1_Looping and Matrices

Create a web site

Go to the web site below.
Experiment with NameVoyager
Click on the Tab for NameMapper
Experiment with this, especially the
timeline view under the timeline tab

Get one screen shot to include
with your assignment.

BabyNameWizard

Only turn in the graphics building blocks
assignment and the screen shot.

Due: Sept 15

2- Sept 8
Foundations of Graph Design
Topics: Representation systems and communication goals
  Psychophysics laws
  Visual processing channels
  Textons, discrimination, and perceptual grouping
  Perceptual accuracy of extraction
  Memory limits and computational weaknesses
  Visual layering, sorting, reference simplification
Redesigns Converting tables to plots
Reasoning Magical Thinking and Data Exploration
  Row-labeled bar plot
R Panel Layouts, Writing Functions

CL: 21-41 QQplots, Distributional Comparisons, Basic Graphics

Lecture Notes:

Perception Cognition and Graphics Guidelines

Psychophysics and Eyes

Change Blindness ppt


panelFunctions

a2_Panel Layouts
a2_Lecture examples
a2_Writing functions and bar plots
a2_Lattice graphics

Due: Sept 22

3- Sept 15

Describing and Comparing
Univariate Distributions

Topics: Quantile and CDF Plots
 

Boxplots

  density plots
 

transformations

  QQplots and interpretation
  MD and SL plots
  Paired comparison versus distribution comparison
Tools: Color Brewer
Redesign  
R

Density, Quantiles, Boxplots, QQplots,
Parallel quantiles and cumulative percents

CL: p42-85 Transformation
CL: p86-109 (Skip "For the Record", p90-91)
E&H: Density Estimation p109-130

Data and Data tables
Univariate plots

gridPlotFunctions

a3_Density plots
a3_Quantiles
a3_Qqplots

Tretinoin.csv
a3_Boxplots

a3_dotplots

Due: Sept 29

4- Sept 22

Bivariate Data
Scatterplots and Smooths,
Scatterplot Matrices and Scagnostics

Topics: Scatterplots
Smoothes y=f(x)

Scatterplot Matrices

Linked Brushing
CrystalVision
(Scagnostics)
Parallel Coordinates
Redesign Washington Post Example
  Converting tables to plots (from week 2)
R Color control, smoothing, etc.

CL: 3.2-3.6, Loess Fitting p91-127

Data for your interest
CV_BodyCenteredCube.txt

One hour of network sessions shown in class
24 hours: 307235 cases and 12 variables
GreenNet1May02hr0.csv
a4_CV_LANL Network_Sessions

Assignment topic for next year
a4_Scagnostics

hamster.csv
a4_Smoothes y=f(x)

a4 Scatterplot Matrix

gemIdGene.csv
genePNAS.txt
genePNASlines.txt
CrystalVisionDemo
a4_CV Genes

 

 

Due: Oct. 6

5- Sept 29

Bivariate Densities
and Trivariate Data

Topics: Hexagon binning
  Boxplots binning and maps
  Bivariate density estimates
  Contours and surfaces
  Smoothes z=f(x,y)
  3-D scatterplots
  Stereo Construction - no graphics :-(
  Rotation
  Conditioned views and differences
Reasoning Common human flaws
Redesign Examples from Canada

Notice:

Exam in Week 8

 

CL: 3.7-3.9 Slicing, Discrete Values, Transformations p128-145:

Many lecture comments are in the assignments.

 

a5_3-D scatterplots

a5_ conditioned views and differences

 

 

 

hbinFunctions
a5 Bivariate point density

a5_Variables z=f(x,y)
a5_Rotate

a5_ggplot2

Due: Oct. 13

6- Oct 6

Maps and
Geospatially Indexed Statistics

Topics: Map projections
Map symbolization
Choropleth maps
Map legends
Conditioned choropleth maps
Linked micromap maps
Redesign  
Reasoning

Great arc distance
Projections and data density
Modifiable areal unit problem

Download Linked Micromap Software from NCI
Works with your own shapefiles and data.
Includes pdf tutorials.

Linked Micromaps over the web at NCI

MidTerm Exam 1 Scope and Selected Answers

Mid Term Exam 1 Sample

 

stateVisibilityBorders
nationVisibilityBorders
stateNamesFips
stateUnemployment95

panelLayoutObjects.txt
stateMicromapObjects.txt

Statelungwm50_69.txt
Statelungwf50_69.txt
Statelungwm70_94.txt
Statelungwf70_94.txt
Countylungwm70_94.txt
Countylungwf70_94.txt

a6_Linked Micromaps
a6 stateMicromaps
a6_choropleth map

a6_CCmaps
CCmaps web page

Due: Oct. 27

No Class

Oct 13

Monday's Classes Meet on Tuesday
   

7- Oct 20

Smoothing, Hot Spots
and Comparing Maps

Topics: Geospatial Smoothing
Scan Statistics for hot spots
Showing map differences explicitly
Satellite images
Comparative Indexed Maps Series
Notice
Redesigns Due Week 9

CL: 3.10 Bivariate Distributions p146-151
CL: 3.11 Time Series p152-169

Comments on Map Smoothing

 

Graph Redesign Guidelines

 

hsaCentroids.csv
hsaPoly.csv
hsaWMColonMort.csv
usGrid90Cell.csv

usGrid90States.csv
usGrid90Centroids.csv
usGrid90Clip.csv
albersLegend.r
a7_Map smoothes z=f(x,y)

ndviSept82.txt
ndviSept92.txt

a7_satelliteImage

 

Due: Nov 3

8- Oct 27

 

Regression

Topics:

Multiple Linear Regression

Logistic Regression
Regression diagnostics
(LARS and Lasso LARS)
Exam MidTerm Exam
Video (Buja and Stuetzle: Focusing and Linking)

CL:4.6-4.11 Level Plots/Stereo 228-269
E&H: Multiple Linear Regression
E&H: Logistic Regression

Modeling and Visualization
Regression Specification and Diagnostics

 

Redesign Order 2009

abrasionLoss.txt

a8_Regression and diagnostics.r

a8_rgl 3D surface residuals spanning tree

a8_Logistic regression

 

Due: Nov 10

9- Nov 3

Redesign
Presentations

Redesign
Presentation

 

Final Project Guidelines

Next year assignment additions:

Cancer screening tables
Error rate tables

 

Due: Nov 17

10- Nov 10

 

Clustering and Graphics

Topics: Scaling and transformations
  Dissimilarities and distance
  Clustering Methods
  Graphics for cluster evaluation
  Finding Lower Dimensional Clusters
  Dendrograms and heat maps
  Hexagon cluster graphs
  Clustering for Compression
 

Showing Earth Grid Cell Clusters

E&H: Cluster Analysis p243-257

Notes: Distances and Dissimilarities

Notes: Clustering Comments

Showing Earth Cell Clusters
This talk is about visualizing
multivariate, multi-altitude atmosphere data
for the whole earth. NASA has addressed the
hard processing steps involving many terabytes of
data from the Atmospheric Infrared Sounders (AIRS).
This is collaborative work with Amy Braverman, JPL.
gemIdGene.csv
heptathlon

hypertree

a10_scaling and transformations
a10_dissimilarities and clustering
a10_cluster subspaces

 

Due: Nov 24

11- Nov 17

Dimension and Resolution Reduction
Projection and Sectioning

Topics: Scaling revisited
Resolution reduction
Multidimensional scaling
Principal components
SVD and matrix approximation
  Seriation and appearance
Projection, rotations and tours
Projection, sections and dimensionality
Notes: Seeing Constraints (PDF file)

Notes are embedded in the assignments.

 

Next year candidate assignment
a13_som

 


geneMstRowSubs
geneMstColSubs
centeredMstRowOrder
centeredMstColOrder
svdM9MstRowSubs
svdM9MstColSubs

ColorMatView
ColorCorMatBars

a11_RowAndColumnOrderingForGraphics
a11_dimensionReduction


Due: Dec 1

12- Nov 24

Data Mining and Graphics:
CART, Random Forest, Rule Fit

Topics: Transformations
Classification and regression trees: rpart()
Random forests
Rule fit (brief comments)
Notice: Projects and Presentations are the Final Exam

E&H Recursive Partitioning p131-142

The assignments include lots of descriptions

Possible assignments for next year
a13_Discriminant Analysis

a13_Support Vector Machines

a12_Recursive Partitioning
a12_Random Forests

Due: Dec 8

13- Dec 1

Layouts and Layering,
Coordinate Construction and Glyphs

Topics: Layering and grouping
Visual layering and layer controls
Space filling layouts
Cluster layouts and spring models
Hexagon cluster graphs
Coordinate construction for statistics indexed by letter sequences
Videos: Escher (if time)
Failure to Communicate (if I can find it)
CL: 6. Multiway Data 302-340

Paper Length
Individuals should target 20 pages
Groups of two should target 30 pages


CPI data
panelEdge3d
spaceFillLayouts

a13_Embellish
a13_Glyphs
a13_Cluster layouts

Due: Dec 15


14- Dec 8

Coordinate Construction
Cognitive Simplicity Reward Functions
Coherence Analysis

Topics: Coordinate construction for statistics indexed by letter sequences
Variable encoding coordinate construction
Measures of Simplicity
Simplicity induced higher order measures
Glyph encodings
Videos: Escher (if time)
Failure to Communicate (if I can find it)
CL: 6. Multiway Data 302-340

 

Paper Length
Individuals should target 20 pages
Groups of two should target 30 pages



No Assignments

Work on Final Project


Final- Dec 15
Project Presentations
All Final Project Papers Due