**POLS 6482 ADVANCED MULTIVARIATE STATISTICS
Tenth Assignment
Due 12 November 2001**

- This problem is a continuation of our analysis of the 105th and 106th congressional district data that we analyzed in homeworks 3, 4, 5, 6, and 8. I made some additional corrections to the file, so download the new **Stata** file below:

105th and 106th Congressional District Data (HDMG106.DTA)

- Download the following **Excel** file:

2000 Census Data For Congressional Districts (CD2000CENSUS.XLS)

and merge it into HDMG106.DTA. (*Note that this will require some ingenuity on your part*!) Use the following variable names and definitions (note that the variable **district** also appears in the **Excel** file, but it will be one of the variables you will need to use to do the merge):

| Variable | Storage type | Display format | Variable label |
|---|---|---|---|
| **statenmlong** | str20 | %20s | Name of state (long) |
| **total_pop** | double | %10.0g | Total population of CD 2000 |
| **white00** | float | %9.0g | Percent White 2000 |
| **black00** | float | %9.0g | Percent Black 2000 |
| **asian00** | float | %9.0g | Percent Asian 2000 |
| **hispanic00** | float | %9.0g | Percent Hispanic 2000 |
| **owner00** | float | %9.0g | Percent Owner-Occupied Housing Units 2000 |

Do the **d** and **summ** commands, and report the results.
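The mechanics of a two-key merge can be sketched on toy data. This is only an illustration under stated assumptions: the three rows and their values are made up, the key pair (**statenmlong**, **district**) is a guess at what identifies a district in both files, and the `merge 1:1` syntax assumes a reasonably recent Stata (older releases require sorting both files and use `merge varlist using filename`). Getting the Excel sheet into Stata in the first place is a separate step not shown here.

```stata
* Toy "census" file: hypothetical values, NOT the real CD2000CENSUS data
clear
input str20 statenmlong district black00
"Alabama" 1 28.1
"Alabama" 2 30.2
"Alaska"  1  3.5
end
tempfile census
save `census'

* Toy "master" file standing in for HDMG106.DTA (hypothetical values)
clear
input str20 statenmlong district south
"Alabama" 1 1
"Alabama" 2 1
"Alaska"  1 0
end

* Match each district in memory to its census row on the two keys
merge 1:1 statenmlong district using `census'

* _merge == 3 means the row was found in both files
tabulate _merge
describe
summarize
```

Any row with `_merge` equal to 1 or 2 failed to match, which usually points to state names or district numbers coded differently in the two files.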

- In **STATA** run the regressions:

**regress bush00 black00 south hispanic00 income owner00 dwnom1n dwnom2n dole96**

**regress gore00 black00 south hispanic00 income owner00 dwnom1n dwnom2n clint96**

Analyze these two regressions. What do you think accounts for the differences between them? Be specific.

- In **STATA** compute the correlation matrix for the independent variables:

**correlate black00 hispanic00 income owner00 dwnom1n dwnom2n**

Examine the entries of the correlation matrix. Do you see anything that strikes you as odd? Be explicit.

- To obtain the eigenvalues and eigenvectors of the correlation matrix, use the **STATA** command:

**factor black00 hispanic00 income owner00 dwnom1n dwnom2n, pc**

To obtain a graph of the eigenvalues, use the **STATA** command:

**greigen, xlabel(1,2,3,4,5,6)**

Does this graph lead you to believe that there is a significant problem with multicollinearity with these independent variables? Why? Why not?
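For readers on a newer Stata release, the eigenvalue steps above can be sketched as follows. This assumes a version in which `pca` and `screeplot` are available (to my knowledge they took over the roles of `factor ..., pc` and `greigen`), and it uses the `auto` dataset that ships with Stata purely as a stand-in, so the variables are not the assignment's.

```stata
* Stand-in data shipped with Stata; NOT the congressional district file
sysuse auto, clear

* Correlation matrix of the candidate regressors
correlate price mpg weight length

* Principal components of that correlation matrix
pca price mpg weight length

* Copy the eigenvalues out of the stored results and list them
matrix ev = e(Ev)
matrix list ev

* Plot eigenvalues against component number
screeplot
```

The eigenvalues of a correlation matrix sum to the number of variables, so an eigenvalue close to zero signals that some linear combination of the variables has almost no independent variation.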

- This problem deals with congressional elections. Below you will find a dataset that includes variables created by David Lublin and Gary Jacobson. The observations are congressional districts for the 1960 to 1994 period. Some of the data are missing, so when you run regressions you may not have the entire time period. To bring up the dataset in **Stata** you will have to increase the default memory size. To do this, use the command:

**set mem 20m**

which allocates 20 meg of memory for **Stata** to work with.

Congressional Elections Data From Lublin and Jacobson (Stata Dataset)

Download the dataset and bring it up in **Stata**. If you issue the **d** command you will see:

**. d**

Contains data from D:\statadat\lublin5.dta
obs: 7,832; vars: 39; 1 Nov 2001 11:35; size: 1,057,320 (98.7% of memory free)

| variable name | storage type | display format | value label | variable label |
|---|---|---|---|---|
| year | int | %8.0g | | year |
| congress | byte | %8.0g | | congress (87-104) |
| icpsrid | long | %12.0g | | icpsr id # |
| icpsrst | byte | %8.0g | | icpsr state code |
| cdist1 | byte | %8.0g | | cong. district (p&r) |
| statenm | str7 | %9s | | state name |
| cdist2 | byte | %8.0g | | cong. district (lublin) |
| dempct | float | %9.0g | | demo. % two party vote |
| blkpct | float | %9.0g | | black percent of pop. |
| whpct | float | %9.0g | | white percent of pop. |
| forpct | double | %10.0g | | foreign born % of pop. |
| south | byte | %8.0g | | south (1=confederacy + KY +OK, 0=north) |
| incomewh | float | %9.0g | | white median family income |
| incomebl | long | %12.0g | | black median family income |
| hs25 | float | %9.0g | | percent 25 and older completing high school or more |
| college | float | %9.0g | | percent 25 or older completed 4 yrs college or more |
| party1 | int | %8.0g | | party code (100=Dem, 200=Rep) |
| blackrep | byte | %8.0g | | blackrep =1 if black representative, 0 otherwise |
| latinorp | byte | %8.0g | | latinorp=1 if mexican, 2=PR, 3=Cuban, 0 otherwise |
| womanrep | byte | %8.0g | | woman representative (1=woman, 0=man) |
| incumb1 | byte | %8.0g | | incumbency (0=repub, 1=demo., 2=open) |
| votesd | long | %12.0g | | number of votes for democrat |
| votesr | long | %12.0g | | number votes for republican |
| demvshr | float | %9.0g | | democrats share two-party vote |
| whowon | byte | %8.0g | | 0 = repub won, 1= demo. won, 99=3rd party won |
| incshr | float | %9.0g | | incumbents share 2-party vote, 99.9=unopposed |
| incshrl | float | %9.0g | | incumbents share 2-party vote last elect, 99.9=unpposed |
| redist | byte | %8.0g | | redistricted: 0=district unchange, 1=re-districting |
| incumbst | byte | %8.0g | | incumbency status: 0 = republican incumbent; 1 = democratic incumbent; 2 = open seat formerly held by democrat; 3 = open seat formerly held by republican; 4 = open seat, new (from redistricting); 5 = two incumbents (from redistricting); 9 = third-party incumbent |
| challeng | byte | %8.0g | | challenger quality: 0 = challenger has not held elective office; 1 = challenger has held elective office; 2 = only Democratic candidate for open seat has held office; 3 = only Republican candidate for open seat has held office; 4 = both candidates for open seat have held office; 5 = no challenger; 6 = no Democrat candidate (open); 7 = no Republican candidate (open) |
| challenh | byte | %8.0g | | challenger misc. information: 0 = Nothing special (ignore); 1 = At Large or multi-candidate race; 2 = unopposed; 3 = incumbent switched parties since last election; 4 = challenger was state legislator; 5 = only Democrat was state legislator (open seat); 6 = only Republican was state legislator (open seat); 7 = both candidates for open seat were state legislators; 8 = challenger is former U.S. Representative; 9 = odd race, third party; in general, DO NOT USE |
| icpsrid2 | long | %12.0g | | icpsr id number |
| party2 | int | %8.0g | | party id (100=Dem, 200=Repub) |
| name | str11 | %11s | | member name |
| dwnom1 | float | %9.0g | | dwnominate 1st dimension |
| dwnom2 | float | %9.0g | | dwnominate 2nd dimension (multiply by .3) |
| partynm | str13 | %13s | | name of political party |
| xincome | long | %12.0g | | median family income |
| xhispct | float | %9.0g | | percent hispanic |

Sorted by:

If you issue the **summ** command you will see:

**. summ**

| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| year | 7832 | 1976.996 | 10.37915 | 1960 | 1994 |
| congress | 7832 | 95.49783 | 5.189574 | 87 | 104 |
| icpsrid | 7832 | 12325.2 | 7208.363 | 2 | 95120 |
| icpsrst | 7832 | 36.75447 | 21.00158 | 1 | 82 |
| cdist1 | 7832 | 9.979443 | 10.88324 | 1 | 99 |
| statenm | 0 | | | | |
| cdist2 | 7832 | 9.566394 | 9.151734 | 1 | 52 |
| dempct | 7832 | 56.98605 | 23.56704 | 0 | 100 |
| blkpct | 7595 | 11.12508 | 14.51647 | .0194025 | 95.50033 |
| whpct | 7595 | 85.85531 | 15.74295 | 3.862633 | 99.89686 |
| forpct | 6723 | 5.732316 | 6.60551 | .116483 | 58.52188 |
| south | 7832 | .2858784 | .4518606 | 0 | 1 |
| incomewh | 5859 | 17896.74 | 11820.14 | 2088.375 | 78717 |
| incomebl | 5856 | 12378.12 | 8885.767 | 1213 | 66320 |
| hs25 | 6723 | 57.11782 | 15.5113 | 14.8 | 92.3 |
| college | 6723 | 12.88006 | 7.004294 | 1.9 | 51.4 |
| party1 | 7830 | 140.3649 | 49.22029 | 100 | 329 |
| blackrep | 7832 | .0390705 | .1937751 | 0 | 1 |
| latinorp | 7832 | .020429 | .1716504 | 0 | 3 |
| womanrep | 7832 | .046859 | .2113504 | 0 | 1 |
| incumb1 | 7817 | 1.242548 | .6364613 | 0 | 3 |
| votesd | 6382 | 83320.55 | 53081.79 | 0 | 1872351 |
| votesr | 6443 | 71985.16 | 57503.79 | 0 | 1786018 |
| demvshr | 7832 | 57.00632 | 23.59968 | 0 | 100 |
| whowon | 7832 | .601762 | .525787 | 0 | 9 |
| incshr | 7832 | 71.56447 | 18.54318 | 20.6 | 99.999 |
| incshrl | 7832 | 69.86711 | 17.08056 | 22.1 | 99.999 |
| redist | 7832 | .2893258 | .4534784 | 0 | 1 |
| incumbst | 7832 | .8476762 | .8639607 | 0 | 9 |
| challeng | 7832 | 1.135981 | 1.803747 | 0 | 9 |
| challenh | 7832 | 1.124362 | 2.017193 | 0 | 9 |
| icpsrid2 | 7832 | 12325.2 | 7208.363 | 2 | 95120 |
| party2 | 7832 | 140.4164 | 49.26602 | 100 | 329 |
| name | 0 | | | | |
| dwnom1 | 7832 | -.0354424 | .3335639 | -1.07 | 1.37 |
| dwnom2 | 7832 | .0107231 | .5186352 | -1.83 | 1.43 |
| partynm | 0 | | | | |
| xincome | 6723 | 15494.69 | 10600.03 | 1968 | 64199 |
| xhispct | 4780 | 6.610872 | 11.38954 | .0137409 | 83.71677 |
| order | 7832 | 3916.5 | 2261.048 | 1 | 7832 |

- Your assignment is to build a model of the **Democratic Vote Share**. That is, use **demvshr** as your dependent variable (note, do **not** use **dempct** -- it has some errors in it!). You are free to use any independent variables you want but **you must include median family income** (**xincome**) in your specification. Whatever other independent variables you use, you **must** have a reasonable explanation for your specification!

- Note that **xincome** is in **nominal** dollars! To see the distribution of **xincome** use the graph command in Stata; namely:

**graph xincome congress**

To correct the **xincome** variable as well as the **incomewh** and **incomebl** variables, you need to apply a price deflator. For Congresses 88 - 91 use 100/90.6, for 93 - 97 use 100/125.3, for 98 - 102 use 100/289.1, and for 103 - 104 use 100/420.3. These transformations will correct the income variables to 1967 dollars.
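The deflator step above can be sketched on toy data as follows. The five rows are made-up values, and the names `deflator` and `xincome67` are hypothetical. Because the stated ranges do not cover Congresses 87 and 92, this sketch leaves the deflator (and so the deflated income) missing for those observations rather than guessing a value.

```stata
* Toy rows with hypothetical incomes, one per deflator range plus Congress 92
clear
input congress xincome
 88  5000
 92  8000
 95  9000
100 20000
104 35000
end

* Deflator by congress, exactly as given in the assignment text;
* congresses outside the listed ranges (87, 92) stay missing
generate double deflator = .
replace deflator = 100/90.6  if inrange(congress, 88, 91)
replace deflator = 100/125.3 if inrange(congress, 93, 97)
replace deflator = 100/289.1 if inrange(congress, 98, 102)
replace deflator = 100/420.3 if inrange(congress, 103, 104)

* Income in 1967 dollars (hypothetical variable name)
generate double xincome67 = xincome * deflator

list congress xincome deflator xincome67
```

The same `deflator` variable can be multiplied into **incomewh** and **incomebl** to put all three income measures on the 1967 base.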

- When you have settled on your specification and have finished your analysis using **Stata**, paste the variables that you settled on into **EVIEWS** and replicate your analysis using **EVIEWS**.
