# Statistics and plotting on the TI-89 family

## by Sam Jordan

### edited by Ray Kremer

For those who find the TI-89 family manual hard to follow when it comes to using the built-in statistics functions, I have revamped Sam Jordan's TI-86 statistics tutorial so that it applies to the TI-89 family. If you are doing a lot of statistics with your TI-89 family calculator, you might also be interested in TI's Statistics Flash Application, which replicates the TI-83's statistical abilities.

Other guides for TI-89 family statistics have been written by Karl Parr and William Larson.


#### Statistical Analysis

The first step in performing a statistical analysis on the TI-89/92(+) is determining the type of data you will be working with. The two types are single variable data sets, and paired variable data sets. On the TI-89/92(+) these are called OneVar and TwoVar (short for One variable and Two variable) data sets.

An example of a single variable data set would be the height in inches of all male students in the Statistics 101 class. Let us assume that there are 9 male students in the statistics class with the following heights measured to the nearest inch.

64, 66, 68, 69, 70, 70, 71, 72, 74

The TI-89/92(+) will perform a single variable analysis of this data if we enter it in the Data/Matrix Editor, and store it in a data variable. Use the Apps menu to create a data variable and open it in the editor. Type the data into the column "c1".

We can now find the "OneVar" or "one variable" statistics for this set of data. Enter the Calc window by pressing F5. Change the "Calculation Type" drop down box to OneVar. Arrow down to the "x" line, type c1, and hit [ENTER] to save it. Hit [ENTER] again to run the analysis.

This will display the following:

STAT VARS
x-bar =69.333333
Ex =624.
Ex2 =43338.
Sx =3.041381
nStat =9.
minX =64.
q1 =67.
MedStat =70.
q3 =71.5
maxX =74.

I couldn't type the exact variable names that appear in the output above, but they are described in the manual. In order, the values shown are the:

Mean
Sum of X
Sum of squared X
Sample Standard Deviation of X
Number of data values
Minimum X value
1st Quartile
Median X
3rd Quartile
Maximum X

The population standard deviation isn't displayed, but it is calculated and stored to the variable sigma x. You can see the value by typing sigma x into the entry line and hitting [ENTER]. The lowercase sigma can be obtained from the Char - Greek menu (item G) or by using the following keystrokes:
On the TI-89: diamond ( alpha s
On the TI-92(+): diamond g s
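If you want to check the OneVar numbers away from the calculator, here is a minimal Python sketch that recomputes them for the nine heights. Python is not part of the original tutorial, and the dictionary keys are just illustrative labels, not the calculator's variable names. The quartile rule (median of the lower and upper halves, leaving out the middle value when the count is odd) is an assumption that happens to reproduce the q1 and q3 values shown above.

```python
import math

heights = [64, 66, 68, 69, 70, 70, 71, 72, 74]  # the c1 column


def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return float(sorted_values[mid])
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def one_var(values):
    """Recompute the one-variable statistics listed above."""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    return {
        "x_bar": mean,
        "sum_x": sum(xs),
        "sum_x2": sum(x * x for x in xs),
        "Sx": math.sqrt(ss / (n - 1)),   # sample standard deviation
        "sigma_x": math.sqrt(ss / n),    # population standard deviation
        "n": n,
        "minX": xs[0],
        # Assumed quartile rule: medians of the lower and upper halves,
        # excluding the middle value when n is odd.
        "q1": median(xs[: n // 2]),
        "med": median(xs),
        "q3": median(xs[(n + 1) // 2:]),
        "maxX": xs[-1],
    }


stats = one_var(heights)
for name, value in stats.items():
    print(f"{name:8s}= {value}")
```

The only difference between Sx and sigma x is the divisor: n - 1 for the sample value and n for the population value, so sigma x always comes out a little smaller.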

Since at least one of the values in the "c1" column is repeated, we could have entered "c1" as:

64, 66, 68, 69, 70, 71, 72, 74

while also filling in the "c2" column, which gives the "frequency" of each of the values in the "c1" column:

1, 1, 1, 1, 2, 1, 1, 1

which in this case means that the value "70" occurs 2 times in the original column.

Assuming that we have entered these two columns, and that we have exactly the same number of entries (8) in both columns, we go to the Calc window and choose OneVar and c1 as before, but this time arrow down to "Use Freq and Categories?" and select Yes. Then arrow down to the Freq line and type in c2 then [ENTER] to save it. [ENTER] again runs the analysis.
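To see why the value and frequency columns describe the same data as the original nine-item column, here is a small Python sketch (again, not something from the original tutorial; the function names are made up for illustration). It expands the two columns back into the full list and also computes the sums directly from the frequencies.

```python
values = [64, 66, 68, 69, 70, 71, 72, 74]  # the c1 column
freqs = [1, 1, 1, 1, 2, 1, 1, 1]           # the c2 column


def expand(values, freqs):
    """Rebuild the raw data by repeating each value 'frequency' times."""
    if len(values) != len(freqs):
        raise ValueError("value and frequency columns must be the same length")
    raw = []
    for value, freq in zip(values, freqs):
        raw.extend([value] * freq)
    return raw


def weighted_sums(values, freqs):
    """Count, sum of x, and sum of x squared, weighted by frequency."""
    n = sum(freqs)
    sum_x = sum(v * f for v, f in zip(values, freqs))
    sum_x2 = sum(v * v * f for v, f in zip(values, freqs))
    return n, sum_x, sum_x2


print(expand(values, freqs))
print(weighted_sums(values, freqs))
```

The length check mirrors the requirement that both columns have the same number of entries.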

That pretty much covers single variable statistics.

Two variable statistics are handled in much the same way as One variable stats.

Assume we have the following (x,y) pairs:

(1,3) (2,4.1) (3,5) (4,5.9)

These would be entered into the data editor:

1, 2, 3, 4 as c1
3, 4.1, 5, 5.9 as c2

Notice that both columns contain the same number of items. THIS IS IMPORTANT! If they don't contain the same number of items, you will get weird errors when trying to perform analysis of the data later! Once again, an optional third column may be used for frequency.

"TwoVar" appears as the default choice in the "Calculation Type" menu in the Calc window. Enter c1 for x and c2 for y and run the analysis. TwoVar will give the following:

STAT VARS
x-bar =2.5
y-bar =4.5
Ex =10.
Ex2 =30.
Ey =18.
Ey2 =85.62
Exy =49.8
Sx =1.290994
Sy =1.240967
nStat =4.
minX =1.
minY =3.
maxX =4.
maxY =5.9

Again, I couldn't type the exact output, but the actual variable names are described in the manual. The values are stored to variables, so you can use them afterward by typing the variable names (use Greek letters and the CHAR menu when necessary) or by taking the variable names from the VAR-LINK screen, where they appear if you go into F2 View and set it to display the system variables.
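The TwoVar sums can be checked the same way. This is a minimal Python sketch, not part of the original tutorial, and the dictionary keys are illustrative labels rather than the calculator's variable names.

```python
import math

xs = [1, 2, 3, 4]        # the c1 column
ys = [3, 4.1, 5, 5.9]    # the c2 column


def sample_std(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def two_var(xs, ys):
    """Recompute the two-variable sums and deviations shown above."""
    if len(xs) != len(ys):
        raise ValueError("x and y columns must be the same length")
    n = len(xs)
    return {
        "x_bar": sum(xs) / n,
        "y_bar": sum(ys) / n,
        "sum_x": sum(xs),
        "sum_x2": sum(x * x for x in xs),
        "sum_y": sum(ys),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
        "Sx": sample_std(xs),
        "Sy": sample_std(ys),
        "n": n,
    }


for name, value in two_var(xs, ys).items():
    print(f"{name:8s}= {value}")
```

Because the y values are decimals, the printed sums may show tiny floating-point rounding in the last digits that the calculator display hides.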

To do regressions on the same data as entered in c1 and c2, you can use the regressions from the "Calculation Type" menu. For example, you can perform a linear regression on that data by choosing "LinReg".

This will produce:
STAT VARS
y=a*x+b
a =.96
b =2.1
corr =.9987
R2 =.997403

which indicates the line:
y=.96*x+2.1
which has a linear correlation of .9987 with the data.
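For the curious, the LinReg numbers above are what an ordinary least-squares fit gives. The Python sketch below uses the textbook least-squares formulas; I am assuming, not claiming from the manual, that this matches what the calculator does internally. The function name is made up for illustration.

```python
import math

xs = [1, 2, 3, 4]        # the c1 column
ys = [3, 4.1, 5, 5.9]    # the c2 column


def lin_reg(xs, ys):
    """Least-squares fit of y = a*x + b, plus correlation and R squared."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx                      # slope
    b = y_bar - a * x_bar              # intercept
    corr = sxy / math.sqrt(sxx * syy)  # linear correlation coefficient
    return a, b, corr, corr ** 2


a, b, corr, r2 = lin_reg(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, corr = {corr:.4f}, R2 = {r2:.6f}")
```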

The equation is stored to the function variable regeq(). If in the Calc window's "Store RegEQ to" menu you choose one of the y(x) variables, it will be stored there as well.

#### Entry Line Statistics

I can't say that I've tried this, but Appendix A of the TI-89/92+ manual lists the syntax for invoking the statistical analysis commands from the entry line. This is useful mainly for programs. When a statistical calculation is run from the entry line, it does not display the results unless you follow it with the ShowStat command.

#### Stat Plotting

Assuming that you have data in a column:

1, 1, 2, 2, 3, 4 as c1

1. First open the Plot Setup window by pressing [F2]. Turn on Plot 1 by selecting Plot1 with the cursor and pressing [F1]. Use the "Plot Type" drop down menu to select the plot type that you want. For single variable data, the only valid types are Box Plot, Histogram, and Mod Box Plot. You can then cursor down to the "x" line, enter the name of the column in which you have stored your data, and press [ENTER] to save it. You can also activate the frequency area if you want to use a frequency column, which MUST have the same number of entries as your single variable list. When this data is correct, press [ENTER] to save the plot.

2. You can plot your data by hitting [diamond][GRAPH] which will draw the selected plot.

To plot two variable data, the instructions are similar to the above, except that the valid plot types then become either Scatter or xyline.
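To get a feel for what a Histogram plot of the sample column will show, here is a rough Python sketch that counts how many values fall into each bucket. It is only an illustration: the starting value and bucket width used here are assumptions I picked for the example, and the function name is hypothetical.

```python
data = [1, 1, 2, 2, 3, 4]  # the c1 column


def bucket_counts(values, start, width):
    """Count the values falling in [start + k*width, start + (k+1)*width)."""
    if width <= 0:
        raise ValueError("bucket width must be positive")
    counts = []
    for value in values:
        if value < start:
            continue  # values left of the first bucket are not counted
        index = int((value - start) // width)
        while len(counts) <= index:
            counts.append(0)  # grow the list until the bucket exists
        counts[index] += 1
    return counts


print(bucket_counts(data, start=1, width=1))
print(bucket_counts(data, start=1, width=2))
```

With a width of 1 each distinct value gets its own bar; a wider bucket merges neighboring values into one bar.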

#### Regression Equation Graphs

Once you have entered data, calculated basic stats on it, and plotted it as described above, you might also want to graph any regression equation you got during the analysis, just to see how it looks against the data you entered and plotted.

The good news is that this is fairly simple. After you've entered all your data and run your Regression function (usually "LinReg" for a Linear Regression) the calculator will store the actual Regression expression into one of the Statistics variables called "regeq". To graph it, all you need to do is go into the y= screen to enter a new graphing equation. Assuming that you use "y1=" the actual entry you make needs to be:

y1=regeq(x)

Now simply make sure that this is selected and graph it like you would any other equation. If you want to see what the actual expression is, you can execute
[2nd][Rcl] regeq
from the home screen.

It is often instructive to use the "Plot" instructions above to generate a "Scatter Plot" of your x and y lists (for two variable statistics) and then generate and graph the regression equation, so that both are displayed at the same time and you can see how well the equation fits the data. If you decide to try a different regression equation, all you have to do is go back to the stats menu and select the new regression function. The regression will run and store the equation into the same regeq variable as before, and you can then graph it.

If you DON'T want the "y1=" graph to change each time you run a new regression, then when you originally set it up, instead of entering
y1=regeq(x)
then at the "y1=" prompt you need to enter
[2nd][Rcl] regeq [ENTER]
to set it equal to the actual expression, which won't change the next time you assign something new to regeq by running a new regression.
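The difference between the two ways of setting up y1 is the difference between a live reference and a snapshot. The Python sketch below is only an analogy, not calculator code: the `Calculator` class is a toy model, and the coefficients of the second regression are made up purely for illustration.

```python
class Calculator:
    """Toy model: every regression run overwrites the 'regeq' variable."""

    def __init__(self):
        self.regeq = None

    def run_regression(self, a, b):
        # Store a new function y = a*x + b in regeq, replacing the old one.
        def equation(x, a=a, b=b):
            return a * x + b
        self.regeq = equation


calc = Calculator()
calc.run_regression(0.96, 2.1)  # the LinReg result from this tutorial


def y1_live(x):
    """Like y1=regeq(x): looks up whatever regeq is at graphing time."""
    return calc.regeq(x)


# Like recalling the expression into y1: keeps the equation as it is now.
y1_fixed = calc.regeq

calc.run_regression(2.0, 0.0)   # hypothetical second regression

print(y1_live(1))   # follows the new regression
print(y1_fixed(1))  # still uses .96*x+2.1
```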

#### Computing and Plotting Residuals

Residuals are defined as the difference between the "observed" and "predicted" values of a regression.

You begin with a set of x,y data pairs:
(1,3) (2,4.1) (3,5) (4,5.9)

You enter these into the data editor:
1, 2, 3, 4 as c1
3, 4.1, 5, 5.9 as c2

(Note that both columns have the same number of elements, and no frequency column is being used. If not, it won't work!)

After you've entered these lists, you can use the Plot Setup menu to plot them as a scatter plot.

You can also run "LinReg" to generate their regression equation which provides the line of "best fit" to the points. After doing this, you can assign the resulting "regeq" to one of the graph function variables so that you can graph it against your original scatter plot. The easiest way is simply to enter
y1=regeq(x)
on the y= screen. You can also set it equal to the actual regression equation by doing
[2nd][RCL] regeq
on the home screen to recall the actual equation value as:
.96*x+2.1
Follow it with [STO>] y1(x) and then hit [ENTER] to store it.

If you need to compute the "residual" difference between the regression equation and the original c2 values, that is done by returning to the data editor, moving the cursor to the box with c3 in it, and typing:
c2-y1(c1)

which in this case should leave c3 with the contents:
-.06, .08, .02, -.04

If you then want to plot the residuals, you may do so by defining a scatter plot with:
x as c1
y as c3

and make sure that you set your ymin and ymax values small enough based on the largest and smallest values in c3 so that you can see the plot.

This same technique can be used to plot the residuals of ANY regression.
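The residual calculation c2-y1(c1) can be reproduced in a few lines of Python as a check on the c3 values. This is a sketch, not part of the original tutorial; the 10% padding used to suggest ymin and ymax is an arbitrary choice of mine, not a calculator setting.

```python
xs = [1, 2, 3, 4]        # the c1 column
ys = [3, 4.1, 5, 5.9]    # the c2 column


def y1(x):
    """The regression equation found by LinReg above."""
    return 0.96 * x + 2.1


# Residual = observed value minus predicted value, i.e. c2 - y1(c1).
residuals = [y - y1(x) for x, y in zip(xs, ys)]
rounded = [round(r, 2) for r in residuals]
print(rounded)

# Suggest a window that shows every residual, with a little padding.
# The 10% padding is an assumption chosen only for readability.
pad = 0.1 * (max(residuals) - min(residuals))
ymin = min(residuals) - pad
ymax = max(residuals) + pad
print(f"ymin = {ymin:.3f}, ymax = {ymax:.3f}")
```

For a least-squares line with an intercept, the residuals add up to zero (apart from rounding), which is a quick sanity check on the c3 column.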

