Sunday, April 30, 2006

Advanced Business Statistics

Stats Chapter 1


1.1 Population and samples

A population is a set of existing units. Any characteristic of a population unit is called a variable. We carry out a measurement to assign a value to a variable for a population unit. When the values are numbers that represent quantities, the variable is quantitative. When the values are categories, the variable is qualitative, or categorical. When we measure every population unit we have a population of measurements. Examining every unit is called conducting a census.


A sample is a subset of the units in a population. When we measure the units in a sample we have a sample of measurements. Descriptive statistics is the science of describing the important aspects of a set of measurements. When we can only measure a sample of a population we use statistical inference, the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements. Often this means using the sample to estimate those aspects.
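
As a minimal sketch of the difference between a census and an estimate from a sample, the Python snippet below builds a made-up population of measurements and compares its mean with the mean of a random sample. The population values, the sample size of 100, and all variable names are assumptions chosen for illustration only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population of measurements: 10,000 made-up values.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Census: every unit is measured, so the population mean is known exactly.
population_mean = statistics.mean(population)

# Sample: only 100 randomly chosen units are measured.
sample = random.sample(population, 100)

# Descriptive statistics: summarize the sample of measurements.
sample_mean = statistics.mean(sample)

# Statistical inference: the sample mean serves as an estimate
# of the population mean.
print(f"population mean        = {population_mean:.2f}")
print(f"sample mean (estimate) = {sample_mean:.2f}")
```

The two numbers will usually be close but not equal, which is why the sample mean is called an estimate.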



1.2 Sampling a Population of Existing Units


To accurately reflect the population under study, a sample should be randomly selected. A random sample is used so that all elements of the population have an equal chance of being chosen. With a large population a random number table can be used to pick the units. If a chosen element is put back before the next draw, this is called sampling with replacement. If it is removed, this is called sampling without replacement. It is best to sample without replacement. To take a random sample we must have a list of the population units (a frame). Sometimes the population cannot be numbered to make a frame. In that case we use a systematic sample, in which we select every nth unit of the population, where n is a fixed number. It is important that a sample be representative of the population, so we want a random (or an approximately random) sample. A bad example is a voluntary response sample, in which participants select themselves. This is not a random sample.
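
The three ways of drawing units described above can be sketched in a few lines of Python. The frame of 20 numbered units, the sample size of 5, and the helper name `systematic_sample` are hypothetical; the standard library's random number generator stands in for a random number table.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical frame: a numbered list of 20 population units.
frame = [f"unit_{i:02d}" for i in range(1, 21)]

# Sampling without replacement: a chosen unit is removed, so it cannot repeat.
without_replacement = random.sample(frame, k=5)

# Sampling with replacement: a chosen unit is put back and may be drawn again.
with_replacement = random.choices(frame, k=5)

def systematic_sample(units, step):
    """Pick a random start among the first `step` units, then take every `step`-th unit."""
    start = random.randrange(step)
    return units[start::step]

every_fourth = systematic_sample(frame, step=4)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
print("systematic (n = 4): ", every_fourth)
```

Note that the systematic sample here still uses a list for convenience; in practice it is applied when no numbered frame is available, for example by selecting every nth unit as it passes by.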


1.3 Sampling a process


A process is a sequence of operations that takes inputs and turns them into outputs. Sometimes we just need to sample these outputs. We distinguish a finite population, one of limited size, from an infinite population, one that is unlimited.


When we look at the results we check for statistical control, that is, we see whether there are any unusual process variations. Using our measurements we create a run plot. This is a graph that shows our measurements over time. This chart is often called a line chart. If the process stays in statistical control, its performance is said to be predictable. Even when a process is in statistical control, it may not be capable of producing the desired output.
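
A run plot can be roughed out even without a charting library. The sketch below prints one text line per time point and places a marker according to the measured value, so an unusual reading stands out by eye. The measurements and the function name `run_plot` are made up for illustration, and this is only a visual aid, not a formal test of statistical control.

```python
# Hypothetical process measurements, recorded in time order.
measurements = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 12.9, 10.0, 9.7, 10.1]

def run_plot(values, width=40):
    """Return one text line per time point, with a '*' positioned by value."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero if all values are equal
    lines = []
    for t, value in enumerate(values, start=1):
        column = round((value - low) / span * (width - 1))
        lines.append(f"t={t:2d} {value:6.1f} |" + " " * column + "*")
    return lines

for line in run_plot(measurements):
    print(line)
```

In this made-up series the seventh reading sits far to the right of the others, which is the kind of unusual variation a run plot is meant to expose.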


1.4 Ratio, Interval, Ordinal, and Nominative Scales of Measurement


A variable is quantitative if its values represent quantities. These have a fixed unit of measurement. There are two types of quantitative variables: ratio variables, which are measured on a scale with a meaningful zero so that ratios of values make sense, and interval variables, for which the zero point is arbitrary. Temperature is the classic example of an interval variable (0 degrees Fahrenheit is cold, but it does not mean there is no heat).


Earlier we talked about qualitative variables, which break down into ordinal and nominative. Ordinal means that the values have a meaningful ranking; they can be numerical, non-numerical, or a mixture of the two.


A nominative variable is a qualitative variable for which there is no meaningful ranking, for example a person's sex.
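
One way to see the difference between an interval scale and a ratio scale is to change units and check whether a ratio of two values survives. The sketch below does this for temperature (interval) and weight (ratio). The function names and the example values are assumptions for illustration, and the kilogram-to-pound factor is approximate.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def kg_to_pounds(kg):
    return kg * 2.20462  # approximate conversion factor

# Interval scale: the ratio of two temperatures depends on the unit,
# because the zero point is arbitrary.
ratio_celsius = 20 / 10                     # 2.0
ratio_fahrenheit = (celsius_to_fahrenheit(20)
                    / celsius_to_fahrenheit(10))  # 68 / 50 = 1.36

# Ratio scale: zero means "none", so the ratio is the same in any unit.
ratio_kg = 20 / 10
ratio_pounds = kg_to_pounds(20) / kg_to_pounds(10)

print("temperature ratio:", ratio_celsius, "vs", ratio_fahrenheit)
print("weight ratio:     ", ratio_kg, "vs", round(ratio_pounds, 6))
```

Saying that 20 degrees is "twice as hot" as 10 degrees is therefore not meaningful, while saying that 20 kg is twice as heavy as 10 kg is.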


1.5 An Introduction to Survey Sampling


There are several types of sampling design beyond a simple random sample. The first is a stratified random sample. Here, we divide the population into non-overlapping groups of similar units. These are called strata. A random sample is then selected from each stratum. This is done when the population falls into different strata based on a difference between units, such as geographic location or sex. Sometimes we sample in stages, first selecting groups (clusters) of units and then sampling within the selected groups; this is called multistage cluster sampling. It can be a good idea to combine stratification with clustering.
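
A stratified random sample can be sketched as follows. The population of 100 units in three regions, the 10% sampling fraction, and the function name `stratified_sample` are all hypothetical; sampling the same fraction from every stratum is one common choice, not the only one.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: (unit id, region) pairs; region defines the strata.
population = (
    [(f"N{i}", "north") for i in range(60)]
    + [(f"S{i}", "south") for i in range(30)]
    + [(f"W{i}", "west") for i in range(10)]
)

def stratified_sample(units, fraction):
    """Draw a random sample of the same fraction from each stratum."""
    # Step 1: divide the population into non-overlapping strata.
    strata = {}
    for unit_id, stratum in units:
        strata.setdefault(stratum, []).append(unit_id)
    # Step 2: select a random sample (without replacement) within each stratum.
    chosen = {}
    for stratum, members in strata.items():
        k = max(1, round(len(members) * fraction))
        chosen[stratum] = random.sample(members, k)
    return chosen

sample = stratified_sample(population, fraction=0.1)
for stratum, unit_ids in sample.items():
    print(stratum, unit_ids)
```

Every region is guaranteed to appear in the sample, which a simple random sample of 10 units from the whole population would not guarantee for the small "west" group.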


When dealing with a sample, even a random one, the results are not necessarily accurate, as some parts of the population can be excluded from the sampling process. This is called undercoverage. An example would be that low-income or young populations may not have phones. Using a phone book to draw a random sample will exclude them and skew the results.
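
The effect of undercoverage can be shown with a toy calculation. In the sketch below, the incomes, the group sizes, and the "listed phone" flag are invented for illustration; the point is only that a frame which leaves out part of the population gives a skewed answer even if every unit in the frame is measured.

```python
import statistics

# Hypothetical population: (annual income in $1000s, listed in the phone book?).
population = (
    [(20, False)] * 30   # low-income units missing from the phone book
    + [(20, True)] * 10
    + [(60, True)] * 60
)

# Mean income over the whole population.
true_mean = statistics.mean(income for income, _ in population)

# The frame covers only the units listed in the phone book.
frame = [income for income, listed in population if listed]
frame_mean = statistics.mean(frame)

print(f"population mean income: {true_mean:.2f}")   # 44.00
print(f"phone book frame mean:  {frame_mean:.2f}")  # 54.29
```

Any random sample drawn from this frame estimates the frame mean, not the population mean, so taking a larger sample does not remove the skew.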


Nonresponse can also be a problem. This is when a group cannot be contacted or refuses to participate. It can also happen when people are asked questions that could embarrass them. Lastly, we have response bias, which happens when the wording of a question or of the response choices influences the answer received.