# Final Exam - Flashcards

## Flashcard Deck Information

Class: HP 3302 - BIOSTATISTICS
Subject: Health Professions
University: Texas State University - San Marcos
Term: Spring 2010
Generated by Koofers.com

## List View: Terms & Definitions

Front: Back
Quantitative Research: Uses specific methods to advance the science base of the discipline by studying phenomena relevant to the goals of that discipline. Includes experiments, surveys, correlational studies of various types, and some commonly encountered procedures such as meta-analysis and psychometric evaluations.
Population: The larger group of patients about which the researcher wants to draw conclusions.
Parameter: A value that describes a characteristic of the population.
Sample: The group of patients the researcher actually studies.
Statistic: A value that describes a characteristic of the sample.
Descriptive Statistics: Used to describe or characterize data by summarizing them into more understandable terms without losing or distorting much of the information. Summary tables, charts, frequencies, percentages, and measures of central tendency are the most common.
Inferential Statistics: A set of statistical techniques that provide predictions about population characteristics based on information in a sample from that population.
Variables: Characteristics being measured that vary among the persons, events, or objects being studied.
Nominal Scales: The lowest form of measurement. Allow the researcher to assign numbers that classify characteristics of people, objects, or events into categories.
Ex: Gender: 0 = Female, 1 = Male. Adherence to scheduled appointment: 0 = Canceled, 1 = Kept appointment.
Ordinal Scales: Characteristics are placed in categories, and the categories are ordered in some meaningful way. They can be ranked from high to low, but the difference between the categories is unknown.
Ex: Pain intensity: 0 = no pain, 1 = little pain, 2 = moderate pain, 3 = severe pain.
Interval Scales: The distances between the ordered category values are equal because there is some accepted physical unit of measurement.
Ex: Fahrenheit scale of temperature.
Ratio Scales: The most precise level of measurement. Consist of meaningfully ordered characteristics with equal intervals between them and a zero point that is not arbitrary but is determined by nature.
Ex: Blood pressure, pulse rate, and weight.
Atomicity Principle: You cannot analyze below the data level that you observe.
Data Control Principle: Take control of the structure and flow of your data. You should take responsibility for developing and monitoring the procedure for the layout of each respondent's data record.
Data Efficiency Principle: Be efficient in getting your data into a computer, but not at the cost of losing crucial information. For example, do not hand-total respondents' scores on a 10-item self-esteem scale and then enter only the total score as the measure of self-esteem in each respondent's electronic data record.
Data Manipulation Principle: Let the computer do as much work as possible. Instruct it to do tasks such as recoding, variable computation, dataset catenation, dataset subsetting, data merging, and similar tasks that would waste your time.
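
The Data Efficiency and Data Manipulation principles can be illustrated with a minimal Python sketch: the item-level responses are entered as observed, and the computer does the recoding and totaling. The response values, the 1-to-4 scale, and the choice of reverse-scored items are all hypothetical.

```python
# Hypothetical item-level responses for one respondent on a 10-item scale,
# each scored 1-4. The items are entered, not a hand-computed total.
responses = [3, 4, 2, 4, 1, 3, 2, 4, 3, 1]

# Assumption: items 3, 5, and 8 (1-based) are reverse-worded and need recoding.
REVERSED_ITEMS = {3, 5, 8}
SCALE_MIN, SCALE_MAX = 1, 4


def recode(item_number, score):
    """Reverse-score the item if it is reverse-worded; otherwise keep it."""
    if item_number in REVERSED_ITEMS:
        return SCALE_MIN + SCALE_MAX - score
    return score


# Recoding and variable computation are left to the computer.
recoded = [recode(i, s) for i, s in enumerate(responses, start=1)]
total_score = sum(recoded)
print(recoded, total_score)
```

Because the individual items are kept, the total can be recomputed at any time if a scoring rule changes.
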
Kludge Principle: Sometimes the best way to manipulate data is not elegant and seems to waste computer resources. A kludge is an awkward or clumsy patching together of a series of computer commands to make the data do what you want. A kludge is sometimes justifiable; the end CAN justify the means.
Impossibility/Implausibility Principle: Use the computer to check for impossible and implausible data. This should be done routinely by computing frequencies and measures of central tendency on all study variables and examining them for mistakes and/or bugs. If errors are found, correct them immediately and then save the dataset.
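
One way to apply the Impossibility/Implausibility Principle, sketched in Python. The records, variable names, valid codes, and the plausible pulse range are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical records; variable names and valid ranges are assumptions.
records = [
    {"id": 1, "gender": 0, "pulse": 72},
    {"id": 2, "gender": 1, "pulse": 68},
    {"id": 3, "gender": 3, "pulse": 75},   # impossible: gender is coded 0/1
    {"id": 4, "gender": 1, "pulse": 720},  # implausible: likely a keying error
]

VALID_GENDER = {0, 1}
PULSE_RANGE = (30, 220)  # assumed plausible range, beats per minute

# A frequency count exposes codes that should not exist.
gender_freq = Counter(r["gender"] for r in records)

# Flag each record that fails a check so it can be corrected at the source.
flagged = [
    r["id"]
    for r in records
    if r["gender"] not in VALID_GENDER
    or not PULSE_RANGE[0] <= r["pulse"] <= PULSE_RANGE[1]
]
print(gender_freq, flagged)
```
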
Burstein's Data Sensibility Principle: Run your data all the way through to the final computer analysis and ask yourself whether the results make sense. Be prepared to decide that they do not, and hence be prepared to treat the analysis not as final but as another debugging step.
Error Typology Principle: Debugging includes detection and correction of errors. To ease correction, try to classify each error as you uncover it.
Tables: Condense data into a form that can make them easier to understand, and show many details in summary fashion. A disadvantage is that, because the reader sees only numbers, a table may not be readily understood without comparing it with other tables.
Charts: Speak directly to the reader. Despite their lack of exact details, charts are very effective in giving the reader a picture of differences and patterns in a set of data.
Statistical Table: Data organized into values or categories and then described with titles and captions. A researcher begins constructing one by tabulating data into a frequency distribution, that is, by counting how often each value or category occurs in a variable or set of variables. Includes frequency, percent, valid percent, and cumulative percent.
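
A frequency distribution with the four columns named on this card can be sketched in Python as follows. The pain-intensity codes are hypothetical, and basing the cumulative percent on valid (non-missing) cases is an assumption about how the table is laid out.

```python
from collections import Counter

# Hypothetical pain-intensity codes (0-3); None marks a missing response.
pain = [0, 1, 1, 2, 3, 2, 1, None, 0, 1]

counts = Counter(v for v in pain if v is not None)
n_total = len(pain)             # all cases, including missing
n_valid = sum(counts.values())  # non-missing cases only

table = []
cumulative = 0.0
for value in sorted(counts):
    freq = counts[value]
    percent = 100 * freq / n_total        # base: all cases
    valid_percent = 100 * freq / n_valid  # base: non-missing cases
    cumulative += valid_percent           # assumption: cumulates valid percent
    table.append(
        (value, freq, round(percent, 1), round(valid_percent, 1), round(cumulative, 1))
    )

# Columns: value, frequency, percent, valid percent, cumulative percent
for row in table:
    print(row)
```
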
Working Table: For interval or ratio variables. If the difference between the maximum and the minimum value exceeds 15, the researcher may want to group the data into classes or categories before forming the final table.
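
The grouping rule for a working table might look like this in Python. The ages and the class width of 10 are hypothetical choices, not values from the card.

```python
# Hypothetical ages; the class width of 10 is an arbitrary choice.
ages = [21, 25, 34, 38, 42, 47, 51, 55, 63, 68]
CLASS_WIDTH = 10

value_range = max(ages) - min(ages)

grouped = {}
if value_range > 15:  # rule of thumb stated on the Working Table card
    for age in ages:
        lower = (age // CLASS_WIDTH) * CLASS_WIDTH
        label = f"{lower}-{lower + CLASS_WIDTH - 1}"
        grouped[label] = grouped.get(label, 0) + 1

print(value_range, grouped)
```
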
Considerations when drawing a chart: Data structure, variable type, and level of measurement. Do the data represent one point in time (cross-sectional data) or several points in time (time series data)? What type of variable do we wish to illustrate: qualitative or quantitative? What level of measurement is the variable of interest?
Bar Chart: Used for nominal or ordinal data. Category labels are usually listed horizontally in some systematic order, and then vertical bars are drawn to represent the frequency or percentage in each category.
Pie Chart: A circle that has been partitioned into percentage distributions of qualitative variables. Read the pie chart by beginning at the 12 o'clock position and proceeding clockwise. Use no more than 6 sectors. Use the percentage corresponding to each category rather than the absolute frequency of each category.
Histogram: Appropriate for interval, ratio, and sometimes ordinal variables. Histograms are similar to bar charts, except that the bars are placed side by side. The bar length represents the number of cases (frequency) falling within each interval. Histograms can also be used to represent percentages.
Polygon: A chart for interval or ratio variables that is equivalent to the histogram but appears smoother. It is constructed by joining the midpoints of the tops of the histogram bars and then closing the polygon at both ends by extending lines to imaginary midpoints at the left and right of the histogram.
Measures of Central Tendency: Mean, median, and mode. They describe where the values of a variable's distribution cluster.
Mean: Add up all the values in a distribution and divide by the number of values. The sum of the deviations of the values from the mean always equals zero. Intended for interval or ratio variables, whose values can be added, but many times it is also sensible for ordinal variables.
Sum of Squares: The sum of the squared deviations from the mean, $\sum (X - M)^2$. Around the mean it is at a minimum; that is, it is smaller than the sum of squares around any other value.
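
Both properties of the mean stated on the Mean and Sum of Squares cards can be checked numerically. This is a small Python sketch on hypothetical scores; comparing against the value 4 is an arbitrary choice of "any other value".

```python
values = [2, 4, 4, 5, 10]  # hypothetical scores

mean = sum(values) / len(values)

# Deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in values)


def sum_of_squares(data, center):
    """Sum of squared deviations of the data around an arbitrary center."""
    return sum((x - center) ** 2 for x in data)


ss_mean = sum_of_squares(values, mean)  # around the mean
ss_other = sum_of_squares(values, 4)    # around another value (here 4)
print(mean, deviation_sum, ss_mean, ss_other)
```
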
Median: The middle value of a set of ordered numbers; the point or value below which 50% of the distribution falls. Appropriate for interval or ratio data and for ordinal data, but not for nominal data.
Mode: The most frequent value or category in a distribution.
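
A short Python sketch of the median and mode using the standard `statistics` module. The scores and blood-type categories are hypothetical.

```python
import statistics

scores = [2, 4, 4, 5, 10]           # hypothetical interval-level scores
blood_types = ["A", "O", "O", "B"]  # hypothetical nominal categories

median_score = statistics.median(scores)  # middle value of the ordered scores
mode_score = statistics.mode(scores)      # most frequent value

# The mode is also defined for nominal categories; the median is not.
mode_type = statistics.mode(blood_types)
print(median_score, mode_score, mode_type)
```
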
Standard Deviation: The most widely used measure of variability. $SD = \sqrt{\sum (X - M)^2 / (n - 1)}$. Sensitive to extreme values.
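
The standard deviation formula, with its n - 1 denominator, can be written out step by step in Python. The data are hypothetical; the second list replaces one score with an extreme value to show the sensitivity mentioned on the card.

```python
import math
import statistics


def sample_sd(data):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    return math.sqrt(ss / (n - 1))


values = [2, 4, 4, 5, 10]         # hypothetical scores
with_extreme = [2, 4, 4, 5, 100]  # same scores with one extreme value

sd = sample_sd(values)
sd_extreme = sample_sd(with_extreme)

# statistics.stdev uses the same n - 1 denominator.
library_sd = statistics.stdev(values)
print(sd, library_sd, sd_extreme)
```
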
Variability: If scores in a distribution are similar, they are homogeneous and have low variability. If scores are not similar, they are heterogeneous and have high variability.
Coefficient of Variation: A useful statistic for comparing the SD between several investigations examining the same variable. $CV = 100\,(SD / M)$.
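
A minimal Python sketch of the coefficient of variation. The two hypothetical studies report the same weights in different units, so their SDs differ while the CV is the same; the data are invented for illustration.

```python
import statistics

# Hypothetical weights for the same variable, reported in kilograms and pounds.
study_a_kg = [60, 65, 70, 75, 80]
study_b_lb = [132, 143, 154, 165, 176]


def coefficient_of_variation(data):
    """CV = 100 * (SD / M), using the sample SD."""
    return 100 * statistics.stdev(data) / statistics.mean(data)


cv_a = coefficient_of_variation(study_a_kg)
cv_b = coefficient_of_variation(study_b_lb)
print(round(cv_a, 2), round(cv_b, 2))
```
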
Range: The simplest measure of variability; the difference between the maximum value of the distribution and the minimum value.
Interquartile Range (IQR): The range of values extending from the 25th percentile to the 75th percentile. Not sensitive to extreme values.
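
The contrast between the range and the IQR can be sketched in Python. The data are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`; other packages use slightly different percentile conventions, so treat that choice as an assumption.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]         # hypothetical scores
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # one extreme value added


def spread(data):
    """Return (range, IQR) for the data."""
    value_range = max(data) - min(data)
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return value_range, q3 - q1


range_a, iqr_a = spread(values)
range_b, iqr_b = spread(with_extreme)
print(range_a, iqr_a, range_b, iqr_b)
```

The extreme value changes the range but leaves the IQR untouched.
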
Pearson's Skewness Coefficient: $\text{Skewness} = (\text{mean} - \text{median}) / SD$. Skewness values fall between -1 and +1 SD units. Skewness values above 0.2 or below -0.2 indicate severe skewness.
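
Pearson's coefficient takes only a few lines of Python. The scores are hypothetical, and the 0.2 cutoff is the one given on the card.

```python
import statistics

values = [2, 4, 4, 5, 10]  # hypothetical scores with a long right tail

mean = statistics.mean(values)
median = statistics.median(values)
sd = statistics.stdev(values)

pearson_skewness = (mean - median) / sd
severely_skewed = abs(pearson_skewness) > 0.2  # cutoff from the card
print(round(pearson_skewness, 3), severely_skewed)
```
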
Fisher's Measure of Skewness: Based on deviations from the mean raised to the third power. A z-score is calculated by dividing the measure of skewness by the standard error for skewness. Values above +1.96 or below -1.96 are significant at the .05 level because 95% of the scores in a normal distribution fall between +1.96 and -1.96 SD from the mean.
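
The card gives no formulas for Fisher's skewness or its standard error, so the Python sketch below assumes the bias-adjusted third-moment formula and the standard-error formula commonly reported by statistical packages. Treat both as assumptions; the data are hypothetical.

```python
import math
import statistics

values = [2, 4, 4, 5, 10]  # hypothetical scores
n = len(values)
mean = statistics.mean(values)
sd = statistics.stdev(values)

# Assumed formula: bias-adjusted skewness based on cubed standardized deviations.
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)

# Assumed formula for the standard error of skewness.
se_skewness = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

# z-score as described on the card: skewness divided by its standard error.
z = skewness / se_skewness
significant = abs(z) > 1.96  # two-tailed, .05 level
print(round(skewness, 4), round(se_skewness, 4), round(z, 2), significant)
```
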
Fisher's Measure of Kurtosis: Indicates whether a distribution has the right bell shape for a normal curve; it measures whether the bell shape is too flat or too peaked. If the kurtosis value is a large positive number, the distribution is too peaked to be normal (leptokurtic). If the kurtosis value is negative, the curve is too flat to be normal (platykurtic).
Line Charts: Frequently used to display longitudinal trends. Time points in equal intervals are placed on the horizontal axis and the scale for the statistic on the vertical axis. Dots representing the statistic at each time point are then connected.
Box Plots: Also called box-and-whiskers plots. A graphic display that uses descriptive statistics based on percentiles. Displays the median, the IQR, and the smallest and largest values for a group.
Outliers: Values that are extreme relative to the bulk of scores in the distribution. Values that are more than 3 IQRs from the upper or lower edges of the box are extreme outliers. Values between 1.5 and 3 IQRs from the upper or lower edges of the box are minor outliers.
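
The outlier rules on this card can be sketched as a small Python classifier. The data are hypothetical, the box edges are taken to be the 25th and 75th percentiles, and the "inclusive" quartile method is an assumption (packages differ on percentile conventions).

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 25]  # hypothetical scores

# Box edges: 25th and 75th percentiles (assumed "inclusive" method).
q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1


def classify(x):
    """Label a value as 'extreme', 'minor', or 'none' by its distance from the box."""
    distance = max(q1 - x, x - q3, 0)  # 0 when x lies inside the box
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "minor"
    return "none"


labels = {x: classify(x) for x in values}
print(q1, q3, iqr, labels[18], labels[25])
```
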
Statistical Inference: Involves obtaining information from a sample of data about the population from which the sample is drawn and setting up a model to describe this population.
Random Sample: Every member of the population has the same probability (chance) of being selected in the sample. If the population is a finite one in which every person can be listed, a table of random numbers can be used to select a random sample of any size.
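
In place of a printed table of random numbers, a pseudorandom generator can draw the sample from a listed population. This Python sketch uses a hypothetical list of 100 patient identifiers; the seed is fixed only so the draw can be repeated.

```python
import random

# Hypothetical finite population in which every member can be listed.
population = [f"patient_{i:03d}" for i in range(1, 101)]

rng = random.Random(42)  # fixed seed so the draw is reproducible

# Sampling without replacement: each member has the same chance of selection.
sample = rng.sample(population, k=10)
print(sample)
```
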
Parameter Estimation: Takes two forms: point estimation and interval estimation. When an estimate of the population parameter is given as a single number, it is called a point estimate. The sample mean, median, variance, and SD are examples. Interval estimation of a parameter involves more than one point; it consists of a range of values within which the population parameter is thought to be.
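
The difference between a point estimate and an interval estimate can be sketched in Python. The blood pressure readings are hypothetical, and the interval uses a normal-approximation 95% confidence interval for the mean, which is an assumption not stated on the card; with a sample this small, a t-based interval would usually be preferred.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of systolic blood pressure readings (mmHg).
sample = [118, 122, 125, 130, 121, 127, 133, 119, 124, 128]

n = len(sample)
point_estimate = statistics.mean(sample)  # single-number estimate of the mean
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Assumption: normal-approximation 95% interval (critical z of about 1.96).
z_crit = NormalDist().inv_cdf(0.975)
interval = (
    point_estimate - z_crit * standard_error,
    point_estimate + z_crit * standard_error,
)
print(point_estimate, interval)
```
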
z-score: $z = (\text{score} - M) / SD$. In a normal distribution, about 68% of the scores fall between -1z and +1z, and about 95% of the scores fall between -2z and +2z.
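
A short Python sketch of the z-score formula, together with a check of how much of a normal curve lies within 1 and 2 z units of the mean. The score, mean, and SD are hypothetical values.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """z = (score - M) / SD."""
    return (score - mean) / sd


z = z_score(85, 70, 10)  # hypothetical score, mean, and SD

# Proportion of a standard normal distribution within +/-1 and +/-2 z units.
std_normal = NormalDist()
within_1 = std_normal.cdf(1) - std_normal.cdf(-1)
within_2 = std_normal.cdf(2) - std_normal.cdf(-2)
print(z, round(within_1, 3), round(within_2, 3))
```
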
Chi-Square: Compares the actual number (or frequency) in each group with the expected number. Used when the data are nominal (categorical). Assumptions: frequency data, adequate sample size, measures independent of each other, and a theoretical basis for the categorization of the variables. $\chi^2 = \sum (O - E)^2 / E$, where $O$ is the observed and $E$ the expected frequency.
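
The chi-square sum can be computed directly in Python. The 2 x 2 table of counts is hypothetical, and the expected frequencies are derived from the row and column totals under the usual independence assumption, which the card does not spell out.

```python
# Hypothetical 2 x 2 table of observed frequencies
# (rows: two groups; columns: appointment kept / canceled).
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Assumption: expected frequency = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum of (observed - expected)^2 / expected over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
print(expected, round(chi_square, 3))
```
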
Mann-Whitney U test: A nonparametric test used to compare two independent groups; analogous to the independent-samples t test.
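
The card does not describe how U is computed. As a sketch, the Python below uses the pair-counting definition of the statistic (one point for each pair in which the first group's score is higher, half a point for ties), which is an assumption about the intended procedure. The scores are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by counting pairwise comparisons between the groups."""
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical scores for two independent groups.
group_a = [12, 15, 18, 20]
group_b = [8, 10, 11, 16]

u_a, u_b = mann_whitney_u(group_a, group_b)
u_statistic = min(u_a, u_b)  # the smaller U is typically compared with a critical value
print(u_a, u_b, u_statistic)
```
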