Class:  HP 3302  BIOSTATISTICS 
Subject:  Health Professions 
University:  Texas State University  San Marcos 
Term:  Spring 2010 
Quantitative Research  Uses specific methods to advance the science base of a discipline by studying phenomena relevant to that discipline's goals. Includes experiments, surveys, correlational studies of various types, and some commonly encountered procedures such as meta-analysis and psychometric evaluation.
population  the larger group of patients the researcher wants to draw conclusions about 
parameter  a value that describes a characteristic of the population
sample  the group of patients the researcher actually studies
statistic  a value that describes a characteristic of the sample
descriptive statistics  used to describe or characterize data by summarizing them into more understandable terms without losing or distorting much of the information. Summary tables, charts, frequencies, percentages, and measures of central tendency are the most common 
inferential statistics  consist of a set of statistical techniques that provide predictions about population characteristics based on information in a sample from that population. 
variable  a characteristic being measured that varies among the persons, events, or objects being studied.
nominal scales  the lowest form of measurement; allows the researcher to assign numbers that classify characteristics of people, objects, or events into categories. Ex: Gender: 0 = Female, 1 = Male. Adherence to scheduled appointment: 0 = Canceled, 1 = Kept.
Ordinal Scales  characteristics are placed in categories and the categories are ordered in some meaningful way. Categories can be ranked from high to low, but the difference between the categories is unknown. Ex: Pain intensity: 0 = no pain, 1 = little pain, 2 = moderate pain, 3 = severe pain.
Interval Scales  The distances between the ordered category values are equal because there is some accepted physical unit of measurement. Ex: Fahrenheit scale of temperature
Ratio Scales  most precise level of measurement; consists of meaningfully ordered characteristics with equal intervals between them and a zero point that is not arbitrary but is determined by nature. Ex: Blood pressure, pulse rate, and weight
Atomicity Principle  you cannot analyze below the data level that you observe. 
Data Control Principle  Take control of the structure and flow of your data. You should take responsibility for developing and monitoring the procedure for the layout of each respondent's data record.
Data Efficiency Principle  Be efficient in getting your data into a computer, but not at the cost of losing crucial information. Do not hand-total respondents' scores on a 10-item self-esteem scale and then enter only the total score as the measure of self-esteem in each respondent's electronic data record.
Data Manipulation Principle  Let the computer do as much work as possible. Instruct it to do tasks such as recoding, variable computation, dataset concatenation, dataset subsetting, data merging, and similar tasks that would otherwise waste your time.
Kludge Principle  Sometimes the best way to manipulate data is not elegant and seems to waste computer resources. A kludge is an awkward or clumsy patching together of a series of computer commands to make the data do what you want. A kludge is sometimes justifiable; the end CAN justify the means.
Impossibility/Implausibility Principle  Use the computer to check for impossible and implausible data. This should be done routinely by computing frequencies and measures of central tendency on all study variables and examining them for mistakes and/or bugs. If errors are found, correct them immediately and then save the dataset.
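The impossibility/implausibility check can be sketched in a few lines of Python. This is only an illustration: the records, the variable names (`pain`, `age`), and the valid ranges are hypothetical and not taken from the course material.

```python
from collections import Counter

# Hypothetical records: pain is coded 0-3, age is in years.
records = [
    {"id": 1, "pain": 2, "age": 34},
    {"id": 2, "pain": 7, "age": 41},   # impossible: pain code outside 0-3
    {"id": 3, "pain": 0, "age": 230},  # implausible: possibly a typo for 23
    {"id": 4, "pain": 3, "age": 58},
]

# Assumed valid range for each variable (illustrative only).
valid_ranges = {"pain": (0, 3), "age": (0, 120)}

def frequency_table(rows, variable):
    """Count how often each value of a variable occurs."""
    return Counter(row[variable] for row in rows)

def out_of_range(rows, ranges):
    """Return (id, variable, value) for every value outside its valid range."""
    problems = []
    for row in rows:
        for variable, (low, high) in ranges.items():
            value = row[variable]
            if not low <= value <= high:
                problems.append((row["id"], variable, value))
    return problems

pain_counts = frequency_table(records, "pain")
flagged = out_of_range(records, valid_ranges)
print(pain_counts)
print(flagged)
```

The frequency table makes the stray code 7 visible at a glance, and the range check lists which records need correcting before the dataset is saved.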
Burstein's Data Sensibility Principle  Run your data all the way through to the final computer analysis and ask yourself whether the results make sense. Be prepared to decide that they do not, and hence, be prepared to treat the analysis not as final, but as another debugging step. 
Error Typology Principle  Debugging includes detection and correction of errors. To ease correction, try to classify each error as you uncover it. 
Tables  They condense data into a form that can make them easier to understand, and they show many details in summary fashion. A disadvantage is that, because the reader sees only numbers, a table may not be readily understood without comparing it with other tables.
Charts  Speak directly to the reader; despite their lack of exact details, charts are very effective in giving the reader a picture of differences and patterns in a set of data. 
Statistical Table  formed when data are organized into values or categories and then described with titles and captions. A researcher begins constructing one by tabulating data into a frequency distribution, that is, by counting how often each value or category occurs in a variable or set of variables. Includes frequency, percent, valid percent, and cumulative percent.
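A frequency distribution with the four columns named above can be sketched as follows. The data are hypothetical, and the sketch assumes that "valid percent" means the percentage among non-missing responses, with the cumulative percent accumulated from the valid percents.

```python
from collections import Counter

# Hypothetical pain-intensity codes; None marks a missing response.
responses = [0, 1, 1, 2, 2, 2, 3, None, 1, 2]

def frequency_distribution(values):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent)."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    n_total = len(values)   # includes missing responses
    n_valid = len(valid)    # excludes missing responses
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        freq = counts[value]
        percent = 100 * freq / n_total
        valid_percent = 100 * freq / n_valid
        cumulative += valid_percent
        rows.append((value, freq, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows

rows = frequency_distribution(responses)
for row in rows:
    print(row)
```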
Working Table  For interval or ratio variables. If the difference between the maximum and the minimum value exceeds 15, the researcher may want to group the data into classes or categories before forming the final table.
Considerations when drawing a chart  data structure, variable type, and level of measurement. Do the data represent one point in time (cross-sectional data), or do they represent several points in time (time series data)? What type of variable do we wish to illustrate, qualitative or quantitative? What level of measurement is the variable of interest?
Bar Chart  Used for nominal or ordinal data. Category labels are usually listed horizontally in some systematic order, and then vertical bars are drawn to represent the frequency or percentage in each category.
Pie Chart  A circle that has been partitioned into percentage distributions of qualitative variables. Read the pie chart by beginning at the 12 o'clock position and proceeding clockwise. Use no more than 6 sectors. Use percentages corresponding to each category rather than the absolute frequency of each category.
Histogram  appropriate for interval, ratio, and sometimes ordinal variables. Histograms are similar to bar charts, except that the bars are placed side by side with no gaps between them. The bar length represents the number of cases (frequency) falling within each interval; it can also be used to represent percentages.
Polygon  a chart for interval or ratio variables; equivalent to the histogram but appears smoother. Constructed by joining the midpoints of the top of each bar of the histogram and then closing the polygon at both ends by extending lines to imaginary midpoints at the left and right of the histogram.
Measures of Central Tendency  mean, median, and mode; they describe where the values of a variable's distribution cluster
Mean  Add up all the values in a distribution and divide by the number of values. The sum of the deviations of the values from the mean always equals zero. Intended for interval or ratio variables, where values can be added, but many times it is also sensible for ordinal variables.
Sum of Squares  The sum of squared deviations from the mean, $\sum (X - M)^2$. It is at a minimum around the mean; that is, it is smaller than the sum of squares around any other value.
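Both properties (deviations from the mean sum to zero, and the sum of squares is smallest around the mean) can be checked numerically. The scores below are made up for illustration.

```python
from statistics import mean

scores = [2, 4, 4, 5, 10]  # hypothetical scores

def sum_of_squares(values, center):
    """Sum of squared deviations of the values around a chosen center."""
    return sum((x - center) ** 2 for x in values)

m = mean(scores)                                 # 25 / 5 = 5
deviation_sum = sum(x - m for x in scores)       # deviations cancel out
ss_mean = sum_of_squares(scores, m)              # sum of squares around the mean
ss_other = {c: sum_of_squares(scores, c) for c in (3, 4, 6, 7)}

print(deviation_sum, ss_mean, ss_other)
```

Every other center tried gives a larger sum of squares than the mean does.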
Median  the middle value of a set of ordered numbers; the point or value below which 50% of the distribution falls. Appropriate for interval or ratio data and for ordinal data, but not for nominal data.
Mode  the most frequent value or category in a distribution 
Standard Deviation  most widely used measure of variability. $SD = \sqrt{\sum (X - M)^2 / (n - 1)}$. Sensitive to extreme values.
variability  if scores in a distribution are similar, they are homogeneous and have low variability. If scores are not similar, they are heterogeneous and have high variability.
Coefficient of Variation  useful statistic for comparing SDs between several investigations examining the same variable. $CV = 100 \times (SD / M)$
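A short sketch of the SD and CV formulas, using made-up scores. The hand computation is compared with `statistics.stdev`, which also divides by n - 1.

```python
import math
from statistics import mean, stdev

scores = [2, 4, 4, 5, 10]  # hypothetical scores

m = mean(scores)                                  # M = 5
ss = sum((x - m) ** 2 for x in scores)            # sum of (X - M)^2 = 36
sd = math.sqrt(ss / (len(scores) - 1))            # sqrt(36 / 4) = 3.0
cv = 100 * sd / m                                 # 100 * (3 / 5) = 60.0

print(sd, stdev(scores), cv)
```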
Range  the simplest measure of variability. the difference between the maximum value of the distribution and the minimum value. 
Interquartile Range (IQR)  the range of values extending from the 25th percentile to the 75th percentile. not sensitive to extreme values 
Pearson's Skewness Coefficient  $\text{skewness} = (\text{mean} - \text{median}) / SD$. Skewness values fall between -1 and +1 SD units. Values above +0.2 or below -0.2 indicate severe skewness.
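As a rough sketch, the coefficient can be computed directly from the mean, median, and SD. The scores are hypothetical, and the 0.2 cutoff is the rule of thumb from the card above.

```python
from statistics import mean, median, stdev

scores = [2, 4, 4, 5, 10]  # hypothetical scores with one high value

def pearson_skewness(values):
    """(mean - median) / SD, using the n - 1 standard deviation."""
    return (mean(values) - median(values)) / stdev(values)

skew = pearson_skewness(scores)        # (5 - 4) / 3
severely_skewed = abs(skew) > 0.2      # rule-of-thumb cutoff from the card

print(round(skew, 3), severely_skewed)
```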
Fisher's Measure of Skewness  Based on deviations from the mean to the third power. A z-score is calculated by dividing the measure of skewness by the standard error for skewness. Values above +1.96 or below -1.96 are significant at the .05 level because 95% of the scores in a normal distribution fall between -1.96 and +1.96 SD from the mean.
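The z-test for skewness might be sketched as below. This assumes the adjusted sample skewness and the standard-error formula that statistical packages commonly report; the card itself does not spell out either formula, and the data are invented.

```python
import math

def fisher_skewness(values):
    """Adjusted sample skewness built from third-power deviations from the mean."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

def skewness_standard_error(n):
    """Standard error of the adjusted skewness (assumed formula, see lead-in)."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def skewness_z(values):
    """Skewness divided by its standard error."""
    return fisher_skewness(values) / skewness_standard_error(len(values))

symmetric = [1, 2, 3, 4, 5]                  # hypothetical, perfectly symmetric
right_skewed = [1, 1, 2, 2, 3, 3, 4, 20]     # hypothetical, long right tail

z_symmetric = skewness_z(symmetric)
z_skewed = skewness_z(right_skewed)
print(z_symmetric, z_skewed, abs(z_skewed) > 1.96)
```

The symmetric sample gives a z of zero, while the sample with one very large value exceeds the 1.96 cutoff.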
Fisher's Measure of Kurtosis  indicates whether a distribution has the right bell shape for a normal curve; measures whether the bell shape is too flat or too peaked. If the kurtosis value is a large positive number, the distribution is too peaked to be normal (leptokurtic). If the kurtosis value is negative, the curve is too flat to be normal (platykurtic).
Line Charts  Frequently used to display longitudinal trends. Time points in equal intervals are placed on the horizontal axis and the scale for the statistic on the vertical axis. Dots representing the statistic at each time point are then connected. 
Box Plots  also called a box-and-whiskers plot. A graphic display that uses descriptive statistics based on percentiles. Displays the median, the IQR, and the smallest and largest values for a group.
Outliers  values that are extreme relative to the bulk of scores in the distribution. Values that are more than 3 IQRs from the upper or lower edges of the box are extreme outliers. Values between 1.5 and 3 IQRs from the upper or lower edges of the box are minor outliers.
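The outlier rule can be sketched with the standard library. Quartile conventions differ between packages; this sketch assumes the "inclusive" method of `statistics.quantiles`, and the data are hypothetical.

```python
from statistics import quantiles

data = [10, 12, 13, 14, 15, 16, 17, 24, 40]  # hypothetical scores

def classify_outliers(values):
    """Split values into minor and extreme outliers using the box edges and IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    minor, extreme = [], []
    for x in values:
        # Distance beyond the nearest edge of the box (0 if inside the box).
        distance = q1 - x if x < q1 else x - q3 if x > q3 else 0
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            minor.append(x)
    return iqr, minor, extreme

iqr, minor, extreme = classify_outliers(data)
print(iqr, minor, extreme)
```

Here the box runs from 13 to 17, so the IQR is 4; the value 24 lies between 1.5 and 3 IQRs above the box, and 40 lies more than 3 IQRs above it.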
Statistical Inference  involves obtaining information from a sample of data about the population from which the sample is drawn and setting up a model to describe this population. 
Random Sample  every member of the population has the same probability (chance) of being selected in the sample. If the population is a finite one in which every person in the population can be listed, a table of random numbers can then be used to select a random sample of any size. 
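In practice a pseudo-random number generator usually stands in for the printed table of random numbers. A minimal sketch, assuming a hypothetical listed population of 100 patient IDs and a fixed seed chosen only to make the run repeatable:

```python
import random

# Hypothetical finite population: every patient can be listed by ID.
population = list(range(1, 101))

rng = random.Random(42)                  # seed fixed only for reproducibility
sample = rng.sample(population, 10)      # simple random sample without replacement

print(sorted(sample))
```

`sample` draws without replacement, so no patient can appear twice and every patient has the same chance of being selected.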
Parameter Estimation  takes two forms: point estimation and interval estimation. When an estimate of the population parameter is given as a single number, it is called a point estimate. The sample mean, median, variance, and SD are examples. Interval Estimation of a parameter involves more than one point; it consists of a range of values within which the population parameter is thought to be. 
z-score  $z = (\text{score} - M) / SD$. In a normal distribution, about 68% of the scores fall between -1z and +1z and about 95% fall between -2z and +2z.
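A small sketch of the z-score formula, together with a check of the normal-curve areas using `statistics.NormalDist`. The mean of 100 and SD of 15 are illustrative values, not from the course.

```python
from statistics import NormalDist

def z_score(score, m, sd):
    """Distance of a score from the mean, in SD units."""
    return (score - m) / sd

z = z_score(130, 100, 15)   # hypothetical score, mean, and SD

standard_normal = NormalDist()  # mean 0, SD 1
within_1 = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(z, round(within_1, 4), round(within_2, 4))
```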
Chi-Square  compares the actual number (or frequency) in each group with the expected number. Used when the data are nominal (categorical). Assumptions: frequency data, adequate sample size, measures independent of each other, and a theoretical basis for the categorization of the variables. $\chi^2 = \sum (O - E)^2 / E$, where $O$ is the observed frequency and $E$ is the expected frequency.
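The statistic can be computed by hand for a small table. This sketch assumes a test of independence, so expected counts come from the row and column totals; the 2 x 2 counts are invented.

```python
# Hypothetical 2 x 2 table: rows are groups, columns are outcomes.
observed = [[20, 10],
            [10, 20]]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells, plus degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # expected under independence
            statistic += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

statistic, df = chi_square(observed)
print(round(statistic, 3), df)
```

Each expected count here is 15, so every cell contributes 25/15 and the statistic is about 6.67 with 1 degree of freedom, above the .05 critical value of 3.84.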
Mann-Whitney U test  nonparametric test used to compare two independent groups; analogous to the t test.
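One way to see what U measures is to count, over all pairs, how often a score from one group exceeds a score from the other. This is a sketch of the statistic only (no p-value), with hypothetical scores and ties counted as one half.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U for group A, U for group B, smaller U) by pairwise comparison."""
    wins_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins_a += 1.0
            elif a == b:
                wins_a += 0.5   # ties count half for each group
    u_a = wins_a
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b, min(u_a, u_b)

group_a = [3, 5, 8]       # hypothetical scores
group_b = [4, 6, 7, 9]    # hypothetical scores

u_a, u_b, u = mann_whitney_u(group_a, group_b)
print(u_a, u_b, u)
```

The two U values always add up to the number of pairs (here 3 x 4 = 12), and the smaller one is the value usually compared with a table of critical values.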
© Copyright 2020 , Koofers, Inc. All rights reserved.