Class:  PSYCH 225  Experimental Psychology 
Subject:  PSYCHOLOGY 
University:  University of Wisconsin - Madison 
Term:  Spring 2011 
Conflicts re the nature of science  media: “commercialization” of science, heuristics used in advertising, using and abusing science; courts: expert testimony (delivered with a twist), role of eyewitnesses; government and authorities: global warming, the politicizing of science 
Science as attitudes  an approach/orientation; various “ways of knowing” 
Empiricism  emphasizes the role of experience and evidence, especially sensory perception, in the formation of ideas, in contrast to the notion of innate ideas or tradition 
Limits of empiricism  What if observations are distorted? (We can’t directly observe thinking, for example.) Accuracy? Selectivity? Bias? Can/should science focus solely on observables? What about traits, emotions, the unconscious? Perhaps indirect observation (e.g., obs priming)? Alternatives? Triangulation (using multiple assessments). 
Scientific control  control groups, holding things constant, balancing order effects, laboratory work, manipulations. The intent is to rule out alternative influences, but complete control is difficult or impossible to implement: there is always some error, and we can’t control everything. 
Precision  repeatable and reliable; getting the same measurement each time. 
Accurate  capable of providing a correct reading or measurement 
Operational definition  1.) The precise specification of the procedures used in our experimentation, e.g., how you are measuring something. For example, shyness: amount of eye contact, number of words muttered, number of children they associate with. 2.) The specific, observable, concrete steps (the recipes) that are involved in measuring or manipulating the concepts being studied. 
Limits of Operational Definitions  the construct may not match up with the operational definition. For instance, is “eye contact” the same as “shyness”? 
Honesty/truthfulness  falsification of data/misrepresentation; one of the cardinal sins of science. 
Critical/skeptical  information must be disconfirmable; example, Facilitated Communication for individuals with autism; there wasn’t enough testing and skepticism. 
Curiosity/openness  openness for new ideas; potential conflict with skepticism 
Serendipity  an aptitude for making desirable discoveries by accident; luck. 
Prevailing theory  might constrain openness; a theory that is widely accepted 
Parsimony  the use of the simplest or most frugal route of explanation available. 
Abstractness  a focus on variables, and not on particular situations, instances, or exemplars; a search for generalities. Common misunderstanding: science is often not aimed at understanding particulars. Role of theory. 
Determinism  finding the cause that initiates an event; science searches for orderly causes, for predictability 
Neutrality/objectivity  sponsorship or vested interests may skew this; desire to minimize bias 
“Publicness”  releasing findings to the public for scrutiny (public scrutiny) 
Peer review  criticism by other scientists 
Public ownership  patenting, private investments in science, university research parks 
Cumulative/sequential  examination of long-term trends, impact of the literature/theory; self-correcting; meta-analyses 
Rational/logical  reason behind an argument; scientific arguments are often strengthened if interwoven with another theory, other rational explanations; can be difficult to evaluate – alternatives often seem logical as well. 
Testable  theory that can be applied; “show me” attitude 
Basic science  broad research on a topic 
Applied science  very narrow, specific research on a topic; e.g., curing AIDS or cancer 
Primary sources  source(s) closest to the information being studied: research journals (thousands of APA journals), conference reports/presentations/poster sessions, review articles, books, handbooks, meta-analyses (time is an important factor: how up-to-date is the research?) 
Review article  academic publishing: Psychological Bulletin, Psychological Review, Annual Review of Psychology 
Meta-analysis  statistical combination of a set of related studies; compilation of data and its analysis 
Box score tactic  count of “successes”, each success of the brand of the experiment 
Effect size (d, g)  magnitude of treatment effect in standard deviation units 
Cohen’s d  unweighted; an effect size used to indicate the standardized difference between two means; use d when the studies composing the meta-analysis primarily report ANOVAs and t-test comparisons between groups. 
Hedges’ g  weighted; pools the standard deviation using n – 1 for each sample instead of n, which provides a better estimate, especially with smaller sample sizes; a somewhat more accurate version of Cohen’s d 
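The difference between the two effect sizes can be sketched in a few lines. This is only an illustration that follows the distinction drawn in these cards (pooling with n versus n – 1); textbooks differ on the exact formulas, and many also apply a small-sample correction to g that is not shown here. The function names and the scores are hypothetical.

```python
from math import sqrt

def mean(scores):
    return sum(scores) / len(scores)

def pooled_sd(a, b, use_n_minus_1):
    """Pool the spread of two groups into one standard deviation."""
    # Sum of squared deviations of each score from its own group mean.
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    if use_n_minus_1:
        denom = (len(a) - 1) + (len(b) - 1)   # n - 1 per sample (g, per these cards)
    else:
        denom = len(a) + len(b)               # n per sample (d, per these cards)
    return sqrt(ss / denom)

def standardized_difference(a, b, use_n_minus_1):
    """Difference between two means, expressed in standard deviation units."""
    return (mean(a) - mean(b)) / pooled_sd(a, b, use_n_minus_1)

# Hypothetical scores for a treatment group and a control group.
treatment = [2, 4, 6]
control = [1, 3, 5]

d = standardized_difference(treatment, control, use_n_minus_1=False)
g = standardized_difference(treatment, control, use_n_minus_1=True)
print(round(d, 4), round(g, 4))
```

With samples this small the two pooling rules give noticeably different values; the gap shrinks as the sample sizes grow.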
Mean effect size  the difference between two groups, in the form of a mean 
Subgroup analysis  refers to looking for a pattern in a subset of the subjects 
File drawer problem  difficulty getting published; reasons why something may not get published: legitimate “nonsignificance”, underpowered studies, unpopular – against prevailing view 
Top-Down Theorizing  a broad theory; less emphasis on data 
Bottom-Up Theorizing  large-scale collection of data; frequently narrowly focused “mini-theories” 
Descriptive statistics  describe the main features of a collection of data quantitatively. They provide simple summaries about the sample and the measures, in contrast with inferential statistics, which try to reach conclusions that go beyond the immediate data alone 
Sample  representative subset of the population 
Population  a set of entities about which statistical inferences are to be drawn 
Statistic  measure from a sample (sample mean, sample standard deviation); denoted with English letters 
Parameter  measure from the population; denoted with Greek letters 
Standard deviation  square root of variance (+/- 3) 
Variance  The average of the squared differences from the mean. (To find ___, first calculate the mean, take each score’s difference from the mean [if person 1 had 400 and the mean is 450, the difference is -50], square each difference, then average the results.) 
Z Score  standardized score; the number of standard deviations a score falls above (plus) or below (minus) the mean 
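The variance recipe above, together with the standard deviation and z score, can be worked through as a short sketch. The three scores are hypothetical, chosen so that the mean is 450 as in the card’s example; the averaging divides by n, as the card describes.

```python
from math import sqrt

def variance(scores):
    """Average of the squared differences from the mean (divides by n)."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))

def z_score(score, scores):
    """How many standard deviations a score lies above or below the mean."""
    m = sum(scores) / len(scores)
    return (score - m) / standard_deviation(scores)

# Hypothetical scores; the mean is 450, so the score of 400 differs by -50.
scores = [400, 450, 500]
print(variance(scores))            # (2500 + 0 + 2500) / 3
print(standard_deviation(scores))
print(z_score(400, scores))        # negative: below the mean
```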
Null hypothesis  nothing happened; status quo; treatment failed 
Sampling distribution  distribution of a statistic (a measure from a sample) across repeated samples; used to judge the probability that something happened by chance 
Central Limit Theorem  states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed; specifies the nature of the sampling distribution, so we don’t have to derive the sampling distribution empirically. For the sampling distribution of the difference between means: middle: zero; variability: standard error (the sd of the sampling distribution); shape: normal (almost always). 
Standard error  the standard deviation of the sample means over all possible samples (of a given size) drawn from the population. Secondly, the standard error of the mean can refer to an estimate of that standard deviation 
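A small simulation can make the sampling distribution of the difference between means concrete. This sketch assumes both groups are drawn from the same uniform(0, 1) population (so there is no treatment effect) with 30 scores per group; the population, sample size, number of simulations, and function name are all illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_mean_differences(n, n_sims, seed=0):
    """Draw two samples of size n from one population, many times over,
    and record the difference between the two sample means each time."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_sims):
        group1 = [rng.random() for _ in range(n)]
        group2 = [rng.random() for _ in range(n)]
        diffs.append(mean(group1) - mean(group2))
    return diffs

diffs = simulate_mean_differences(n=30, n_sims=2000)

# Middle of the sampling distribution: should be close to zero.
center = mean(diffs)

# Variability: the standard error, i.e. the sd of the sampling distribution.
empirical_se = pstdev(diffs)

# For comparison: a uniform(0, 1) population has variance 1/12, so the
# standard error of the difference between two means is sqrt(2 * (1/12) / n).
theoretical_se = sqrt(2 * (1 / 12) / 30)

print(round(center, 4), round(empirical_se, 4), round(theoretical_se, 4))
```

The simulated standard error should land near the theoretical one, which is the point of the theorem: the sampling distribution does not have to be built empirically.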
Shape of sampling distribution  differences expected to occur by chance (without treatment) 
Significance  how unlikely the observed result would be if chance alone were operating; a significant result is one that is probably not due to chance alone 
p < .05  The standard level of significance used to justify a claim of a statistically significant effect 
Type I error  claiming that a treatment worked, when in fact it did not; probability set by experimenter – set probability very low (e.g. 5%); p value reports these odds (p < .05); alpha 
Type II error  failing to detect a legitimate treatment effect; a false negative 
Alpha  the probability that you will wrongly reject the null hypothesis. This is also referred to as a false positive. 
Beta  the probability that you will wrongly retain the null hypothesis (false negative). 
Power  probability that the test will reject a false null hypothesis (i.e. that it will not make a Type II error). 
Increasing Power  raising the sample size, increasing the effect size, changing the significance criterion 
Underpowered Research  research that is more likely to make a Type II error; failing to detect a legitimate treatment effect 
Influences on power  n (sample size), alpha, s (variability), tx (treatment) intensity, magnitude to be detected, directionality, independent/dependent groups. 
Directional test  one-tailed test; likewise, a non-directional test is a two-tailed test. 
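Several of the influences on power listed above can be seen in a rough calculation. This is a sketch under simplifying assumptions that are not in the original notes: two independent groups of equal size, a normal (z) approximation with known variability, and a true effect in the predicted direction. The function name and the numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power to detect a standardized effect size d
    with n_per_group subjects in each of two independent groups."""
    z = NormalDist()  # standard normal distribution
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_crit = z.inv_cdf(1 - tail_alpha)
    # How far the true effect shifts the sampling distribution, in standard errors.
    shift = d * sqrt(n_per_group / 2)
    power = 1 - z.cdf(z_crit - shift)
    if two_tailed:
        # Small chance of rejecting in the wrong tail.
        power += z.cdf(-z_crit - shift)
    return power

baseline = approx_power(d=0.5, n_per_group=64)
print(round(baseline, 3))
print(round(approx_power(d=0.5, n_per_group=128), 3))                  # larger n
print(round(approx_power(d=0.8, n_per_group=64), 3))                   # larger effect
print(round(approx_power(d=0.5, n_per_group=64, alpha=0.10), 3))       # looser alpha
print(round(approx_power(d=0.5, n_per_group=64, two_tailed=False), 3)) # directional
```

Each change (more subjects, a bigger effect, a looser alpha, a directional test) raises power relative to the baseline; beta is 1 minus the value returned.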
Independent Groups  test differences between groups 
Dependent Groups  test differences within the same group 
Homogeneity of variance  The assumption of ____ is that the variance within each of the populations is equal. This is an assumption of analysis of variance (ANOVA). ANOVA works well even when this assumption is violated except in the case where there are unequal numbers of subjects in the various groups. If the variances are not homogeneous, they are said to be heterogeneous. 
Repeated measures  refers to studies in which the same measures are collected multiple times for each subject but under different conditions. For instance, repeated measures are collected in a longitudinal study in which change over time is assessed. Other studies compare the same measure under two or more different conditions 
ANOVA  (analysis of variance); intent – comparing multiple means; partitioning of variance 
Partitioning of variance (3 models)  1.) ANOVA: Independent Groups (F = Treatment Variance / Error Variance); 2.) ANOVA: 2-Way (Independent); 3.) ANOVA: Repeated Measures (Dependent Groups) 
Treatment variance  between groups variance; not a pure measure of treatment effects, affected both by random error and treatment effects. 
Error variance  within groups variance; variability not due to the treatment, but due to random error. That is, differences within a treatment group can’t be due to the treatment because everyone in the group is getting the same treatment. 
F ratio (sampling distribution)  The ___ is used to determine whether the variances in two independent samples are equal. If the ___ is not statistically significant, you may assume there is homogeneity of variance and employ the standard t-test for the difference of means. In ANOVA: F = treatment effect (differences between means) / variability within groups (error) = Treatment Variance / Error Variance 
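For the independent-groups ANOVA, the ratio of treatment variance to error variance can be computed by hand. The sketch below uses three small hypothetical groups; the function name is illustrative, and it returns only the F ratio, not a p value.

```python
def f_ratio(groups):
    """One-way (independent groups) F = treatment variance / error variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups (treatment) sum of squares: group means vs. grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups (error) sum of squares: scores vs. their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    treatment_variance = ss_between / df_between
    error_variance = ss_within / df_within
    return treatment_variance / error_variance

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
print(f_ratio(groups))
```

Here the group means (2, 3, 7) differ far more than the scores within any group do, so the treatment variance (21) is large relative to the error variance (1).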
Main effect  The effect of an independent variable on a dependent variable, averaging across the levels of any other independent variables 
Interaction  occurs when the effect of one independent variable on the dependent variable depends on the level of another independent variable 
Subject variance  used as a measure of how far a set of numbers are spread out from each other 
Post hoc tests  usually refers to a statistical test that has been performed after an ANOVA has obtained a significant effect for a factor. Because the ANOVA says only that at least two of the groups differ from one another, ____ are performed to find out which groups differ from one another. 
Alpha inflation  increased risk of a Type I error when multiple comparisons are made 
Tukey test  a single-step multiple comparison procedure and statistical test generally used in conjunction with an ANOVA to find which means are significantly different from one another. 
Bonferroni adjustment  Divide alpha by the number of comparisons (e.g., .05 / 3) to make the threshold needed for significance stricter. SPSS does not do this. Very conservative. If there is significance here, you can find anything significant 
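The arithmetic behind alpha inflation and the Bonferroni adjustment can be sketched briefly. The calculation assumes the comparisons are independent of one another, which is a simplification; the function names are illustrative.

```python
def familywise_error(alpha, n_comparisons):
    """Chance of at least one Type I error across several independent tests."""
    return 1 - (1 - alpha) ** n_comparisons

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after the Bonferroni adjustment."""
    return alpha / n_comparisons

# Three comparisons, each tested at .05, with no adjustment.
print(round(familywise_error(0.05, 3), 4))

# The adjusted per-comparison threshold: .05 / 3.
adjusted = bonferroni_alpha(0.05, 3)
print(round(adjusted, 4))

# Familywise risk once each comparison uses the adjusted threshold.
print(round(familywise_error(adjusted, 3), 4))
```

Without adjustment the overall risk of a false positive climbs well above .05; with the adjusted threshold it stays just under .05.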
Pearson r  Best when variables are on a continuum; measure of correlation between variables X and Y, giving a value between +1 and -1 inclusive. 
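As a worked illustration, Pearson r can be computed directly from deviations around each variable’s mean. The paired scores and the function name are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation between paired scores X and Y, from -1 to +1."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical paired scores for four participants.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

r = pearson_r(x, y)
print(round(r, 2))       # strength and direction of the relationship
print(round(r ** 2, 2))  # r squared: proportion of variance accounted for
```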
Continuous data  numerical data which can hold any value. For instance, human height is continuous. There is no set of allowable heights. It can be any number. In contrast, human sex (i.e male or female) is categorical data, because there are only two possible values. 
Measures of magnitude: d, r², η²  effect size (d); r squared (r²): proportion of variance accounted for; eta squared (η²): proportion for each piece of the pie; partial eta squared (partial η²): proportion of variance accounted for once other known sources are removed 
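Eta squared can be illustrated for the simplest case, a one-way independent-groups design, where the “pie” has only two pieces (treatment and error). The groups and the function name below are hypothetical; in designs with more factors the pie has more pieces, and partial eta squared would divide by the effect plus error only.

```python
def eta_squared(groups):
    """Proportion of total variability accounted for by group membership."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total variability: every score vs. the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Treatment piece of the pie: group means vs. the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    return ss_between / ss_total

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
print(eta_squared(groups))
```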
Publication bias  the tendency of researchers, editors, and pharmaceutical companies to handle the reporting of experimental results that are positive (i.e. showing a significant finding) differently from results that are negative (i.e. supporting the null hypothesis) or inconclusive, leading to bias in the overall published literature. 
Construct validity  refers to whether a scale measures or correlates with the theorized psychological scientific construct (e.g., "fluid intelligence") that it purports to measure 
Internal validity  The degree to which a study establishes that a factor causes a difference in behavior. If a study lacks internal validity, the researcher may falsely believe that a factor causes an effect when it really doesn’t. 
External validity  The degree to which the results of a study can be generalized to other participants, settings, and times. 
Statistical validity  refers to whether a statistical study is able to draw conclusions that are in agreement with statistical and scientific laws. 
Induction  bottom-up; making specific observations and applying them to broader generalizations and theories 
Deduction  top-down; reasoning from a broad theory to a more specific one. Deductive arguments are attempts to show that a conclusion necessarily follows from a set of premises or hypotheses. 
Constructs  mental states such as love, intelligence, hunger, and aggression that cannot be directly observed or manipulated with our present technology. 
Moderators  variable that can intensify, weaken, or reverse the effects of another variable. For example, the effect of wearing perfume may be moderated by gender: if you are a woman, wearing perfume may make you more liked; if you are a man, wearing perfume may make you less liked. 
Mediators  variables inside the individual (such as thoughts, feelings, or physiological responses) that come between a stimulus and a response. In other words, the stimulus has its effect because it causes changes in the mediating variable, which, in turn, cause changes in behavior. 
Generalization  an extension of a concept to less-specific criteria. It is a foundational element of logic and human reasoning 
Testability  ability to investigate a hypothesis; vague statements might be untestable; broad statements that cannot be proven wrong may be useless 
Kuhn  Science undergoes periodic “paradigm shifts” instead of progressing in a linear and continuous way. These paradigm shifts open up new approaches to understanding that scientists would never have considered valid before. Scientists can never divorce their subjective perspective from their work; thus, our comprehension of science can never rely on full "objectivity"; we must account for subjective perspectives as well. 
Reliability  a general term, often referring to the degree to which a participant would get the same score if retested (test-retest reliability). Reliability can, however, refer to the degree to which scores are free from random error. A measure can be ___, but not valid. However, a measure cannot be valid if it is not also ___. 
Validity  a reference to whether a conclusion or claim is justified. 
Error  the difference between a measured value and the true value 
Bias  systematic errors that can push the scores in a given direction. Bias may lead to “finding” the results that the researcher wanted. 
Control tactics  removing variables that may interfere with the experiment 
© Copyright 2019 , Koofers, Inc. All rights reserved.
The information provided on this site is protected by U.S. and International copyright law, and other applicable intellectual property laws, including laws covering data access and data compilations. This information is provided exclusively for the personal and academic use of students, instructors and other university personnel. Use of this information for any commercial purpose, or by any commercial entity, is expressly prohibited. This information may not, under any circumstances, be copied, modified, reused, or incorporated into any derivative works or compilations, without the prior written approval of Koofers, Inc.