
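    Probability Theory Graduation Thesis: Foreign-Language Translation (for thesis foreign-language translation, Chinese-English parallel text)

    Date: 2021-04-13 11:20:36  Source: 写作资料库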


     Statistical hypothesis testing

     Adriana Albu, Loredana Ungureanu

     Politehnica University Timisoara, adrianaa@aut.utt.ro

     Politehnica University Timisoara, loredanau@aut.utt.ro

     Abstract In this article we present the Bayesian test within statistical hypothesis testing and introduce the theory of testing and its process. We mention some real-world applications of hypothesis testing and their importance, as well as cautions for a successful test.

     Key words Bayesian hypothesis testing; Bayesian inference; Test of significance

     Introduction

     A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first." [1]

     Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency probability, these decisions are almost always made using null-hypothesis tests. These are tests that answer the question: "Assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed?" [2] More formally, they represent answers to the question, posed before undertaking an experiment, of what outcomes of the experiment would lead to rejection of the null hypothesis for a pre-specified probability of an incorrect rejection. One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom.
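
     As a minimal sketch of the question quoted above, the following Python snippet computes an exact one-sided p-value for a coin-flip experiment. The scenario (10 flips, 9 heads, a fair-coin null hypothesis) and the function name binomial_upper_tail are illustrative assumptions, not part of the original article.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 9 heads in 10 flips; null hypothesis: the coin is fair.
p_value = binomial_upper_tail(10, 9)
print(p_value)
```

     Here "at least as extreme" is read in one direction only (9 or more heads); a two-sided reading would also count 1 or fewer heads.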

     Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach to hypothesis testing is to base rejection of the hypothesis on the posterior probability. [3] [4] Other approaches to reaching a decision based on data are available via decision theory and optimal decisions.

     The critical region of a hypothesis test is the set of all outcomes which cause the null hypothesis to be rejected in favor of the alternative hypothesis. The critical region is usually denoted by the letter C.
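
     To illustrate what basing a decision on the posterior probability can look like, here is a small sketch under strong simplifying assumptions: only two point hypotheses are compared (a fair coin versus a coin with heads probability 0.8), both get prior probability 0.5, and the data are 9 heads in 10 flips. All of these numbers are hypothetical.

```python
from math import comb


def binomial_likelihood(n, k, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, k = 10, 9                      # hypothetical data: 9 heads in 10 flips
prior_h0, prior_h1 = 0.5, 0.5     # assumed prior probabilities

like_h0 = binomial_likelihood(n, k, 0.5)   # H0: fair coin
like_h1 = binomial_likelihood(n, k, 0.8)   # H1: heads probability 0.8 (assumed)

# Bayes' rule restricted to the two hypotheses under consideration.
posterior_h0 = like_h0 * prior_h0 / (like_h0 * prior_h0 + like_h1 * prior_h1)
print(round(posterior_h0, 4))
```

     A small posterior probability for H0 would count as evidence against it; the threshold at which one rejects is a separate choice that the article does not specify.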

     One-sample tests are appropriate when a sample is being compared to the population from a hypothesis. The population characteristics are known from theory or are calculated from the population.

     Two-sample tests are appropriate for comparing two samples, typically experimental and control samples from a scientifically controlled experiment.

     Paired tests are appropriate for comparing two samples where it is impossible to control important variables. Rather than comparing two sets, members are paired between samples so the difference between the members becomes the sample. Typically the mean of the differences is then compared to zero.
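
     The pairing idea can be sketched in a few lines of Python. The measurements below are invented for illustration, and the critical value 2.776 is the usual table value for a two-sided 5% test with Student's t distribution on 4 degrees of freedom; the t distribution itself is an assumption that holds when the differences are roughly normal.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 15, 11, 12, 16]

# The differences between paired members become the sample.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

# Compare the mean difference to zero, scaled by its standard error.
t_stat = mean(differences) / (stdev(differences) / sqrt(n))

T_CRIT = 2.776  # table value: two-sided 5% level, n - 1 = 4 degrees of freedom
reject_null = abs(t_stat) > T_CRIT
print(round(t_stat, 3), reject_null)
```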

     Z-tests are appropriate for comparing means under stringent conditions regarding normality and a known standard deviation.

     T-tests are appropriate for comparing means under relaxed conditions (less is assumed).

     Tests of proportions are analogous to tests of means (the 50% proportion).

     Chi-squared tests use the same calculations and the same probability distribution for different applications:

     Chi-squared tests for variance are used to determine whether a normal population has a specified variance. The null hypothesis is that it does.

     Chi-squared tests of independence are used for deciding whether two variables are associated or are independent. The variables are categorical rather than numeric. Such a test can be used, for example, to decide whether left-handedness is correlated with libertarian politics (or not). The null hypothesis is that the variables are independent. The numbers used in the calculation are the observed and expected frequencies of occurrence (from contingency tables).
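
     The calculation from observed and expected frequencies can be sketched as follows. The 2x2 contingency table is made up, no continuity correction is applied, and the closed-form p-value used here is valid only for one degree of freedom (that is, for a 2x2 table).

```python
from math import erfc, sqrt

# Hypothetical 2x2 contingency table of observed counts
# (rows: one categorical variable, columns: the other).
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the two variables were independent.
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# With 1 degree of freedom, P(chi-squared > x) equals erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi2 / 2))
print(chi2, round(p_value, 4))
```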

     Chi-squared goodness of fit tests are used to determine the adequacy of curves fit to data. The null hypothesis is that the curve fit is adequate. It is common to determine curve shapes to minimize the mean square error, so it is appropriate that the goodness-of-fit calculation sums the squared errors.

     F-tests (analysis of variance, ANOVA) are commonly used when deciding whether groupings of data by category are meaningful. If the variance of test scores of the left-handed in a class is much smaller than the variance of the whole class, then it may be useful to study lefties as a group. The null hypothesis is that two variances are the same - so the proposed grouping is not meaningful.

     The testing process

     In the statistical literature, statistical hypothesis testing plays a fundamental role. The usual line of reasoning is as follows:

     There is an initial research hypothesis of which the truth is unknown.

     The first step is to state the relevant null and alternative hypotheses. This is important as mis-stating the hypotheses will muddy the rest of the process. Specifically, the null hypothesis allows attaching an attribute: it should be chosen in such a way that it allows us to conclude whether the alternative hypothesis can either be accepted or stays undecided as it was before the test. [9]

     The second step is to consider the statistical assumptions being made about the sample in doing the test; for example, assumptions about the statistical independence or about the form of the distributions of the observations. This is equally important as invalid assumptions will mean that the results of the test are invalid.

     Decide which test is appropriate, and state the relevant test statistic T.

     Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result. For example the test statistic may follow a Student's t distribution or a normal distribution.

     Select a significance level (α), a probability threshold below which the null hypothesis will be rejected. Common values are 5% and 1%.

     The distribution of the test statistic under the null hypothesis partitions the possible values of T into those for which the null hypothesis is rejected, the so-called critical region, and those for which it is not. The probability of the critical region is α.

     Compute from the observations the observed value t_obs of the test statistic T.

     Decide to either fail to reject the null hypothesis or reject it in favor of the alternative. The decision rule is to reject the null hypothesis H0 if the observed value t_obs is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.
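
     The steps above can be sketched end to end for one simple case, a two-sided one-sample Z-test. Everything concrete here is an assumption made for illustration: the hypothesized mean of 100, the known standard deviation of 15, the nine sample values, and the choice of a Z-test itself.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: hypotheses. H0: population mean equals mu0; H1: it does not.
mu0 = 100.0
# Step 2: assumptions. Independent, normally distributed observations
# with a known standard deviation (assumed value).
sigma = 15.0
# Steps 3-4: test statistic T = (sample mean - mu0) / (sigma / sqrt(n)),
# which follows a standard normal distribution under H0.
# Step 5: significance level.
alpha = 0.05
# Step 6: critical region C = {t : |t| > z_crit}, with probability alpha under H0.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 7: observed value of the test statistic (hypothetical sample).
sample = [112, 108, 95, 120, 104, 117, 99, 125, 110]
t_obs = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))

# Step 8: decision rule.
reject_null = abs(t_obs) > z_crit
print(round(t_obs, 3), round(z_crit, 3), reject_null)
```

     Fixing alpha and the critical region before looking at t_obs mirrors the order of the steps in the text; choosing them afterwards would undermine the stated error rate.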

     Use and Importance

     Statistics are helpful in analyzing most collections of data. This is equally true of hypothesis testing which can justify conclusions even when no scientific theory exists. Real world applications of hypothesis testing include [7]:

     Testing whether more men than women suffer from nightmares

     Establishing authorship of documents

     Evaluating the effect of the full moon on behavior

     Determining the range at which a bat can detect an insect by echo

     Deciding whether hospital carpeting results in more infections

     Selecting the best means to stop smoking

     Checking whether bumper stickers reflect car owner behavior

     Testing the claims of handwriting analysts

     Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. For example, Lehmann (1992) in a review of the fundamental paper by Neyman and Pearson (1933) says: "Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future".

     Significance testing has been the favored statistical tool in some experimental social sciences (over 90% of articles in the Journal of Applied Psychology during the early 1990s). [8] Other fields have favored the estimation of parameters. Editors often consider significance as a criterion for the publication of scientific conclusions based on experiments with statistical results.

     Cautions

     The successful hypothesis test is associated with a probability and a type-I error rate. The conclusion might be wrong.

     The conclusion of the test is only as solid as the sample upon which it is based. The design of the experiment is critical. A number of unexpected effects have been observed including:

     The Clever Hans effect. A horse appeared to be capable of doing simple arithmetic.

     The Hawthorne effect. Industrial workers were more productive in better illumination, and most productive in worse.

     The placebo effect. Pills with no medically active ingredients were remarkably effective.

     A statistical analysis of misleading data produces misleading conclusions. The issue of data quality can be more subtle. In forecasting for example, there is no agreement on a measure of forecast accuracy. In the absence of a consensus measurement, no decision based on measurements will be without controversy.

     The book How to Lie with Statistics is the most popular book on statistics ever published. [28] It does not much consider hypothesis testing, but its cautions are applicable, including: Many claims are made on the basis of samples too small to convince. If a report does not mention sample size, be doubtful.

     Hypothesis testing acts as a filter of statistical conclusions: only those results meeting a probability threshold are publishable. Economics also acts as a publication filter: only those results favorable to the author and funding source may be submitted for publication. The impact of filtering on publication is termed publication bias. A related problem is that of multiple testing (sometimes linked to data mining), in which a variety of tests for a variety of possible effects are applied to a single data set and only those yielding a significant result are reported.
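
     A short calculation shows why multiple testing is a problem. The sketch assumes m independent tests in which every null hypothesis is true, and it also evaluates the Bonferroni adjustment (testing each hypothesis at level alpha / m), a common remedy that the article itself does not discuss. The values alpha = 0.05 and m = 20 are illustrative.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false rejection among m independent
    tests, each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m


alpha, m = 0.05, 20  # illustrative values

uncorrected = familywise_error_rate(alpha, m)
bonferroni = familywise_error_rate(alpha / m, m)
print(round(uncorrected, 4), round(bonferroni, 4))
```

     Under these assumptions, reporting only the significant results out of 20 uncorrected tests would turn up at least one spurious finding more often than not.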

     Those making critical decisions based on the results of a hypothesis test are prudent to look at the details rather than the conclusion alone. In the physical sciences most results are fully accepted only when independently confirmed. The general advice concerning statistics is, "Figures never lie, but liars figure" (anonymous).

     Controversy

     Since significance tests were first popularized many objections have been voiced by prominent and respected statisticians. The volume of criticism and rebuttal has filled books with language seldom used in the scholarly debate of a dry subject. Much of the criticism was published more than 40 years ago. The fires of controversy have burned hottest in the field of experimental psychology. Nickerson surveyed the issues in the year 2000. He included 300 references and reported 20 criticisms and almost as many recommendations, alternatives and supplements. The following section greatly condenses Nickerson's discussion, omitting many issues.

     Results of the controversy

     The controversy has produced several results. The American Psychological Association has strengthened its statistical reporting requirements after review, [10] medical journal publishers have recognized the obligation to publish some results that are not statistically significant to combat publication bias, and a journal (Journal of Articles in Support of the Null Hypothesis) has been created to publish such results exclusively. Textbooks have added some cautions and increased coverage of the tools necessary to estimate the size of the sample required to produce significant results. Major organizations have not abandoned use of significance tests although they have discussed doing so.
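
     As one example of the sample-size tools mentioned above, the following sketch uses the standard textbook approximation for a two-sided one-sample Z-test. The function name and the inputs (standard deviation 15, smallest difference of interest 5, 80% power) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_sample_z(sigma, delta, alpha=0.05, power=0.80):
    """Approximate n needed to detect a mean shift of size delta with the
    given power, for a two-sided Z-test with known standard deviation sigma."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) * sigma / delta) ** 2)


n_required = sample_size_one_sample_z(sigma=15.0, delta=5.0)
print(n_required)
```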

     References

     [1] R. A. Fisher (1925). Statistical Methods for Research Workers, Edinburgh: Oliver and Boyd, 1925, p.43.

     [2] Cramer, Duncan; Dennis Howitt (2004). The Sage Dictionary of Statistics. p. 76. ISBN 0-7619-4138-X.

     [3] Schervish, M (1996) Theory of Statistics, p. 218. Springer ISBN 0-387-94546-6

     [4] Kaye, David H.; Freedman, David A. (2011). "Reference Guide on Statistics". Reference manual on scientific evidence (3rd ed.). Eagan, MN Washington, D.C: West National Academies Press. p. 259. ISBN 978-0-309-21421-6.

     [5] C. S. Peirce (August 1878). "Illustrations of the Logic of Science VI: Deduction, Induction, and Hypothesis". Popular Science Monthly 13.

     [6] Fisher, Sir Ronald A. (1956) [1935]. "Mathematics of a Lady Tasting Tea". In James Roy Newman. The World of Mathematics, volume 3 [Design of Experiments]. Courier Dover Publications. ISBN 978-0-486-41151-4.

     [7] Box, Joan Fisher (1978). R.A. Fisher, The Life of a Scientist. New York: Wiley. p. 134. ISBN 0-471-09300-9

     [8] Lehmann, E.L.; Romano, Joseph P. (2005). Testing Statistical Hypotheses (3E ed.). New York: Springer. ISBN 0-387-98864-5.

     [9] Adèr, J.H. (2008). Chapter 12: Modelling. In H.J. Adèr & G.J. Mellenbergh (Eds.) (with contributions by D.J. Hand), Advising on Research Methods: A consultant's companion (pp. 183–209). Huizen, The Netherlands: Johannes van Kessel Publishing

     [10] Triola, Mario (2001). Elementary statistics (8 ed.). Boston: Addison-Wesley. p. 388. ISBN 0-201-61477-4.

     American Journal of Mathematics, 2007, 126(5): 2387-2425

     Statistical hypothesis testing

     Adriana Albu, Loredana Ungureanu

     Politehnica University Timisoara, adrianaa@aut.utt.ro

     Politehnica University Timisoara, loredanau@aut.utt.ro

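     Abstract In this article we present the Bayesian test within statistical hypothesis testing and introduce the theory of testing and its process. We mention some real-world applications of hypothesis testing and their importance, as well as cautions for a successful test.

     Key words Bayesian hypothesis testing; Bayesian inference; test of significance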

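     Introduction

     A statistical hypothesis test is a method of making decisions using data, applicable both to controlled experiments and to uncontrolled observational studies. In statistics, a result is called statistically significant if, judged against a pre-determined threshold probability (the significance level), it is unlikely to have occurred by chance alone. The phrase "test of significance" comes from Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first."

     Hypothesis testing is sometimes also called confirmatory data analysis, in contrast to exploratory data analysis. In frequency probability, these decisions are almost always made using null-hypothesis tests.

     These tests answer the question: assuming that the null hypothesis is true, what is the probability of observing a value of the test statistic at least as extreme as the value actually observed? More generally, they answer a question posed before the experiment is carried out: which outcomes of the experiment would lead to rejection of the null hypothesis for a pre-specified probability of incorrect rejection. One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom.

     Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach to hypothesis testing bases rejection of the hypothesis on the posterior probability. Other approaches reach a decision from data via decision theory and optimal decisions. The critical region of a hypothesis test is the set of all outcomes that cause the null hypothesis to be rejected in favor of the alternative hypothesis; it is usually denoted by the letter C.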

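     Overview

     One-sample tests are appropriate when a sample is compared to the population specified by a hypothesis. The population characteristics are known from theory or are calculated from the population.

     Two-sample tests are appropriate for comparing two samples, typically the experimental and control samples from a scientifically controlled experiment. Paired tests are appropriate for comparing two samples where it is impossible to control important variables. Rather than comparing two sets, members are paired between samples so that the differences between the members become the sample. Typically the mean of the differences is then compared to zero.

     Z-tests are appropriate for comparing means under the conditions of normality and a known standard deviation.

     T-tests are appropriate for comparing means under more relaxed conditions (less is assumed).

     Tests of proportions are analogous to tests of means (the 50% proportion). Chi-squared tests use the same calculations and the same probability distribution for different applications:

     Chi-squared tests for variance are used to determine whether a normal population has a specified variance. The null hypothesis is that it does.

     Chi-squared tests of independence are used for deciding whether two variables are associated or independent. The null hypothesis is that the variables are independent. The numbers used in the calculation are the observed and expected frequencies of occurrence.

     Chi-squared goodness-of-fit tests are used to determine whether curves fit to data are adequate. The null hypothesis is that the curve fit is adequate. It is common to determine curve shapes so as to minimize the mean square error, so it is appropriate that the goodness-of-fit calculation sums the squared errors.

     F-tests (analysis of variance) are commonly used when deciding whether groupings of data by category are meaningful. If the variance of the test scores of the left-handed in a class is much smaller than the variance of the whole class, then it may be useful to study them as a group. The null hypothesis is that the two variances are the same, so the proposed grouping is not meaningful.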

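     The testing process

     Statistical hypothesis testing plays a fundamental role in statistics. The usual line of reasoning is as follows:

     There is an initial research hypothesis of which the truth is unknown.

     The first step is to state the relevant null and alternative hypotheses. Specifically, the null hypothesis allows attaching an attribute: it should be chosen in such a way that it allows us to conclude whether the alternative hypothesis can be accepted or stays undecided as it was before the test.

     The second step is to consider the statistical assumptions being made about the sample in doing the test, for example assumptions about statistical independence or about the form of the distributions of the observations. This is equally important, as invalid assumptions will mean that the results of the test are invalid.

     Decide which test is appropriate, and state the relevant test statistic T.

     Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result. The test statistic may, for example, follow a Student's t distribution or a normal distribution.

     Select a significance level (α), a probability threshold below which the null hypothesis will be rejected. Common values are 5% and 1%.

     The distribution of the test statistic under the null hypothesis partitions the possible values of T into those for which the null hypothesis is rejected, the critical region, and those for which it is not. The probability of the critical region is α.

     Compute from the observations the observed value t of the test statistic T.

     Decide whether to reject the null hypothesis in favor of the alternative. If the observed value falls in the critical region, reject the null hypothesis H0; otherwise accept it or "fail to reject" it.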

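     Use and importance

     Hypothesis testing is helpful in analyzing most collections of data. It can justify conclusions even when no scientific theory exists.

     Real-world applications of hypothesis testing include:

     Testing whether more men than women suffer from nightmares.

     Establishing authorship of documents.

     Evaluating the effect of the full moon on behavior.

     Determining the range at which a bat can detect an insect by echo.

     Deciding whether hospital carpeting results in more infections.

     Selecting the best means to stop smoking.

     Checking whether bumper stickers reflect car owner behavior.

     Testing the claims of handwriting analysts.

     Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. For example, Lehmann (1992), in a review of the fundamental paper by Neyman and Pearson (1933), says: "Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future".

     Significance testing has been the favored statistical tool in some experimental social sciences (over 90% of articles in the Journal of Applied Psychology during the early 1990s). Other fields have favored the estimation of parameters. Editors often consider significance as a criterion for publishing scientific conclusions based on experiments with statistical results.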

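     Cautions

     A successful hypothesis test is associated with a probability and a type-I error rate; the conclusion might be wrong. The conclusion of the test is only as solid as the sample on which it is based, and different samples may give different results. The design of the experiment is critical. Unexpected effects that have been observed include:

     The Clever Hans effect. A horse appeared to be capable of doing simple arithmetic.

     The Hawthorne effect. Industrial workers were more productive in better illumination, and most productive in worse.

     The placebo effect. Pills with no medically active ingredients were remarkably effective.

     A statistical analysis of misleading data produces misleading conclusions. The issue of data quality can be more subtle. In forecasting, for example, there is no agreement on a measure of forecast accuracy. In the absence of a consensus measurement, no decision based on measurements will be without controversy.

     The book How to Lie with Statistics is the most popular book on statistics ever published. It does not much consider hypothesis testing, but its cautions are applicable, including: many claims are made on the basis of samples too small to convince; if a report does not mention sample size, be doubtful.

     Hypothesis testing acts as a filter of statistical conclusions: only those results meeting a probability threshold are publishable. Economics also acts as a publication filter: only those results favorable to the author and funding source may be submitted for publication. The impact of filtering on publication is termed publication bias. A related problem is multiple testing (sometimes linked to data mining), in which a variety of tests for a variety of possible effects are applied to a single data set and only those yielding a significant result are reported.

     Those making critical decisions based on the results of a hypothesis test are prudent to look at the details rather than the conclusion alone. In the physical sciences most results are fully accepted only when independently confirmed. The general advice concerning statistics is, "Figures never lie, but liars figure."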

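     Controversy

     Since significance tests were first popularized, many objections have been voiced by prominent and respected statisticians. The volume of criticism and rebuttal has filled books with language seldom used in scholarly debate. Much of the criticism was published more than 40 years ago. The fires of controversy have burned hottest in the field of experimental psychology. Nickerson surveyed the issues in the year 2000. He included 300 references and reported 20 criticisms and almost as many recommendations, alternatives and supplements. The following section greatly condenses Nickerson's discussion, omitting many issues.

     Results of the controversy

     The controversy has produced several results. The American Psychological Association has strengthened its statistical reporting requirements after review, medical journal publishers have recognized the obligation to publish some results that are not statistically significant to combat publication bias, and a journal (Journal of Articles in Support of the Null Hypothesis) has been created to publish such results exclusively. Textbooks have added some cautions and increased coverage of the tools necessary to estimate the required sample size. Major organizations have not abandoned use of significance tests, although they have discussed doing so.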
