Laboratory Experiments in Information Retrieval

Sample Sizes, Effect Sizes, and Statistical Power

    • $39.99

Publisher Description

Covering aspects from principles and limitations of statistical significance tests to topic set size design and power analysis, this book guides readers to statistically well-designed experiments. Although classical statistical significance tests are to some extent useful in information retrieval (IR) evaluation, they can harm research unless they are used appropriately with the right sample sizes and statistical power and unless the test results are reported properly. The first half of the book is mainly targeted at undergraduate students, and the second half is suitable for graduate students and researchers who regularly conduct laboratory experiments in IR, natural language processing, recommendations, and related fields.

Chapters 1–5 review parametric significance tests for comparing system means, namely, t-tests and ANOVAs, and show how easily they can be conducted using Microsoft Excel or R. These chapters also discuss a few multiple comparison procedures for researchers who are interested in comparing every system pair, including a randomised version of Tukey's Honestly Significant Difference test. The chapters then deal with known limitations of classical significance testing and provide practical guidelines for reporting research results regarding comparison of means.

Chapters 6 and 7 discuss statistical power. Chapter 6 introduces topic set size design to enable test collection builders to determine an appropriate number of topics to create. Readers can easily use the author’s Excel tools for topic set size design based on the paired and two-sample t-tests, one-way ANOVA, and confidence intervals. Chapter 7 describes power-analysis-based methods for determining an appropriate sample size for a new experiment based on a similar experiment done in the past, detailing how to utilize the author’s R tools for power analysis and how to interpret the results. Case studies from IR for both Excel-based topic set size design and R-based power analysis are also provided.

GENRE: Computers & Internet
RELEASED: September 22, 2018
LANGUAGE: English
LENGTH: 159 pages
PUBLISHER: Springer Nature Singapore
SELLER: Springer Nature B.V.
SIZE: 27.7 MB

More Books Like This

Statistics for Data Scientists
2022
Multivariate Statistical Quality Control Using R
2012
Combining Soft Computing and Statistical Methods in Data Analysis
2010
Algorithmic Learning in a Random World
2022
Integrated Uncertainty in Knowledge Modelling and Decision Making
2022
Statistics and Data Science
2020

More Books by Tetsuya Sakai

Evaluating Information Retrieval and Access Tasks
2020
Information Retrieval Technology
2018
String Processing and Information Retrieval
2016
Information Retrieval Technology
2009
Information Retrieval Technology
2008