## Introduction

As part of an earlier project (SLP3), I formulated the null hypothesis and the alternative hypothesis with respect to the research question. Recall, in particular, that the null hypothesis postulates that travel has no effect on water consumption, while the alternative hypothesis postulates the opposite: that travel can influence (decrease or increase) the amount of water consumed. The alternative hypothesis is therefore non-directional, because the goal is not to identify a specific direction of the effect of travel on hydration but to determine whether any effect is statistically significant. Correlation analysis could be used to describe the relationship between the variables, but a correlation coefficient on its own describes the strength and direction of a relationship rather than whether mean water consumption differs between groups of travelers, so it was decided to use a parametric comparison of group means, namely a one-way ANOVA (F-test).

## Sampling Method

A probability sampling mechanism, simple random sampling, was used to generate a sample of 76 participants. This mechanism was chosen to reduce the likelihood of sampling error and to limit bias in the data (McCombes, 2019). The names of all my social media contacts were entered into MS Excel. Using =RAND(), each row was assigned a random value between 0 and 1, and the rows were then sorted in ascending order of that value. The contacts in the first 76 rows formed the sample and received a message asking them to participate in the study. Respondents were asked to specify how many times they had traveled in the past year, where travel included going out of town for personal, work, study, or other purposes. They were also asked how much clean water they drink per day on average, in liters.
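The spreadsheet procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contact names, the list size of 300, the helper name `draw_sample`, and the fixed seed are all hypothetical and are not taken from the study.

```python
import random


def draw_sample(contacts, sample_size, seed=None):
    """Mimic the spreadsheet steps: give every contact a random key,
    sort by that key in ascending order, and keep the first rows."""
    rng = random.Random(seed)
    # One random value in [0, 1) per row, analogous to =RAND() in Excel.
    keyed = [(rng.random(), name) for name in contacts]
    keyed.sort(key=lambda pair: pair[0])
    return [name for _, name in keyed[:sample_size]]


# Hypothetical contact list; the real list and its size are not given in the text.
contacts = [f"contact_{i}" for i in range(1, 301)]
sample = draw_sample(contacts, sample_size=76, seed=42)
print(len(sample))
```

Because every contact receives an independent random key, each one has the same chance of landing in the first 76 rows, which is what makes this a probability sample.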

## Test Statistics

An F-test statistic was used for the analysis, which shows whether the differences between group means are significant when comparing cohorts. Mathematically, it is defined as the ratio of the variance between groups to the variance within groups. The *p*-value is calculated from the F-statistic and helps determine whether to reject or fail to reject the null hypothesis, so the F-statistic is central to the outcome of the statistical analysis (Zach, 2021). In this paper, cohorts are formed according to the number of times respondents traveled out of town during the year. A total of four groups are represented: 0 times, 1 time, 2-4 times, and more than 5 times. The presence of four groups and a continuous dependent variable supports the choice of the ANOVA test for the analysis. Other parametric tests, including the t-test, are not suitable here because more than two groups are compared.
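For reference, the standard one-way ANOVA form of this ratio can be written as follows. The symbols are introduced here for illustration and do not appear in the original analysis: $k$ is the number of cohorts, $n_j$ the size of cohort $j$, $N$ the total number of respondents, $\bar{x}_j$ the mean of cohort $j$, and $\bar{x}$ the grand mean.

$$
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}\right)^2 / (k - 1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2 / (N - k)}
$$

With four cohorts ($k = 4$) and, assuming all 76 sampled contacts respond, $N = 76$, the statistic would have $3$ and $72$ degrees of freedom.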

## Example of Data

The Figure below shows a summary chart for all four groups of respondents according to how often they had traveled in the past year. In general, it can be seen that respondents who traveled more than five times drink more water on average than respondents who traveled less. It can also be seen that those who did not travel out of town at all in the past year have the lowest average water consumption. These visible differences alone are not enough to conclude that they are statistically significant. Performing a one-way ANOVA will help determine whether the model is significant overall, and post hoc tests can then be used to determine which groups' averages differ significantly.
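A minimal sketch of how the one-way ANOVA F-statistic could be computed by hand is given below. The liters-per-day values are invented purely for illustration and are not the study's data, and the function name `one_way_anova_f` is likewise hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical liters-per-day values, NOT the study's data.
cohorts = {
    "0 times": [1.2, 1.5, 1.4, 1.3],
    "1 time": [1.6, 1.4, 1.7, 1.5],
    "2-4 times": [1.8, 1.9, 1.6, 1.7],
    "more than 5 times": [2.1, 2.0, 2.3, 1.9],
}
f_stat, df_b, df_w = one_way_anova_f(list(cohorts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Converting the F-statistic into a *p*-value requires the F distribution; if SciPy is available, `scipy.stats.f_oneway` returns both values directly.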

## Conclusion and Error Estimates

In this study, the risk of error is controlled through the significance level, namely the alpha level. For a parametric test, a lower significance level could be chosen (for instance, 0.01), but this would increase the probability of a Type II error (failing to reject a false null hypothesis). In contrast, if a higher significance level (0.05) is chosen, the probability of a Type II error decreases, but the probability of a Type I error (rejecting a true null hypothesis) increases. An alpha level of 0.05 is the most common convention, so it is adopted here as a reasonable balance between the two types of error.
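The two error probabilities discussed above can be stated compactly. The symbol $\beta$ and the term "power" are standard notation added here for clarity and are not used elsewhere in this paper.

$$
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}), \qquad
\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}), \qquad
\text{power} = 1 - \beta
$$

For a fixed sample size and effect size, lowering $\alpha$ raises $\beta$, which is the trade-off between choosing 0.01 and 0.05.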

## References

McCombes, S. (2019). *Sampling methods | types, techniques & examples*. Scribbr.

Zach. (2021). *What does a high F value mean in ANOVA?* Statology.