wqpsurveys.blogg.se - Is data mining and data dredging the same

Testing lots of hypotheses is not a problem in a preliminary study, and forming new hypotheses from those tests is good. However, if the newly formed hypotheses are chosen because the preliminary data suggest that there is something there, then you really cannot 'test' the new hypothesis using those same data, just as you suspect. In such circumstances, it is difficult to "adjust" the second test to take account of the first test, and doing so would require some heavy theoretical development. I wouldn't rule out the possibility that some clever statistician could come up with a testing sequence that properly adjusts for this, but it would be a difficult theoretical exercise that would probably constitute a publishable paper in its own right. Unless you can find an existing sequential test that has been developed for this purpose (and I'm not aware of one), you would either need to undertake that theoretical work yourself, or else write off this comparison and test it with new data instead.
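To make the point about reusing the same data concrete, here is a minimal simulation sketch. It does not reproduce the study in the question: the setup (20 pure-noise candidate predictors, 50 observations, a correlation test with a Fisher z approximation) and all names such as `simulate` are assumptions chosen purely for illustration. Every null hypothesis is true by construction, so any rejection is a false positive.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_sided_p(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z-transform)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=1000, n=50, k=20, alpha=0.05, seed=0):
    """Return (false-positive rate on the same data, rate on fresh data)."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    same_hits = 0
    fresh_hits = 0
    for _ in range(n_sims):
        y = draw()
        candidates = [draw() for _ in range(k)]
        # Exploratory step: keep the predictor that looks most promising.
        best_r = max((pearson_r(x, y) for x in candidates), key=abs)
        # "Test" the chosen hypothesis on the data that suggested it.
        if two_sided_p(best_r, n) < alpha:
            same_hits += 1
        # Test it on new data instead. Under the global null, a new sample
        # for the chosen predictor and outcome is just independent noise.
        if two_sided_p(pearson_r(draw(), draw()), n) < alpha:
            fresh_hits += 1
    return same_hits / n_sims, fresh_hits / n_sims


if __name__ == "__main__":
    same_rate, fresh_rate = simulate()
    print(f"same data:  {same_rate:.3f}")
    print(f"fresh data: {fresh_rate:.3f}")
```

With 20 independent candidates, the chance that the best one clears the 5% threshold on the same data is about 1 - 0.95^20 ≈ 0.64, whereas the test on fresh data should reject at roughly the nominal 5%. The real problem is harder than this toy setup, because the second hypothesis there involves variables related to those in the first test.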

There is good reason to believe that the presence or absence of antibodies to the bacteria would affect the association between the bacteria variable and the sickness outcome in the first test. (Indeed, there is a plausible causal relationship between these variables, which could be quite strong.) Consequently, conditional on the result of the first test, the null distribution of the second test would not be the same as if the first test had not been performed to get there. In practice, the conditional null distribution for the second test would be complicated, because it is conditional on the outcome of an optimisation in the first test involving multiple variables that are related to the variables in the second test.

As a general rule, when we "adjust for multiple comparisons" we are essentially adjusting the null distribution of a statistical test to condition on all the testing that comes before or concurrently with that test. That is certainly going to be required in this case, and it will not be easy. The second test you mention sounds very suspicious in this context, and my view is that it would not be appropriate to carry it out without a further adjustment for multiple comparisons (which would be extremely complicated and possibly prohibitive).
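For contrast, the easy case is a fixed family of tests that are all specified in advance and run concurrently. A standard adjustment such as Holm's step-down procedure handles that, and a minimal sketch is below. The function name `holm_adjust` and the example p-values are hypothetical. Note that this covers only the concurrent first-stage tests; it is not the further adjustment needed for a second test chosen on the basis of the first.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from three pre-specified tests.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

Each adjusted value is compared with the usual significance level, so a raw p-value of 0.04 that looks significant on its own may no longer be once the other two tests are taken into account.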