Question
Asked 28th Jun, 2014

When is it justifiable to exclude 'outlier' data points from statistical analyses?

Data analysts inspecting tables or figures might decide to exclude unusual data points, sometimes called 'outliers', from statistical analyses. Statistical patterns and conclusions might differ between analyses that include outliers and analyses that exclude them.
The exact underlying mechanisms that create outlier data points are often unknown, and people can always find arguments either to exclude data or to keep them in the analyses. How important is familiarity with the model species or model system in justifying the selection of data points, or in defining statistical rules in general?
Marcel M. Lambrechts
Centre d'Ecologie Fonctionnelle et Evolutive
Dear Linus,
I fully agree. We use similar judgement procedures before data are entered in our long-term database of small passerine birds. Values that are extreme relative to >30 years of observation are not included in the database. For instance, a great tit has a wing length of 70-78 mm (no values below 70 mm). If '66 mm' is written in the notebooks, this must have been a writing error and is therefore not considered. On the other hand, there may indeed be borderline cases, and there may be exchange between populations differing in phenotype, such as wing length...
Thus, how many 'outlier data points' found in field notebooks never end up in electronic databases, and how do data managers with differing background knowledge of the model species handle these special cases when the files are constructed?
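The range check described above could be sketched roughly as follows. This is only an illustration, not the procedure actually used for the database: the 70-78 mm limits come from the great tit example in this post, while the function name `screen_wing_lengths` and the sample notebook values are made up for the example. Out-of-range values are set aside rather than silently dropped, so that the number of excluded entries stays known and borderline cases can be reviewed later.

```python
# Plausible great tit wing-length range in mm, taken from the post above.
WING_MIN_MM = 70.0
WING_MAX_MM = 78.0


def screen_wing_lengths(values, low=WING_MIN_MM, high=WING_MAX_MM):
    """Split measurements into accepted and flagged lists.

    Hypothetical helper: values inside [low, high] are accepted,
    everything else is flagged for manual review instead of deleted.
    """
    accepted = []
    flagged = []
    for value in values:
        if low <= value <= high:
            accepted.append(value)
        else:
            flagged.append(value)
    return accepted, flagged


# Invented notebook entries, for illustration only.
notebook_entries = [72.0, 66.0, 75.5, 78.0, 81.0]
accepted, flagged = screen_wing_lengths(notebook_entries)

print("entered in database:", accepted)
print("set aside for review:", flagged)
print("number excluded:", len(flagged))
```

Keeping the flagged list alongside the database is one way to answer the question of how many notebook values never reach the electronic files, and it leaves room to reconsider a value if, for example, exchange between populations turns out to explain it.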