An outlier is a data point whose value is significantly infrequent when compared to the frequency of other values. In uni-modal samples, "infrequent" may be defined as a given number of standard deviations (sigma) from the variable's mean. In multi-modal systems, an outlier may occur between modes or outside the highest and lowest modes. In multivariate situations, it may be some defined distance from all cluster centers. An outlier may be valid or erroneous data. Particular care should be taken before excluding any outliers unless they can be proven to be erroneous using the underlying knowledge about the data, its collection, and its application. The decision to include or omit is a difficult one and must weigh the probability that the data point is erroneous against the impact and risk of negative consequences of omitting each reading individually.

An extreme point that stands out from the rest of the distribution.

gathered data which are abnormal and are therefore excluded from the trend deriving from the interpolation function of the other gathered data.

For a set of numerical data, any value that is markedly smaller or larger than other values. For example, in the data set {3, 5, 4, 4, 6, 2, 25, 5, 6, 2} the value of 25 is an outlier.

One who does not fall within the norm; typically used in utilization. A provider who uses either too many services or too few services (for example, anyone whose utilization differs by 2 standard deviations from the mean on a bell curve) is termed an outlier.

An observation that is many standard deviations from the mean. It is sometimes tempting to discard outliers, but this is imprudent unless the cause of the outlier can be identified, and the outlier is determined to be spurious. Otherwise, discarding outliers can cause one to underestimate the true variability of the measurement process.
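
The effect on variability can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the sample values and variable names are assumptions chosen for illustration, not data from any particular measurement process.

```python
import statistics

# Illustrative sample (assumed for this sketch): nine ordinary readings plus 25.
readings = [3, 5, 4, 4, 6, 2, 25, 5, 6, 2]
trimmed = [x for x in readings if x != 25]

# Sample standard deviation with and without the extreme reading.
sd_all = statistics.stdev(readings)
sd_trimmed = statistics.stdev(trimmed)
print(f"with 25: {sd_all:.2f}, without 25: {sd_trimmed:.2f}")
```

If 25 is a genuine reading, the smaller figure understates how much the process really varies.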

One or more observations in a data set that are distant in value from the main body of the set. Outliers may come from a separate population or may result from sampling or recording errors.

A data point that differs significantly from other data for a similar phenomenon. For example, if the average sales for a product were ten units per month, and one month the product had sales of 500 units, this sales point might be considered an outlier.

An observed value so far removed from the normal distribution that it may be considered an abnormality or one-time event, and is often not included in future calculations based on that set of data.

A probable error in the data fields for acres treated and the pounds of pesticide used. To improve data quality, DPR developed a statistical method to detect probable errors in the database. Called the outlier program, this method calculates pesticide use rates (pounds of active ingredient applied divided by acres treated) that are then examined using a variety of statistical methods. The records with highly unlikely use rates (outliers) are identified, thereby serving to flag suspect pesticide use records. Errors occur, for example, when those reporting pesticide use shift decimal points during data entry. We used three different criteria to identify outliers by comparing each use rate with an estimate of the maximum rate for that type of use.
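
The rate check described here can be sketched roughly as follows. This is not DPR's actual outlier program: the maximum rates, field names, and the single "rate above estimated maximum" criterion are all hypothetical, chosen only to show the idea of dividing pounds by acres and comparing against a limit for that type of use.

```python
# Hypothetical maximum use rates (pounds of active ingredient per acre) by use type.
MAX_RATE = {"orchard": 5.0, "row_crop": 2.0}

def flag_suspect_records(records, max_rate=MAX_RATE):
    """Return records whose use rate exceeds the assumed maximum for their use type."""
    suspects = []
    for rec in records:
        if rec["acres"] <= 0:
            suspects.append(rec)  # no rate can be computed; treat as suspect
            continue
        rate = rec["pounds"] / rec["acres"]
        if rate > max_rate[rec["use_type"]]:
            suspects.append(rec)
    return suspects

# Made-up records; record 2 imitates a decimal point shifted during data entry.
records = [
    {"id": 1, "use_type": "row_crop", "pounds": 15.0, "acres": 10.0},
    {"id": 2, "use_type": "row_crop", "pounds": 150.0, "acres": 10.0},
    {"id": 3, "use_type": "orchard", "pounds": 40.0, "acres": 10.0},
]
suspects = flag_suspect_records(records)
print([rec["id"] for rec in suspects])
```

A flagged record is only a candidate for review; whether it is actually an error still has to be confirmed against the original report.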

a data point that comes from a distribution different (in location, scale, or distributional form) from the bulk of the data

an observation that has a major effect on an estimate and which, because of its independently known atypical nature, needs special treatment

an unusual selection, one that differs from the other selections in one or more features

An individual having a value for a variable that differs substantially from those of most individuals. Values lying outside the fences are commonly considered to be outliers.

In phylogeny, an isolate which is only distantly related, or which lies outside the main cluster of isolates.

A datum which appears to deviate markedly from that for other members of the sample in which it occurs.

A data point that is outside a statistical range. Frequently refers to a treatment that is unusually long or expensive for its type.

A statistical term referring to isolated data values that fall well outside the range of values measured for nearly all other data points in a given set of observations. Various conventions exist for deciding when a given data value should be considered an outlier (e.g. if it falls 4 or more standard deviations above or below the mean for that set of observations). Outliers are sometimes indicative of data-entry errors or other types of artifacts, and they need to be rechecked carefully to determine whether they are valid entries before attempting to perform statistical analyses with the data summaries in which they occur.

A data value that does not come (or is thought not to come) from the typical population of data; in other words, a data value that falls outside the boundaries that enclose most other data values in the data.

A data value which is unusual with respect to the group of data in which it is found. It may be a single isolated value far away from all the others, or a value which does not follow the general pattern of the rest. Most classical statistical techniques tend to be quite sensitive to outliers, so that it is important to be on the alert for them. Graphical techniques, particularly residual plots, are very helpful in detecting the presence of outliers. Some of the newer Exploratory Data Analysis techniques and nonparametric procedures are much less sensitive to outliers (such procedures are said to be robust).
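
As one illustration of a robust procedure, the sketch below scores each value by its distance from the median in units of the median absolute deviation (MAD). This "modified z-score" is a technique brought in here for illustration and is not named in the definition above; the 0.6745 scaling constant and the 3.5 cut-off are a commonly used convention, and the function name and sample data are assumptions.

```python
import statistics

def modified_z_scores(values):
    """Median/MAD-based scores; 0.6745 rescales the MAD for roughly normal data."""
    med = statistics.median(values)
    mad = statistics.median([abs(x - med) for x in values])
    if mad == 0:
        raise ValueError("MAD is zero; the score is undefined for this sample")
    return [0.6745 * (x - med) / mad for x in values]

# Illustrative sample (assumed for this sketch).
data = [3, 5, 4, 4, 6, 2, 25, 5, 6, 2]
scores = modified_z_scores(data)

# Flag values whose absolute score exceeds the conventional 3.5 cut-off.
flagged = [x for x, z in zip(data, scores) if abs(z) > 3.5]
print(flagged)
```

Because the median and the MAD barely move when one value is extreme, the extreme value cannot hide itself by inflating the yardstick it is measured against.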

A data value that does not come from the typical population of data; in other words, extreme values. In a normal distribution, outliers are typically at least 3 standard deviations from the mean.
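
A standard-deviation rule of this kind can be sketched in a few lines. The function name, the default threshold `k`, and the sample values are assumptions for illustration; the sample standard deviation is used, which is one of several reasonable choices.

```python
import statistics

def sigma_outliers(values, k=3.0):
    """Return values lying more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * sd]

# Illustrative sample (assumed): nineteen readings near 10 and one reading of 50.
sample = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
          10, 11, 9, 10, 12, 8, 10, 11, 9, 50]
print(sigma_outliers(sample))
```

One caveat: an extreme value inflates the standard deviation it is judged by. In the small set {3, 5, 4, 4, 6, 2, 25, 5, 6, 2}, the value 25 lies only about 2.8 sample standard deviations from the mean, so `sigma_outliers` with `k=3.0` would not flag it, although `k=2.0` would.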

One or more extreme, or atypical, data values in a sample. They should be considered carefully before exclusion from analysis. For example, data values may be recorded erroneously, and hence they may be corrected. However, in other cases they may just be surprisingly different, but not necessarily 'wrong'.

An extremely unrepresentative data point. A box-and-whisker plot identifies any data value more than 1.5*IQR below the first quartile or more than 1.5*IQR above the third quartile as an outlier, where IQR is the interquartile range.
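
The box-plot convention places its cut-offs (often called fences) 1.5*IQR below the first quartile and 1.5*IQR above the third quartile. The sketch below applies that rule with Python's standard `statistics.quantiles`; the function name and sample data are assumptions, and the "inclusive" quartile method is only one of several conventions, so fence positions can differ slightly between tools.

```python
import statistics

def fence_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Illustrative sample (assumed for this sketch).
data = [3, 5, 4, 4, 6, 2, 25, 5, 6, 2]
print(fence_outliers(data))
```

For this sample the quartiles are 3.25 and 5.75, so the fences fall at -0.5 and 9.5 and only the value 25 lies outside them.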

An "extreme" value that does not fit a normal (or bell-shaped) distribution curve. In other words, a value that is far away from the middle of all other values. Outlier values will contribute more to the mean than to the median, and therefore can skew results if means are being statistically compared.

Collective term used to refer to either a contaminant or a discordant observation (Beckman & Cook, 1983). A discordant observation is any observation that appears surprising or discrepant to the investigator; a contaminant observation is any observation that is not realized from the target distribution (Beckman RJ, Cook RD. Outlier....s. Technometrics 1983; 25: 119-149).

An extreme value in a data sample that is far from the central cluster of values in the sample. An outlier is extraordinary either because it is "far away" from the other values or because it fails to conform to the pattern of the other values.

A data item whose value falls outside the bounds enclosing most of the other corresponding values in the sample. May indicate anomalous data. Should be examined carefully; may carry important information.

In statistics, an outlier is a single observation "far away" from the rest of the data. Statistics derived from data sets that include outliers will often be misleading. For example, if one is calculating the average temperature of 10 objects in a room, and most are between 20 and 25° Celsius, but an oven is at 350° C, the median of the data may be 23 but the average temperature will be 55.
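
The temperature example can be reproduced with made-up numbers. The nine room-temperature readings below are assumptions chosen to fall between 20 and 25° C; only the oven reading of 350° C comes from the example itself.

```python
import statistics

# Nine assumed room-temperature objects plus the oven at 350 degrees Celsius.
temps_c = [20, 21, 22, 22, 23, 23, 24, 24, 25, 350]

mean_temp = statistics.mean(temps_c)      # pulled upward by the oven
median_temp = statistics.median(temps_c)  # stays with the bulk of the readings
print(f"mean: {mean_temp:.1f}, median: {median_temp:.1f}")
```

With these readings the mean is 55.4 while the median is 23, so the single oven reading more than doubles the mean and leaves the median unchanged.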