Removing Outliers
You can remove outliers or misplaced data points from a data set in much the same manner as NaNs. For the vehicle traffic count data, the mean and standard deviation of each column of the data are
mu = mean(count)
sigma = std(count)
mu =
32.0000 46.5417 65.5833
sigma =
25.3703 41.4057 68.0281
The number of outliers in each column, that is, values more than three standard deviations from the column mean, is obtained with
[n,p] = size(count)
outliers = abs(count - mu(ones(n, 1),:)) > 3*sigma(ones(n, 1),:);
nout = sum(outliers)
nout =
1 0 0
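The indexing expression mu(ones(n, 1),:) repeats the row vector mu n times so that it has the same size as count and can be subtracted element by element. The following is a minimal sketch of that replication step on a small made-up matrix. The variable x, its values, and the 1.5 threshold are assumptions for illustration only and are not the traffic count data. The threshold is lowered because five points cannot lie more than about 1.79 sample standard deviations from their own mean.

```matlab
% Hypothetical 5-by-2 data set; not the traffic count data.
x = [1 10; 2 11; 3 10; 2 11; 100 10];
[n, p] = size(x);

mu = mean(x);      % 1-by-p row of column means
sigma = std(x);    % 1-by-p row of column standard deviations

% Indexing with ones(n,1) selects row 1 of mu n times, which
% produces an n-by-p matrix that lines up with x.
MU = mu(ones(n, 1), :);
SIGMA = sigma(ones(n, 1), :);

% The 1.5 threshold is chosen only so this tiny example flags a value.
mask = abs(x - MU) > 1.5*SIGMA;
nflag = sum(mask);  % number of flagged values in each column
```

In MATLAB R2016b and later, implicit expansion should allow abs(x - mu) > 1.5*sigma to give the same mask without building MU and SIGMA explicitly.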
There is one outlier in the first column. Remove this entire observation with
count(any(outliers'),:) = [];
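The transpose in any(outliers') is there because any works down each column by default, so transposing makes each observation a column. The sketch below repeats the same row deletion on a small made-up matrix and compares the transpose form with the dimension argument any(mask, 2). The names x, mask, rowsToDrop, and sameRows are hypothetical and the data is not the traffic count data.

```matlab
% Hypothetical data and mask; not the traffic count data.
x = [1 10; 2 11; 100 10; 3 12];
mask = logical([0 0; 0 0; 1 0; 0 0]);  % row 3 holds a flagged value

rowsToDrop = any(mask, 2);  % n-by-1 logical, true where a row has any flag
sameRows = any(mask')';     % transpose form from the text, as a column

x(rowsToDrop, :) = [];      % delete every flagged observation
```

One reason to consider the dimension argument: if the data has a single column, mask' is a row vector and any reduces it to one scalar, whereas any(mask, 2) still returns one value per row.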