MATLAB: Using interpolation to replace missing values (NaN)

Question 1

MATLAB: Using interpolation to replace missing values (NaN)

matlab interpolation nan missing-data

Dave · Sep 2, 2010 · Viewed 35.2k times · Source

Answer

Answer

Well, if you're working with time-series data then you can use Matlab's built in interpolation function.

Something like this should work for your situation, but you'll need to tailor it a little ... ie. if you don't have equal spaced sampling you'll need to modify the times line.

nseq = cell(size(seq))
for i = 1:numel(seq)
    times = 1:length(seq{i});
    mask =  ~isnan(seq{i});
    nseq{i} = seq{i};
    nseq{i}(~mask) = interp1(times(mask), seq{i}(mask), times(~mask));

end

You'll need to play around with the options of interp1 to figure out which ones work best for your situation.

Question 2

I have cell array each containing a sequence of values as a row vector. The sequences contain some missing values represented by NaN.

I would like to replace all NaNs using some sort of interpolation method, how can I can do this in MATLAB? I am also open to other suggestions on how to deal with these missing values.

Consider this sample data to illustrate the problem:

seq = {randn(1,10); randn(1,7); randn(1,8)};
for i=1:numel(seq)
    %# simulate some missing values
    ind = rand( size(seq{i}) ) < 0.2;
    seq{i}(ind) = nan;
end

The resulting sequences:

seq{1}
ans =
     -0.50782     -0.32058          NaN      -3.0292     -0.45701       1.2424          NaN      0.93373          NaN    -0.029006
seq{2}
ans =
      0.18245      -1.5651    -0.084539       1.6039     0.098348     0.041374     -0.73417
seq{3}
ans =
          NaN          NaN      0.42639     -0.37281     -0.23645       2.0237      -2.2584       2.2294

Edit:

Based on the responses, I think there's been a confusion: obviously I'm not working with random data, the code shown above is simply an example of how the data is structured.

The actual data is some form of processed signals. The problem is that during the analysis, my solution would fail if the sequences contain missing values, hence the need for filtering/interpolation (I already considered using the mean of each sequence to fill the blanks, but I am hoping for something more powerful)

MATLAB: Using interpolation to replace missing values (NaN)

Answer

Related questions