Matlab: How can I read in a string separated with spaces but ignore single spaces (using textscan)?

Birk Birk · Jun 30, 2011 · Viewed 11.9k times

Hi all and thanks in advance. This is my first post here, please let me know if I should do this differently.

I have a large textfile containing lines like the following:

"DATE      TIMESTAMP    T W M     T AL M C  A_B_C"

At first I read this in using the fopen and fgetl commands, so that I get a string:

Readout = DATE      TIMESTAMP    T W M     T AL M C  A_B_C

I want to split this into its fields, e.g. with textscan. While I feel I know MATLAB reasonably well, I am by no means an expert with this command and have trouble using it.

I want to get:

A = 'DATE' 'TIMESTAMP' 'T W M' 'T AL M C' 'A_B_C'

However using the following code:

 A = textscan(Readout,'%s');
 A = A{1}';

I get:

A = 'DATE'    'TIMESTAMP'    'T'    'W'    'M'    'T'    'AL'    'M'    'C'    'A_B_C'

As I asked in the title, is there a way to ignore the single spaces?

PS: While writing this I came up with a not very elegant solution. I would still like to know if there is a nicer one, however:

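ReadBetter = [];
for n = 1:length(Read)-1
    % Drop a space that is directly followed by a non-space character,
    % i.e. every single space and the last space of each longer run.
    if ~(Read(n) == ' ' && Read(n+1) ~= ' ')
        ReadBetter = [ReadBetter Read(n)];
    end
end
% The loop stops one character short, so append the last character.
ReadBetter = [ReadBetter Read(end)];
Read
ReadBetter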

Output:
Read =

DATE      TIMESTAMP    T W M     T AL M C   A_B_C

ReadBetter =

DATE     TIMESTAMP   TWM    TALMC   A_B_C

Now I can use ReadBetter with textscan.
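
For what it's worth, the loop can probably be replaced by a single regexprep call. This is only a minimal sketch, not the code from the question: it assumes fields are always separated by at least two spaces, and it removes only isolated single spaces (a space with no space on either side), so longer runs keep their full length, which differs slightly from the loop. The lookaround pattern is my own choice for illustration.

```matlab
% Example line; fields are separated by two or more spaces.
Read = 'DATE      TIMESTAMP    T W M     T AL M C   A_B_C';

% Remove every space that has no space directly before or after it.
% (?<! ) is a negative lookbehind, (?! ) a negative lookahead.
ReadBetter = regexprep(Read, '(?<! ) (?! )', '');

% textscan now splits on the remaining runs of whitespace.
A = textscan(ReadBetter, '%s');
A = A{1}';   % row cell array; the inner spaces of each field are gone
```

As with the loop, the fields come back as 'TWM' and 'TALMC' rather than 'T W M' and 'T AL M C', so this only helps if losing the inner spaces is acceptable.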

Thanks for this awesome webpage and the help I have found here in many other posts.

Answer

Rich C · Jul 1, 2011

Newer versions of MATLAB have a 'split' option for regexp, similar to Perl's split. Splitting on runs of two or more spaces leaves the single spaces inside each field untouched:

>> str = 'DATE      TIMESTAMP    T W M     T AL M C  A_B_C';
>> out = regexp(str, '  +', 'split')

out = 

    'DATE'    'TIMESTAMP'    'T W M'    'T AL M C'    'A_B_C'
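
One caveat with the split approach: if a line starts or ends with blanks, the result can contain empty strings at the ends. Below is a small sketch of a guard, assuming leading and trailing blanks carry no meaning in your file. The example line, the strtrim call and the ' {2,}' spelling of the pattern are my additions for illustration, not part of the answer above.

```matlab
% Hypothetical line with leading and trailing blanks.
str = '   DATE      TIMESTAMP    T W M     T AL M C  A_B_C  ';

% Trim the ends first, then split on runs of two or more spaces.
% Single spaces inside a field never match, so they are kept.
out = regexp(strtrim(str), ' {2,}', 'split');
```

Here out should be a row cell array with the five fields, including 'T W M' and 'T AL M C' with their inner spaces intact.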