How to replace tabs with spaces in Atom?

J0ANMM · Jan 25, 2017

When I started working with the Atom text editor, I used tab indentation, but now I want to switch to 4-space indentation.

I have several files that should be updated accordingly.

What would be the easiest way to do it?

Answer

Dan Lowe · Jan 25, 2017

Atom has a built-in tool for this.

Activate the command palette (Shift+Cmd+P on Mac, Ctrl+Shift+P on Windows/Linux) and search for "convert space" or "convert tab". You should find that these three commands are available:

  • Whitespace: Convert Spaces to Tabs
  • Whitespace: Convert Tabs to Spaces
  • Whitespace: Convert All Tabs to Spaces

Convert Tabs vs. Convert All Tabs

In the comments you observed that using "Convert Tabs to Spaces" would break indentation in Python, but "Convert All Tabs to Spaces" worked correctly. You asked what the difference between the two is.

I didn't know the answer, so I went looking. This is defined in the "whitespace" package, the source for which can be found on GitHub at atom/whitespace.

Looking in lib/whitespace.js, I found this:

'whitespace:convert-tabs-to-spaces': () => {
  let editor = atom.workspace.getActiveTextEditor()

  if (editor) {
    this.convertTabsToSpaces(editor)
  }
},

'whitespace:convert-spaces-to-tabs': () => {
  let editor = atom.workspace.getActiveTextEditor()

  if (editor) {
    return this.convertSpacesToTabs(editor)
  }
},

'whitespace:convert-all-tabs-to-spaces': () => {
  let editor = atom.workspace.getActiveTextEditor()

  if (editor) {
    return this.convertTabsToSpaces(editor, true)
  }
}

As you can see, the relevant function here is convertTabsToSpaces. In the "convert all" variant, the only difference is that a second (optional) argument is passed, and set to true.

return this.convertTabsToSpaces(editor, true)

Looking at the definition of convertTabsToSpaces, the difference is that the regex is changed based on the state of this boolean argument.

convertTabsToSpaces (editor, convertAllTabs) {
  let buffer = editor.getBuffer()
  let spacesText = new Array(editor.getTabLength() + 1).join(' ')
  let regex = (convertAllTabs ? /\t/g : /^\t+/g)

  buffer.transact(function () {
    return buffer.scan(regex, function ({replace}) {
      return replace(spacesText)
    })
  })

  return editor.setSoftTabs(true)
}

The relevant line here is:

let regex = (convertAllTabs ? /\t/g : /^\t+/g)

So in the "convert all" variant, the regex is not anchored to the beginning of the line (^ is not used), and each tab is its own replacement (rather than a group of tab characters being treated as a single replacement -- \t vs. \t+).
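To see the difference outside of Atom, here is a minimal standalone sketch. It is not the package's code: convertTabs and sample are names made up for illustration, the tab length is hard-coded to 4 instead of coming from editor.getTabLength(), and the m flag is added on the assumption that it reproduces the per-line anchoring described above when working on a plain string.

```javascript
// Standalone sketch, not Atom's implementation.
// Assumption: a tab length of 4 stands in for editor.getTabLength().
const spacesText = ' '.repeat(4)

// Hypothetical helper mirroring the choice of regex in convertTabsToSpaces.
// Assumption: the `m` flag makes ^ match at the start of every line of a
// plain string, which is how the anchored regex is described as behaving.
function convertTabs (text, convertAllTabs) {
  const regex = convertAllTabs ? /\t/g : /^\t+/gm
  return text.replace(regex, spacesText)
}

// One line with no indent, one with a single tab, and one with two leading
// tabs plus an embedded tab.
const sample = 'a\n\tb\n\t\tc\td'

console.log(JSON.stringify(convertTabs(sample, false)))
// "a\n    b\n    c\td"      (the run of two tabs became one indent)

console.log(JSON.stringify(convertTabs(sample, true)))
// "a\n    b\n        c    d" (every tab became four spaces)
```

With the anchored regex, the two leading tabs on the third line are matched as a single run and replaced once, and the embedded tab is left alone. With the unanchored regex, each tab is matched and replaced individually.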

Why it broke Python indentation

I don't know what file you used, but I used a pretty simple test file like this, indented completely with tab characters.

import foo

class Foo():
    def __init__(self):
        self.foo = True

    def bar(self, a, b):
        return a + b

After using "Convert Tabs to Spaces" it looked like this:

import foo

class Foo():
    def __init__(self):
    self.foo = True

    def bar(self, a, b):
    return a + b

Whoa! That's now a SyntaxError (Python reports it as an IndentationError, which is a subclass of SyntaxError). Trying again with "Convert All Tabs to Spaces":

import foo

class Foo():
    def __init__(self):
        self.foo = True

    def bar(self, a, b):
        return a + b

This happens because in the first case, groups of tabs on the left margin are, as a collection, reduced to a space-based indent. Since the regex is ^\t+, it doesn't matter if the line is indented with 1, 2, 8, 24 tabs... they are all replaced with a single indent level, but made of spaces.

Honestly I don't know what the point of that is... that seems like a bug.

In the second case, every tab is converted to the equivalent space-based width (i.e. each tab becomes as many spaces as the editor's tab length, which is 4 here, even if it is adjacent to another tab character).

So that's the one you probably want.

One caveat: it is no longer limited to the left margin (there is no ^ anchor), so if you have embedded tab characters elsewhere, those will also be converted. That is not a normal practice in code in my experience, but FYI, in case it matters to you.
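If embedded tabs do matter to you, the two behaviours can be combined: keep the ^ anchor, but emit one indent per tab in the leading run. The following is only a sketch of that idea for a plain string. convertLeadingTabs is a hypothetical helper, not something the whitespace package provides, and the default tab length of 4 is an assumption.

```javascript
// Hypothetical helper, not part of Atom's whitespace package.
// Converts only the tabs at the start of each line, preserving indent depth.
function convertLeadingTabs (text, tabLength = 4) {
  const indent = ' '.repeat(tabLength)
  // ^\t+ with the `m` flag matches the run of tabs at the start of each line;
  // the callback returns one indent for every tab in that run.
  return text.replace(/^\t+/gm, (tabs) => indent.repeat(tabs.length))
}

// Two leading tabs become eight spaces; the embedded tab is left untouched.
console.log(JSON.stringify(convertLeadingTabs('\t\tx = 1\t# note')))
// "        x = 1\t# note"
```

Inside Atom you would still need to run something like this over the buffer yourself, so for most files "Convert All Tabs to Spaces" remains the simpler choice.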