What is the purpose of `text=auto` in a `.gitattributes` file?

Fizer Khan · Jan 31, 2014 · Viewed 41.8k times

Most .gitattributes files contain * text=auto. What is the purpose of text=auto in that file?

Answer

Flimm · Jun 24, 2016

From the docs:

Each line in .gitattributes (or .git/info/attributes) file is of form:

pattern attr1 attr2 ...

So here, the pattern is *, which means all files, and the attribute is text=auto.

What does text=auto do? From the documentation:

When text is set to "auto", the path is marked for automatic end-of-line normalization. If Git decides that the content is text, its line endings are normalized to LF on checkin.

What's the default behaviour if it's not enabled?

Unspecified

If the text attribute is unspecified, Git uses the core.autocrlf configuration variable to determine if the file should be converted.

What does core.autocrlf do? From the docs:

   core.autocrlf

Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.

If you think this is all as clear as mud, you're not alone.

Here's what * text=auto does, in my own words: when someone commits a file, Git guesses whether it is a text file. If it is, Git commits a version of the file in which every CR + LF pair is replaced with a single LF. It doesn't directly affect what files look like in the working tree; other settings control whether LF is converted back to CR + LF when a file is checked out.
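To make that concrete, here is a rough Python model of the check-in step. It is only a sketch: the function names, the NUL-byte test, and the 8000-byte probe size are assumptions made for illustration, and Git's real text detection is more involved than this.

```python
def looks_binary(data: bytes, probe: int = 8000) -> bool:
    """Crude guess (an assumption of this sketch, not Git's exact rule):
    any NUL byte within the first `probe` bytes means 'binary'."""
    return b"\x00" in data[:probe]


def normalize_on_checkin(data: bytes) -> bytes:
    """Return the bytes that would be stored for a text=auto path
    under this simplified model."""
    if looks_binary(data):
        return data  # judged binary: stored untouched
    # Judged text: every CR + LF pair becomes a single LF
    return data.replace(b"\r\n", b"\n")


# A text file with Windows line endings is stored with LF only.
assert normalize_on_checkin(b"one\r\ntwo\r\n") == b"one\ntwo\n"

# Data containing a NUL byte is left alone.
assert normalize_on_checkin(b"\x00\x01\r\n") == b"\x00\x01\r\n"

# Binary-ish data with no NUL byte is misjudged as text and altered.
assert normalize_on_checkin(b"\x89IMG\r\n\x1a\n") == b"\x89IMG\n\x1a\n"
```

The last case shows how a wrong guess can damage a file: a byte is silently dropped from data that was never text in the first place.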

Recommendation:

I would not recommend putting * text=auto in the .gitattributes file. Instead, I would recommend something like this:

*.txt text
*.html text
*.css text
*.js text

This explicitly designates which files are text files, so that they get CRLF converted to LF in the object database (but not necessarily in the working tree). We had a repo with * text=auto, and Git wrongly guessed that an image file was a text file. It then corrupted the image by replacing CR + LF bytes with LF bytes in the object database. That was not a fun one to debug.

If you must use * text=auto, put it on the first line of .gitattributes so that later lines can override it. This seems to be an increasingly popular practice.
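As an illustration of that ordering, a .gitattributes file might look something like the sketch below. The image patterns are assumptions chosen for the example; binary is Git's built-in macro attribute that, among other things, unsets text for the matching paths.

```gitattributes
# Default for every path: let Git guess, and normalize text files to LF on check-in
* text=auto

# Later lines override the default for the paths they match
*.txt   text
*.html  text
*.png   binary
*.jpg   binary
```

With this layout, the explicit lines take precedence over the guess for the file types you care about, and text=auto only applies to whatever is left.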