Difference between text file and binary file

andrew · May 18, 2011 · Viewed 25.1k times

Why should we distinguish between text files and binary files when transmitting them? Why are some channels designed only for textual data? At the bottom level, they are all bits.

Answer

Dietrich Epp · May 18, 2011

At the bottom level, they are all bits... true. However, some transmission channels carry only seven bits per byte, while others carry eight. If you transmit ASCII text over a seven-bit channel, all is fine, because ASCII needs only seven bits per character. Binary data, which uses all eight bits of each byte, gets mangled.
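To make the seven-bit problem concrete, here is a minimal sketch in Python. The function `send_over_7bit_channel` is hypothetical and simply assumes the channel drops the top bit of every byte; real channels may misbehave in other ways.

```python
def send_over_7bit_channel(data: bytes) -> bytes:
    """Hypothetical channel: assume it clears the high bit of every byte."""
    return bytes(b & 0x7F for b in data)


# ASCII text only uses values 0-127, so it survives unchanged.
ascii_text = b"Hello, world!\n"
text_received = send_over_7bit_channel(ascii_text)

# Arbitrary binary data uses all eight bits, so some bytes change.
binary_data = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])
binary_received = send_over_7bit_channel(binary_data)

print(text_received == ascii_text)    # True
print(binary_received.hex())          # 09504e477f00
```

Only the bytes with the high bit set (0x89 and 0xFF here) are altered, which is why the damage to binary data can look sporadic.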

Additionally, different systems use different conventions for line endings: LF and CRLF are common, but some systems use CR or NEL. A text transmission mode will convert line endings automatically, which will damage binary files.
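As a rough illustration of why line-ending conversion damages binary files, here is a small Python sketch. `text_mode_transfer` is a made-up helper that assumes the transfer rewrites every line break as CRLF; actual tools differ in the details.

```python
def text_mode_transfer(data: bytes) -> bytes:
    """Hypothetical text-mode transfer: assume every line break becomes CRLF."""
    # Collapse existing CRLF to LF first so it is not doubled into CR CR LF.
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


# For text this is harmless: same lines, different line-ending convention.
text = b"line one\nline two\n"
print(text_mode_transfer(text))       # b'line one\r\nline two\r\n'

# In binary data, 0x0A is just a value, not a line break.
binary = bytes([0x00, 0x0A, 0xFF, 0x0D, 0x0A, 0x10])
converted = text_mode_transfer(binary)
print(len(binary), len(converted))    # 6 7
```

The binary payload comes out one byte longer, with an extra 0x0D inserted, so any offsets or lengths stored inside the file no longer match its contents.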

However, this is all mostly of historical interest these days. Most transmission channels are eight-bit clean (HTTP, for example), and most users are fine with whatever line ending they get.

Some examples of 7-bit channels: SMTP (nominally, without extensions), SMS, Telnet, some serial connections. The internet wasn't always built on TCP/IP, and it shows.

Additionally, the HTTP spec states:

When in canonical form, media subtypes of the "text" type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body.
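One way to read the "done consistently" condition in the quote above is that a text body should stick to a single line-break style. The Python sketch below checks for that; the helper names are invented for illustration and this is not how any particular HTTP library does it.

```python
import re

# Map each raw line-break sequence to a readable name.
STYLE_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def line_break_styles(body: bytes) -> set:
    """Return the set of line-break styles that appear in body."""
    # CRLF is listed first so it is matched as one unit, not as CR then LF.
    return {STYLE_NAMES[m.group()] for m in re.finditer(rb"\r\n|\r|\n", body)}


def is_consistent(body: bytes) -> bool:
    """True if the body uses at most one line-break style throughout."""
    return len(line_break_styles(body)) <= 1


print(is_consistent(b"a\r\nb\r\n"))   # True
print(is_consistent(b"a\nb\n"))       # True
print(is_consistent(b"a\r\nb\n"))     # False
```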