Be More Productive with Your Favorite Text Editor

Line Breaks in Windows, UNIX & Macintosh Text Files

A problem that often bites people working with different platforms, such as a PC running Windows and a web server running Linux, is the different character codes used to terminate lines in text files.

Windows, and DOS before it, uses a pair of CR (carriage return) and LF (line feed) characters to terminate lines. UNIX (including Linux and FreeBSD) uses an LF character only. OS X also uses a single LF character, but the classic Mac operating system used a single CR character for line breaks. In other words: a complete mess.
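To make the three styles concrete, here is a minimal Python sketch that writes the same two lines with each terminator. The names LINE_BREAKS and encode_lines are made up for this illustration.

```python
# Line terminators as raw bytes.
CR = b"\r"  # carriage return, byte 0x0D
LF = b"\n"  # line feed, byte 0x0A

# Hypothetical lookup table for this example only.
LINE_BREAKS = {
    "Windows": CR + LF,  # also DOS
    "UNIX": LF,          # Linux, FreeBSD, OS X
    "Mac": CR,           # classic Mac OS
}

def encode_lines(lines, style):
    """Encode text lines, terminating each with the given style's line break."""
    terminator = LINE_BREAKS[style]
    return b"".join(line.encode("ascii") + terminator for line in lines)

for style in LINE_BREAKS:
    print(style, encode_lines(["one", "two"], style))
```

The text is identical in all three results. Only the bytes between the lines differ.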

Problems arise when transferring text files between different operating systems and using software that is not smart enough to detect the line break style used by a file. For example, if you open a UNIX file in older versions of Microsoft Notepad, it will display the text as if the file contained no line breaks at all. If you open a Windows file in a UNIX editor like “joe” or “vi”, you will see a control character (the CR) at the end of each line. Older versions of Perl on Linux would refuse to run any script that used Windows line breaks, aborting with an unhelpful error message.

Mix All Line Break Styles

EditPad Pro does not care which line break style a file uses. It will automatically detect the format and indicate it in the status bar. If you open a Mac file on your Windows PC, it will still be a Mac file when you save it. To change the line break format, select the Windows, UNIX or Mac option in the Convert menu.

EditPad Pro can even handle files that use inconsistent line breaks. This is indicated in the status bar as (Mixed) along with the dominant style. In such a situation, it’s best to make the line break style consistent. Very few applications can properly handle files with mixed line break styles. Simply select the line break style you want (Windows, UNIX, Mac) from EditPad Pro’s Convert menu.
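If you want to check a file outside of an editor, the idea of detecting the dominant style and spotting mixed line breaks can be sketched in a few lines of Python. This is only an illustration and not how EditPad Pro itself works. The function names detect_line_breaks and convert_line_breaks are invented for this example.

```python
import re

# Try CR LF first, so a Windows break is not counted as a Mac break
# followed by a UNIX break.
_BREAK = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "Windows", b"\n": "UNIX", b"\r": "Mac"}

def detect_line_breaks(data):
    """Return (dominant_style, is_mixed); dominant_style is None without breaks."""
    counts = {"Windows": 0, "UNIX": 0, "Mac": 0}
    for match in _BREAK.finditer(data):
        counts[_NAMES[match.group()]] += 1
    used = [name for name, n in counts.items() if n > 0]
    if not used:
        return None, False
    dominant = max(used, key=lambda name: counts[name])
    return dominant, len(used) > 1

def convert_line_breaks(data, terminator):
    """Rewrite every line break in data as the given terminator bytes."""
    return _BREAK.sub(lambda match: terminator, data)

sample = b"first\r\nsecond\r\nthird\n"
print(detect_line_breaks(sample))           # ('Windows', True)
print(convert_line_breaks(sample, b"\n"))   # b'first\nsecond\nthird\n'
```

Read the file in binary mode before passing it in. Python's text mode translates line breaks while reading and would hide the very thing being measured.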

Transferring Text Files Between Computers Using Different Operating Systems

A common way to transfer files between a computer and a server is FTP. All FTP software can transfer files in “ascii” or “binary” mode. In “ascii” mode, the FTP software will convert line breaks, while in “binary” mode it will not. In “ascii” mode, transferring a Windows file from a Windows PC to a Linux server results in a UNIX file on the server. Downloading the file again converts it back to Windows. This system works perfectly if you remember to turn on “ascii” mode for text files. Many FTP clients also have an “automatic” mode that switches between ascii and binary depending on the extension of the file you’re transferring.
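The difference between the two modes can be modeled in Python as follows. This is a simplified sketch, assuming that in “ascii” mode the sender turns its local line breaks into CR LF for the transfer and the receiver turns CR LF into its own local style. Real FTP software does this internally. The function names below are hypothetical.

```python
CRLF = b"\r\n"
LF = b"\n"

def to_wire(data, local_break):
    """Sender side: replace the local line break with CR LF for the transfer."""
    if local_break == CRLF:
        return data
    return data.replace(local_break, CRLF)

def from_wire(data, local_break):
    """Receiver side: replace CR LF with the receiver's local line break."""
    if local_break == CRLF:
        return data
    return data.replace(CRLF, local_break)

def ascii_transfer(data, sender_break, receiver_break):
    """Model of an "ascii" mode transfer: line breaks are converted."""
    return from_wire(to_wire(data, sender_break), receiver_break)

def binary_transfer(data):
    """Model of a "binary" mode transfer: bytes arrive unchanged."""
    return data

windows_file = b"line one\r\nline two\r\n"
on_server = ascii_transfer(windows_file, CRLF, LF)   # upload to Linux
back_home = ascii_transfer(on_server, LF, CRLF)      # download to Windows
print(on_server)   # b'line one\nline two\n'
print(back_home)   # b'line one\r\nline two\r\n'
```

The round trip returns the original file only because both transfers used the same mode.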

Things go wrong when mixing “ascii” and “binary” transfers. When a webmaster uploads a Windows file to a Linux server in “binary” mode, the file still has CR LF line breaks on the server. If you then download that file to your Windows PC with your web browser, which does the UNIX->Windows conversion, the browser will treat the file on the server as a UNIX file, even though it is in Windows format. It will convert each LF into CR LF, leaving the existing CR in place and resulting in a file that uses CR CR LF as line breaks.
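The effect is easy to reproduce. The Python sketch below applies a naive UNIX-to-Windows conversion to a file that already uses Windows line breaks. The function name unix_to_windows is made up for this example.

```python
def unix_to_windows(data):
    """Naive conversion that assumes every LF is a bare UNIX line break."""
    return data.replace(b"\n", b"\r\n")

# A Windows file that reached the server unchanged in "binary" mode.
on_server = b"line one\r\nline two\r\n"

# The download step wrongly treats it as a UNIX file.
downloaded = unix_to_windows(on_server)
print(downloaded)  # b'line one\r\r\nline two\r\r\n'
```

Each CR that was already in the file survives, and a second CR is added in front of every LF.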

If you try to open that file, you’ll get quite different results with various software. Microsoft Notepad will interpret the CR CR LF as a single line break. EditPad, however, supports mixed line break styles, so it will interpret it as a double line break: first a CR (Mac style), and then a CR LF (Windows style). The file will appear double spaced in EditPad Pro.
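To see why a reader that accepts all three styles shows blank lines, consider this small Python sketch. It only illustrates the principle and is not EditPad’s actual code. The function name split_lines_mixed is hypothetical.

```python
import re

def split_lines_mixed(data):
    """Split on CR LF, CR, or LF, accepting any mix of the three styles."""
    return re.split(rb"\r\n|\r|\n", data)

damaged = b"one\r\r\ntwo\r\r\n"
print(split_lines_mixed(damaged))  # [b'one', b'', b'two', b'', b'']
```

The first CR ends the line of text, and the CR LF that follows ends an empty line. That empty line is the extra blank line you see on screen.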

To remove the unwanted blank lines, simply select Double->Single spaced in EditPad Pro’s Convert menu.
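If you need to repair such files in bulk, the same fix can be sketched in Python. This is an assumption about what the repair amounts to for the CR CR LF case, not a description of how EditPad Pro implements its menu command. The function name repair_cr_cr_lf is invented here.

```python
def repair_cr_cr_lf(data):
    """Collapse each CR CR LF sequence back into a single CR LF line break."""
    return data.replace(b"\r\r\n", b"\r\n")

damaged = b"line one\r\r\nline two\r\r\n"
print(repair_cr_cr_lf(damaged))  # b'line one\r\nline two\r\n'
```

Files that contain no CR CR LF sequences pass through unchanged, so intentional blank lines written as CR LF CR LF are preserved.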

EditPad Pro’s built-in FTP always transfers files in binary mode. This way you will never have any surprises with the line breaks. Before uploading a file, use the Convert menu to make sure the file has the line break style that the server expects (if the server cares at all). EditPad Pro will then upload the file with that line break style. When downloading a file, you can be sure that EditPad Pro will show you the file with the line breaks it has on the server.