File Type Definition

On the Definition tab in the file type configuration, you can indicate how the file type should be identified and which files it applies to.

The description is shown wherever you need to choose between file types, such as in the “files of type” list in open and save windows or in the Options|File Type command. You cannot change the description of the first few file types, because these file types have a special meaning in EditPad. You also cannot delete them or change their position in the list. You can, however, adjust all their settings to your own tastes and habits, just as you can for any other file type.

The file masks you assign to a file type are particularly important. When you open a file, EditPad compares the file’s name against the file masks for each file type. If it finds a file type with a mask that matches the file name, that file type’s settings will be used for the file being opened. If more than one file type’s mask matches, the bottommost matching file type is used. Therefore, file types with more specific masks should be placed below file types with more general masks. The file type named “unspecified file type” has a mask of “*.*” that matches all file names. It therefore needs to be at the top of the list, so that it is the last file type EditPad considers and is only used when no file type below it matches.
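The precedence rule can be illustrated with a minimal Python sketch. This is not EditPad's own code. The file type list, the function name pick_file_type and the use of Python's fnmatch module are all assumptions made for illustration. The sketch also assumes case-insensitive matching, and it writes the catch-all mask as "*" because fnmatch, unlike the “*.*” mask described above, would not match names without a dot.

```python
from fnmatch import fnmatchcase

# Hypothetical file type list, in the top-to-bottom order of the configuration.
FILE_TYPES = [
    ("unspecified file type", "*"),   # stands in for "*.*", which matches every name
    ("Text document", "*.txt"),       # general mask
    ("Readme file", "readme*.txt"),   # more specific mask, so it is placed lower
]

def pick_file_type(file_name):
    """Return the description of the bottommost file type whose mask matches."""
    chosen = None
    for description, mask in FILE_TYPES:
        if fnmatchcase(file_name.lower(), mask.lower()):
            chosen = description  # a match further down overrides one further up
    return chosen
```

With this list, readme_first.txt matches all three masks and resolves to “Readme file”, while notes.txt resolves to “Text document”. Swapping the last two entries would make “Text document” win for both names, which is why the more specific mask belongs lower in the list.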

In a file mask, the asterisk (*) represents any number (including none) of any character. The question mark (?) represents a single character. On the Windows platform, the type of a file is usually determined by its name’s extension. The extension consists of a dot followed by (usually) three letters. E.g. text files have a .txt extension. The file mask *.txt will match all files with a .txt extension. You can delimit multiple file masks with semicolons. The file mask *.c;*.h matches all C source and header files (with a .c and .h extension respectively). If you specify a semicolon-delimited list of file masks, a file’s name needs to match only one of them for the file type to be applied to that file.
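As a rough sketch of how a semicolon-delimited mask list behaves, the hypothetical Python helper below splits the list and accepts a name as soon as one mask matches. The function name matches_masks is invented for this example. It assumes that Python's fnmatch wildcards (* and ?) behave like the ones described above and that matching is case-insensitive.

```python
from fnmatch import fnmatchcase

def matches_masks(file_name, masks):
    """Return True if file_name matches at least one mask in a ';'-delimited list."""
    name = file_name.lower()
    for mask in masks.split(";"):
        mask = mask.strip().lower()
        if mask and fnmatchcase(name, mask):
            return True  # one matching mask is enough
    return False
```

For example, matches_masks("main.c", "*.c;*.h") and matches_masks("main.h", "*.c;*.h") are both true, while matches_masks("main.cpp", "*.c;*.h") is false because neither mask allows extra characters after the .c.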

File masks also support a simple character class notation, which matches one character from a list or a range of characters. E.g. a file mask such as www.200509[0123][0-9].log or www.200509??.log could be used to match all web logs from September 2005.
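The difference between the two masks can be seen in a small Python sketch. It assumes that the bracket notation of Python's fnmatch module is close enough to EditPad's character classes for these simple cases. The sample file names are made up.

```python
from fnmatch import fnmatchcase

CLASS_MASK = "www.200509[0123][0-9].log"   # two digits, the first limited to 0-3
WILDCARD_MASK = "www.200509??.log"         # any two characters

# Made-up file names for illustration.
names = [
    "www.20050901.log",
    "www.20050930.log",
    "www.200509xx.log",   # not a date, but still two characters
    "www.20051001.log",   # October, so neither mask should match
]

class_hits = [n for n in names if fnmatchcase(n, CLASS_MASK)]
wildcard_hits = [n for n in names if fnmatchcase(n, WILDCARD_MASK)]
```

Both masks pick up the two September logs. The question mark version also accepts www.200509xx.log, so the character class mask is the stricter of the two.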

Sometimes, a file’s type cannot be easily derived from its name. While file name extensions are common on the Windows platform, they’re not on other platforms like UNIX. For such file types, you can specify a magic value regular expression. A magic value is simply some text or data at the start of a file that reveals the file’s type. A regular expression is a pattern for matching text.

When EditPad has finished comparing the file’s name against the file masks of all file types, and the only matching file type is “unspecified file type”, EditPad will try to match the magic value regular expression of each file type at the start of the file. The regex is only attempted at the very start of the file, as if the regular expression started with the anchor \A. Again, should more than one file type have a matching regular expression, the bottommost file type will be used.
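The two-stage detection described above might be sketched as follows. This is only an illustration under stated assumptions, not EditPad's implementation. The file type list, the function name detect and the masks for the “HTML” file type are hypothetical, and Python's re.match stands in for the implied \A anchor because it only matches at the start of the text.

```python
import re
from fnmatch import fnmatchcase

UNSPECIFIED = "unspecified file type"

# (description, masks, magic regex or None), top to bottom. Hypothetical list.
FILE_TYPES = [
    (UNSPECIFIED, "*", None),  # "*" stands in for "*.*"
    ("HTML", "*.htm;*.html", r"(?i)\s*<(!DOCTYPE\s+)?HTML"),
    ("Perl script", "*.pl", r"#![-_.\\/a-zA-Z0-9]*(?:/env )?perl"),
]

def detect(file_name, head):
    """Pick a file type by name first, then by magic value at the start of head."""
    name = file_name.lower()
    by_name = [
        desc for desc, masks, _ in FILE_TYPES
        if any(fnmatchcase(name, m) for m in masks.split(";"))
    ]
    if by_name != [UNSPECIFIED]:
        return by_name[-1] if by_name else None  # bottommost mask match wins
    by_magic = [
        desc for desc, _, magic in FILE_TYPES
        if magic and re.match(magic, head)       # re.match anchors at the start
    ]
    return by_magic[-1] if by_magic else UNSPECIFIED
```

Here detect("script.pl", "print 1;") is decided by the file mask alone, while detect("myscript", "#!/usr/bin/perl\n") falls through to the magic value stage because only the catch-all mask matched the name.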

Out of the box, EditPad Pro ships with several file types with magic value regular expressions. The “HTML” file type uses (?i)\s*<(!DOCTYPE\s+)?HTML to match a <!DOCTYPE HTML or <HTML tag at the start of the file, case insensitively.

The “Perl script” file type has #![-_.\\/a-zA-Z0-9]*(?:/env )?perl which matches the “shebang” at the start of a Perl script. On the UNIX platform, Perl scripts usually don’t have an extension, but do have the shebang. On the Windows platform, the shebang is typically missing, but Perl scripts are given a .pl extension. EditPad will recognize the file either way, first trying the file masks to look for the .pl extension, and then trying the regular expression to match the shebang.

The “XML” file type uses <\?xml |\s*<[a-zA-Z0-9_:]+\s++[^<>]*?xmlns\s*= to match the <?xml declaration or a root tag with the xmlns attribute. XML files often have an extension that indicates the application that saved the file, rather than the fact that it’s in XML format. E.g. PowerGREP uses the .pgf, .pga, .pgr and .pgl extensions for PowerGREP file selections, actions, results and libraries, rather than the generic .xml extension.
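To get a feel for what these three expressions accept, they can be tried against sample text in Python. This is a sketch under assumptions: Python's re module is not the regex engine EditPad uses, the helper name starts_with is invented, and the possessive quantifier \s++ in the XML expression is replaced with plain \s+ because older Python versions do not support possessive quantifiers. That substitution should accept the same inputs, but it is a change from the original.

```python
import re

HTML_MAGIC = r"(?i)\s*<(!DOCTYPE\s+)?HTML"
PERL_MAGIC = r"#![-_.\\/a-zA-Z0-9]*(?:/env )?perl"
# Assumption: \s++ from the original is written as \s+ for Python compatibility.
XML_MAGIC = r"<\?xml |\s*<[a-zA-Z0-9_:]+\s+[^<>]*?xmlns\s*="

def starts_with(pattern, text):
    """Return True if pattern matches at the very start of text."""
    return re.match(pattern, text) is not None
```

With these definitions, starts_with(HTML_MAGIC, "\n<!doctype html>") is true because leading whitespace and lowercase are allowed, starts_with(PERL_MAGIC, "#!/usr/bin/env perl -w") is true through the optional /env part, and starts_with(XML_MAGIC, "<root><child/></root>") is false because the root tag has no xmlns attribute.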

Check “default for new files” to use the file type for new files created by clicking File|New directly or by pressing its keyboard shortcut Ctrl+N.

Check “show in file type selection lists” if you want the file type to appear in lists where you can select a file type. That includes the file type drop-down lists in all file dialogs such as those used by File|Open and File|Save As. It also includes the file type submenus of File|New and Options|File Type. You should check this option for the file types that you usually work with, but not for others. This way the file type selection lists remain uncluttered. The file types that you hide remain fully functional. Their file masks and magic regexes are still used when detecting the file types of files you open. You can still save files of this type by manually entering their extension in the Save As dialog.