GNU Emacs is a free, portable, extensible text editor. That it is free means specifically that the source code is freely copyable and redistributable. That it is portable means that it runs on many machines under many different operating systems, so that you can probably count on being able to use the same editor no matter what machine you're using. That it is extensible means that you can not only customize all aspects of its usage (from key bindings through fonts, colors, windows, mousage and menus), but you can program Emacs to do entirely new things that its designers never thought of.
Because of all this, Emacs is an extremely successful program, and does more for you than any other editor. It's particularly good for programmers. If you use a common programming language, Emacs probably provides a mode that makes it especially easy to edit code in that language, providing context-sensitive indentation and layout. It also probably allows you to compile your programs inside Emacs, with links from error messages to source code; debug your programs inside Emacs, with links to the source; interact directly with the language interpreter (where appropriate); manage change logs; jump directly to a location in the source by symbol (function or variable name); and interact with your revision control system.
Emacs also provides mail readers, news readers, World Wide Web, gopher, and FTP clients, spell checking, and a Rogerian therapist, all of which are also useful for programming. But in this document we'll concentrate on the basics of Emacs usage for programmers.
Actually, this is about the only thing I can think of that GNU Emacs is not.
Emacs is actually the name of a family of text editors that are either descended from or inspired by one another. The original Emacs was written in the programming language of the text editor TECO, and ran on DEC PDP-10s and PDP-11s. Early on it inspired other Emacsen for Multics and Lisp Machines.
The first Emacs for Unix machines was Gosling Emacs, which later went commercial as Unipress Emacs. GNU Emacs was written by Richard Stallman, the main author of the original TECO Emacs.
For an editor to be called "emacs" the main requirement is that it be fully extensible with a real programming language, not just a macro language. For GNU Emacs, this language is Lisp. Other Emacsen have used TECO, Scheme, a dialect of Trac called Mint, interpreted C-like languages, etc.
GNU Emacs itself runs on a large number of Unix machines, and under VMS, DOS/Windows, and OS/2, among others. GNU Emacs is currently at version 19.29; at Library Systems we have this version installed. On the AIT machines (quads, ellis, kimbark and woodlawn) the default emacs is version 18.57: a major revision behind. Version 18 is probably 92% compatible with version 19; 19 mostly adds new functionality. But some important commands were changed slightly. (However, an unsupported version of Emacs v19 is available as /usr/unsupported/gnu/bin/emacs19.)
For example, the typical PC keyboard has keys labelled PAGE UP and HOME, arrow keys, function keys, etc. Not only do these keys have no official ASCII values, but they generate no ASCII characters at all. The PC can distinguish between them by means of scan codes, but what happens when you're connected to a Unix machine via a telecommunications program and you type one of these keys?
Any of the following things can happen, depending on the telecommunications program:
The characters generated by different keys can vary even within the same telecommunications program depending on what terminal emulation is chosen. In fact, the telecommunications program can even change the characters sent by keys that do have official ASCII characters associated with them (control-@ is a common problem).
Finally, the operating system and even terminal concentrators can muddy things further. Emacs (by default only) expects that the full ASCII character set is available. But the OS or some mux may be usurping some of your characters. The best examples are control-S and control-Q, which are sometimes used for flow control. These are important Emacs commands and you should make sure that neither your telecommunications program nor your terminal concentrator is grabbing them (Emacs can handle the OS). Other examples are nulls, sometimes swallowed up whole, and characters with the high-order bit set, sometimes stripped, or -- worse -- used for parity. Any internationalized system (like modern Unixes) works best with 8-bit no-parity data channels anyway.
There's nothing Emacs can do about any of the above: it only sees 8-bit ASCII characters given to it by the operating system. But if you understand your telecommunications program, you won't have any problems with Emacs and keystrokes.
Note: Emacs actually can use special keys like the arrow keys under certain circumstances (like under X or when running natively under DOS, where Emacs understands keyboard events). But I recommend not using these keys even if they work, so that you can use Emacs from any terminal.
Every command has a long name, which you can look up in the documentation, like kill-line, delete-backward-char, or self-insert-command. These commands are bound to keystrokes for convenient editing. We call such a pairing of keystroke and command a key binding, or binding for short.
The set of all bindings makes up the Emacs command set. However, Emacs is an extensible, customizable editor. This means that:
In this document I describe the standard Emacs key bindings.
The printing characters are bound to self-insert-command so that they insert themselves as text when typed. For editing commands, Emacs uses all the control characters: C-a, C-b, etc. But this is only 32 more characters, and Emacs has more than 32 editing commands.
The 128 characters in the upper half of ASCII are not taken yet, but how do you type them? Emacs uses a Meta key, which works exactly like a Control key or a Shift key in that it generates no character by itself, but rather modifies another character on the keyboard. The Meta key actually generates the same character that the key it's used with generates, but with the high-order bit set. This gives us access to characters such as M-a, M-b, etc. (There's also M-A, which is a distinct character, but to minimize confusion the uppercase metacharacters are equated to the corresponding lowercase metacharacter.)
What about the control characters with the high-order bit set? These are valid metacharacters as well; they are notated C-M-a, etc. To key them you hold down Control and Meta simultaneously and strike the desired key. Because both Control and Meta are shift keys, C-M-a is the same key (and the same ASCII character) as M-C-a. But for consistency we always write the former.
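The character arithmetic in the last few paragraphs can be sketched in a few lines of Python. This is only an illustration of the bit patterns described above, not anything Emacs itself runs; the helper names control and meta are my own.

```python
# Sketch of the character codes described above: 7-bit ASCII plus a Meta bit.
META_BIT = 0x80  # the high-order bit of an 8-bit character


def control(ch: str) -> int:
    """Code of the control character for ch (e.g. 'a' -> 1, 'g' -> 7)."""
    return ord(ch.upper()) & 0x1F


def meta(code: int) -> int:
    """The same character code with the high-order bit set."""
    return code | META_BIT


print(hex(ord("a")))            # plain a
print(hex(control("a")))        # C-a
print(hex(meta(ord("a"))))      # M-a
print(hex(meta(control("a"))))  # C-M-a, the same code whichever modifier you think of first
```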
The standard prefix commands are:
These prefixes give us another 768 keystrokes, for a total of 928. But Emacs has far more than 928 commands! To handle this, you can bind one of the subcommands of a prefix command to another prefix command, like C-x 4 for example, or C-x v, each such binding yielding another 256 keystrokes. A number of these two-character prefixes exist, but they're rather specialized, and don't contain a full set of 256 commands (usually there are only three or four, and the prefix is just used for a mnemonic grouping). There are even three-character prefixes, but most people won't admit to using them.
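To see where the totals come from, here is the arithmetic as a tiny Python sketch. The breakdown is my reading of the numbers in the text (32 control characters, 128 metacharacters, 256 keystrokes per prefix); the count of three prefixes is inferred from 768 / 256.

```python
# Keystroke counts as given in the text; the breakdown is an assumption.
CONTROL_CHARS = 32   # C-a, C-b, etc.
META_CHARS = 128     # the upper half of the 8-bit character set
PER_PREFIX = 256     # each prefix is followed by any one 8-bit character

prefixes = 768 // PER_PREFIX  # inferred from the text's figure of 768
total = CONTROL_CHARS + META_CHARS + prefixes * PER_PREFIX

print(prefixes, total)
```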
But now we're entering a sort of rarefied atmosphere: even an Emacs geek like myself doesn't really use all these key bindings. Some Emacs commands are used very rarely, and, when you need one, it's easier to look up the long name of the command (using Info, Emacs' online searchable help system) and type it directly.
There's one Emacs command that can be used to execute any other command by typing its long name: M-x. When you type M-x Emacs prompts you for the name of any command, and then executes it.
Not all keyboards provide a Meta key that sets the high-order bit. On a PC running Emacs natively, the ALT key is used for Meta. But when using a PC to talk to a Unix box via some telecommunications program -- well, you guessed it -- the ALT key may not work for this.
But if you have no Meta key, all is not lost. You just use the ESC prefix. M-a becomes ESC a; C-M-f becomes ESC C-f (remember the equivalence of C-M-f and M-C-f and this will make sense).
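If it helps, the rewriting rule can be written down mechanically. The function below is a hypothetical helper of my own (esc_equivalent is not an Emacs function) that turns the notation used in this document into its ESC form; it doesn't handle keys whose base character is itself a hyphen.

```python
def esc_equivalent(key: str) -> str:
    """Rewrite a key such as 'M-a' or 'C-M-f' as an ESC-prefixed sequence."""
    parts = key.split("-")
    modifiers, base = parts[:-1], parts[-1]
    if "M" not in modifiers:
        return key  # no Meta involved, so there is nothing to rewrite
    remaining = [m for m in modifiers if m != "M"]
    return "ESC " + "-".join(remaining + [base])


print(esc_equivalent("M-a"))
print(esc_equivalent("C-M-f"))
```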
There's only one trick: ESC is not a shift key. It's actually an ASCII character, not a key modifier. This means that you don't try to hold down ESC at the same time as the other key: use it as a prefix character and type it separately and distinctly. If you lean on it it's likely to autorepeat (like any other key) and you'll get very confused.
A true Meta is a wonderful thing for Emacs (it makes typing much faster), but I used ESC for years with no trouble.
How does anyone remember all these commands? Simple: you don't. Every Emacs user knows a different set of commands. I've used Emacs for 15 years (starting with the original TECO Emacs), and I learn useful new Emacs commands all the time. Often I notice another Emacs user doing something and I have no idea how they've done it, so I ask and learn some Emacs command that I just never came across, or never developed as a habit.
Some Emacs users just learn the basics and are completely happy. Most users learn the basics and then some advanced commands that suit their needs. Some users are constantly learning new commands to speed their editing. A few users progress to writing their own totally new Emacs commands.
find-file. This is the main command used to read a file into a buffer for editing. It's actually rather subtle. When you execute this command, it prompts you for the name of the file (with completion). Then it checks to see if you're already editing that file in some buffer; if you are, it simply switches to that buffer and doesn't actually read in the file from disk again. If you're not, a new buffer is created, named for the file, and initialized with a copy of the file. In either case the current window is switched to view this buffer.
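The decision find-file makes can be modelled roughly as follows. This is a sketch of the behaviour described above, not Emacs's implementation: the buffers dictionary and the current_window argument are stand-ins I've invented, and unlike Emacs this version simply fails if the file doesn't exist.

```python
import os

buffers = {}  # hypothetical model: absolute file path -> buffer contents


def find_file(path, current_window):
    """Visit path, reusing an existing buffer if the file is already being edited."""
    key = os.path.abspath(path)
    if key not in buffers:
        # Not visited yet: create a buffer initialized with a copy of the file.
        with open(key, encoding="utf-8") as fh:
            buffers[key] = fh.read()
    # In either case the current window is switched to view this buffer.
    current_window["buffer"] = key
    return buffers[key]
```

Note that a second find_file on the same path returns the buffer's contents, not whatever is on disk at that moment, which is exactly the subtlety mentioned above.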
save-buffer. This is the main command used to save a file, or, more accurately, to write a copy of the current buffer out to the disk, overwriting the buffer's file, and handling backup versions.
save-some-buffers. Allows you to save all your buffers that are visiting files, querying you for each one and offering several options for each (save it, don't save it, peek at it first then maybe save it, etc).
switch-to-buffer. Prompts for a buffer name and switches the buffer of the current window to that buffer. Doesn't change your window configuration. This command will also create a new empty buffer if you type a new name; this new buffer will not be visiting any file, no matter what you name it.
list-buffers. Pops up a new window which lists all your buffers, giving for each the name, modified or not, size in bytes, major mode and the file the buffer is visiting.
kill-buffer. Prompts for a buffer name and removes the entire data structure for that buffer from Emacs. If the buffer is modified you'll be given an opportunity to save it. Note that this in no way removes or deletes the associated file, if any.
vc-toggle-read-only. Make a buffer read-only (so that attempts to modify it are treated as errors), or make it read-write if it was read-only. Also, if the file is under version control, it will check the file out for you.
scroll-up. The basic command to scroll forward (towards the end of the file) one screenful. By default Emacs leaves you two lines of context from the previous screen.
scroll-down. Just like C-v, but scrolls backwards.
other-window. Switch to another window, making it the active window. Repeated invocation of this command moves through all the windows, left to right and top to bottom, and then circles around again. Under a windowing system, you can use the left mouse button to switch windows.
delete-other-windows. Deletes all other windows except the current one, making one window on the screen. Note that this in no way deletes the buffers or files associated with the deleted windows.
delete-window. Deletes just the current window, resizing the others appropriately.
split-window-vertically. Splits the current window in two, vertically. This creates a new window, but not a new buffer: the same buffer will now be viewed in the two windows. This allows you to view two different parts of the same buffer simultaneously.
split-window-horizontally. Splits the current window in two, horizontally. This creates a new window, but not a new buffer: the same buffer will now be viewed in the two windows. This allows you to view two different parts of the same buffer simultaneously.
scroll-other-window. Just like C-v, but scrolls the other window. If you have more than two windows, the other window is the window that C-x o would switch to.
If you start Emacs by typing just:
emacs
then when it comes up, you won't be editing any file. You can then use the file commands to read in files for editing. Alternatively, you can fire up Emacs with an initial file (or files) by saying:
emacs foo.tcl
To exit Emacs, use the command C-x C-c (which is bound to save-buffers-kill-emacs). It will offer to save all your buffers and then exit.
You can also suspend Emacs (in the Unix sense of stopping it and putting it in the background) with C-x C-z (which is bound to suspend-emacs). How you restart it is up to your shell, but is probably based on the fg command.
self-insert-command
.
The minibuffer is also known as the echo area, because Emacs echoes keystrokes here if you're typing really slowly. To see this, type any multi-character keystroke (like ESC q) with a long pause between the keystrokes.
It may seem annoying to have to hit return at the end of long lines, but this is actually just the default for certain modes. The reason for this is that Emacs is a programmer's editor, and any editor that will insert line breaks without your telling it to isn't safe for editing code or data. In modes oriented towards text, Emacs does insert line breaks for you automatically.
When this happens, you just need to type C-g (which is bound to keyboard-quit). This is the ASCII BEL character, and so C-g is sort of mnemonic for ringing the bell, which is what it does. But it also does something very important: it interrupts what Emacs is doing. This will get you out of any questions that Emacs may be asking you, and it will abort a partially typed key sequence (say if you typed C-x by mistake).
Because Emacs is fully recursive, you may occasionally need to type C-g more than once, to back out of a recursive sequence of commands. Also, if Emacs is really wedged (say, in a network connection to some machine which is down), typing three C-g's quickly is guaranteed to abort whatever's wedging you.
command-apropos. Prompts for a keyword and then lists all the commands with that keyword in their long name.
describe-key. Prompts for a keystroke and describes the command bound to that key, if any.
info. Enters the Info hypertext documentation reader.
describe-mode. Describes the current major mode and its particular key bindings.
finder-by-keyword. Runs an interactive subject-oriented browser of Emacs packages.
help-with-tutorial. Run the Emacs tutorial. This is very helpful for beginners.
undo, invoked with C-_ (control underbar). C-_ is a valid ASCII character, but some keyboards don't generate it, so you can also use C-x u -- but it's more awkward to type, since it's a two-character command.
The undo command allows you to undo your editing, back in time. It's handy when you accidentally convert all of a huge file to uppercase, say, or delete a huge amount of text. One keystroke changes everything back to normal.
We say Emacs has infinite undo because, unlike some editors, you can undo a long chain of commands, not just one previous one, even undoing through saves. We say Emacs has redo because you can reverse direction while undoing, thereby undoing the undo.
Once you get used to this feature you'll laugh at any editor that doesn't have it (unless you're forced to use it...). It's very important to get comfortable with undo as soon as possible; I recommend reading the undo section of the manual carefully and practising.
There's one kind of argument that's so commonly accepted that there's a special way to provide it: numeric arguments. Many commands will interpret a numeric argument as a request to repeat that many times. For example, the delete-char command (bound to C-d), which normally deletes one character to the right of the cursor, will delete N characters if given a numeric argument of N. It works with self-inserting commands too: try giving a numeric argument to a printing character, like a hyphen.
To give a command a numeric argument of, say, 12, type C-u 12 before typing the command. If you type slowly, you'll see:
C-u 1 2-
in the echo area. Then type C-d and you'll have given delete-char an argument of 12. You can type any number of digits after C-u. A leading hyphen makes a negative argument; a lone hyphen is the same as an argument of -1. If you begin typing a numeric argument and change your mind, you can of course type C-g to abort it.
Since one often isn't interested in precisely how many times a command is repeated, there's a shorthand way to get numeric arguments of varying magnitudes. C-u by itself, without any subsequent digits, is equal to a numeric argument of 4. Another C-u multiplies that by 4 more, giving a numeric argument of 16. Another C-u multiplies that by 4 more, giving a numeric argument of 64, etc. For this reason C-u is called the universal-argument.
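The multiply-by-four rule is just a power of 4. A minimal sketch, with a function name of my own choosing (this only models bare C-u presses with no digits typed):

```python
def universal_argument(presses: int) -> int:
    """Numeric argument produced by typing C-u `presses` times with no digits."""
    if presses < 1:
        raise ValueError("C-u must be typed at least once")
    return 4 ** presses


for n in (1, 2, 3):
    print("C-u " * n + "->", universal_argument(n))
```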
Note that commands aren't required to interpret numeric arguments as specifying repetitions. It depends on what's appropriate: some commands ignore numeric arguments, some interpret them as Boolean (the presence of numeric argument -- any numeric argument -- as opposed to its absence), etc. Read the documentation for a command before trying it.
quoted-insert, which is bound to C-q. C-q acts like a prefix command, in that when you type it it waits for you to type another character. But this next character is then inserted into the buffer, rather than being executed as a command. So C-q ESC inserts an Escape.
C-q can also be used to insert characters by typing C-q followed by their ASCII code as three octal digits.
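If you're not used to reading octal, this small Python sketch shows which character a given three-digit code stands for. The function is purely illustrative and of my own naming; it just does the base-8 conversion.

```python
def quoted_insert_octal(digits: str) -> str:
    """Character that three octal digits stand for, as typed after C-q."""
    if len(digits) != 3 or any(d not in "01234567" for d in digits):
        raise ValueError("expected exactly three octal digits")
    return chr(int(digits, 8))


print(repr(quoted_insert_octal("033")))  # the Escape character
print(repr(quoted_insert_octal("101")))  # an uppercase A
```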
All these motion commands take numeric arguments as repetitions.
The most basic textual object is the character. Emacs understands many other objects, sometimes depending on what mode you're in (a C function textual object probably doesn't make much sense if you're not editing C source code).
The exact definition of what makes up a given textual object is often customizable, but more importantly varies slightly from mode to mode. The characters that make up a word in Text Mode may not be exactly the same as those that make up a word in C Mode for example. (E.g., underbars are considered word constituents in C Mode, because they are legal in identifier names, but they aren't considered word constituents in Text Mode.) This is extremely useful, because it means that you can use the same motion commands and yet have them automatically customized for different types of text.
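Here is a rough sketch of how one motion command can behave differently per mode. The two character sets are drastic simplifications I've made up to mirror the underbar example above; they are not Emacs's real syntax tables, and forward_word here is a Python stand-in using 0-based positions.

```python
import string

# Simplified, assumed word constituents per mode (real syntax tables are richer).
WORD_CHARS = {
    "text": set(string.ascii_letters + string.digits),
    "c": set(string.ascii_letters + string.digits + "_"),
}


def forward_word(text: str, point: int, mode: str) -> int:
    """Return the position just past the next word, as defined by the mode."""
    constituents = WORD_CHARS[mode]
    n = len(text)
    while point < n and text[point] not in constituents:
        point += 1  # skip anything that isn't part of a word
    while point < n and text[point] in constituents:
        point += 1  # then skip over the word itself
    return point


line = "foo_bar baz"
print(forward_word(line, 0, "text"))  # stops after "foo"
print(forward_word(line, 0, "c"))     # stops after "foo_bar"
```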
forward-char. Moves forward (to the right) over a character.
backward-char. Moves backward (to the left) over a character.
forward-word. Moves forward over a word.
backward-word. Moves backward over a word.
next-line. Moves down to the next line.
previous-line. Moves up to the previous line.
beginning-of-line. Moves to the beginning of the current line.
end-of-line. Moves to the end of the current line.
backward-sentence. Moves to the beginning of the current sentence.
forward-sentence. Moves to the end of the current sentence.
backward-paragraph. Moves to the beginning of the current paragraph.
forward-paragraph. Moves to the end of the current paragraph.
backward-page. Moves to the beginning of the current page.
forward-page. Moves to the end of the current page.
beginning-of-buffer. Moves to the beginning of the buffer.
end-of-buffer. Moves to the end of the buffer.
But sexps are more than just balanced parens: they're defined recursively. A word that doesn't contain any parens also counts as a sexp. In most programming language modes, quoted strings are sexps (using either single or double quotes, depending on the syntax of the language). The sexp commands move in terms of all these units.
These commands may seem confusing at first, but for editing most programming languages they're fantastic. Not only do they move you around quickly and accurately, but they help spot syntax errors while you're editing, because they'll generate an error if your parens or quotes are unbalanced.
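To make the recursive definition concrete, here is a toy forward_sexp in Python. It is a simplified model of the behaviour described above, not Emacs's parser: it knows only parentheses, double-quoted strings and words, uses 0-based positions, and doesn't notice parens hidden inside strings that are nested in a list.

```python
def forward_sexp(text: str, point: int) -> int:
    """Return the position just past the sexp that follows point."""
    n = len(text)
    while point < n and text[point].isspace():
        point += 1
    if point >= n:
        raise ValueError("no sexp after point")
    ch = text[point]
    if ch == ")":
        raise ValueError("at the end of the enclosing list; move up instead")
    if ch == '"':  # a quoted string is one sexp
        end = text.find('"', point + 1)
        if end == -1:
            raise ValueError("unbalanced quotes")
        return end + 1
    if ch == "(":  # a balanced parenthesized group is one sexp
        depth = 0
        for i in range(point, n):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    return i + 1
        raise ValueError("unbalanced parentheses")
    # Otherwise a plain word counts as a sexp.
    while point < n and not text[point].isspace() and text[point] not in '()"':
        point += 1
    return point
```

The errors raised for unbalanced input correspond to the beep you get from the real commands, which is what makes them handy for spotting syntax errors.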
backward-sexp. Moves backward over the next sexp. If your cursor is just to the right of a left paren, C-M-b will beep, because there's no sexp to the left to move over: you have to move up.
forward-sexp. Moves forward over the next sexp. Same deal if your cursor is just to the left of a right paren.
backward-up-list. Move backward up one level of parens. In other words, move to the left paren of the parens containing the cursor, skipping balanced sexps.
down-list. Move down one level of parens. In other words, move to the right of the next left paren, skipping balanced sexps. E.g., if your cursor is sitting on the return type of a C function declaration, C-M-d moves to the inside of the formal parameter list.
beginning-of-defun. Move to the beginning of the current defun.
end-of-defun. Move to the end of the current defun.
Killed text is saved on the kill ring. The kill ring holds the last N kills, where N is 30 by default, but you can change it to anything you like by changing the value of the variable kill-ring-max. The kill ring acts like a FIFO when you're killing things (after the 30th kill, kill number one is gone), but like a ring when you're yanking things back (you can yank around the ring circularly). kill-ring-max doesn't apply to the amount of text (in bytes) that can be saved in the kill ring (there's no limit), only to the number of distinct kills.
delete-char. Deletes the character to the right of (under, if the cursor is a block that covers a character) the cursor.
delete-backward-char. Deletes the character to the left of the cursor.
kill-word. Kills to the end of the word to the right of the cursor (forward).
backward-kill-word. Kills to the beginning of the word to the left of the cursor (backward).
kill-line. Kills to the end of the current line, not including the newline. Thus, if you're at the beginning of a line it takes two C-k's to kill the whole line and close up the whitespace.
kill-line (given a numeric argument of 0). Kills to the beginning of the current line, not including the newline.
kill-sentence. Kills to the end of the current sentence, including any newlines within the sentence.
kill-sentence (given a negative numeric argument). Kills to the beginning of the current sentence, including any newlines within the sentence.
kill-paragraph and backward-kill-paragraph exist, but are not bound to any keys by default.
kill-buffer doesn't kill all the text in the buffer, but rather the entire buffer data structure; to kill all the text in a buffer, see The Mark and the Region.
kill-sexp. Kills the sexp after the cursor.
kill-sexp (given a negative numeric argument). Kills the sexp before the cursor.
backward-kill-sexp exists, but is not bound to any key by default.
Killed text is yanked back with C-y (yank). Since Emacs has only one kill ring (as opposed to one per buffer), you can kill in one buffer, switch to another and yank the text there.
To get back previous kills, you move around the kill ring. Start with C-y to get the most recent kill, and then use M-y to move to the previous spot in the kill ring by replacing the just-yanked text with the previous kill. Subsequent M-y's move around the ring, each time replacing the yanked text. When you reach the text you're interested in, just stop. Any other command (a motion command, self-insert, anything) breaks the cycling of the kill ring, and the next C-y yanks the most recent kill again.
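The kill ring is easy to model. The class below is a sketch of my own that follows the description in this document (a bounded list of kills, C-y taking the newest, M-y stepping backwards and wrapping); the class and method names are invented, and it makes no attempt to handle an empty ring.

```python
from collections import deque


class KillRing:
    """Toy model of the kill ring as described in the text."""

    def __init__(self, max_kills: int = 30):
        # With maxlen set, adding on the left drops the oldest kill on the right.
        self.kills = deque(maxlen=max_kills)
        self.index = 0  # 0 is the most recent kill

    def kill(self, text: str) -> None:
        self.kills.appendleft(text)
        self.index = 0

    def yank(self) -> str:
        """C-y: return the most recent kill."""
        self.index = 0
        return self.kills[0]

    def yank_pop(self) -> str:
        """M-y: step to the previous kill, wrapping around the ring."""
        self.index = (self.index + 1) % len(self.kills)
        return self.kills[self.index]
```

With a ring of size 3, a fourth kill pushes the first one out, and repeated yank_pop calls eventually come back around to the newest kill.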
isearch-forward, bound to C-s, does: it searches incrementally, one character at a time, as you type the search string. This means that Emacs can often find what you're looking for before you have to type the whole thing. To stop searching, you can either hit RET or type any other Emacs command (which will both stop the search and execute the command). You can search for the next match at any point by typing another C-s; you can reverse the search by typing C-r; and you can use DEL to delete and change what you're searching for.
isearch-backward, bound to C-r, works the same way, but searches backward. (Use C-r to search for the next match and C-s to reverse the search.)
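To get a feel for what "incremental" means, here is a small simulation. It is a sketch under my own simplifications (forward search only, no C-s repeats, no DEL, 0-based positions), and the function name is borrowed for illustration rather than being the Emacs command itself.

```python
def isearch_forward(buffer: str, point: int, typed: str) -> list:
    """Where the match starts after each typed character (None if it fails)."""
    positions = []
    query = ""
    for ch in typed:
        query += ch  # the search string grows one character at a time
        found = buffer.find(query, point)
        positions.append(found if found != -1 else None)
    return positions


print(isearch_forward("the then theory", 0, "theo"))
```

The match sits on the first "the" until the fourth character is typed, at which point it jumps ahead to "theory".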
Occasionally you may want to search non-incrementally (though I rarely do). You can do this by typing C-s RET text RET, where text is the text to search for.
Much more useful is word search, which lets you search for a sequence of one or more words, regardless of how they're separated (e.g., by any number and combination of newlines and whitespace). To invoke word search, type C-s RET C-w word word word RET.
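The idea behind word search can be approximated with a regular expression. This is only an illustration using Python's re module, with a function name of my own; it treats any run of whitespace between words as equivalent, which is the behaviour described above, and is not how Emacs implements it.

```python
import re


def word_search(buffer: str, words: list, start: int = 0):
    """Find the words in sequence, separated by any run of whitespace."""
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
    match = re.compile(pattern).search(buffer, start)
    return match.span() if match else None


text = "one  two\n\tthree four"
span = word_search(text, ["two", "three"])
print(span, repr(text[span[0]:span[1]]))
```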
Emacs can also search incrementally (or not) by regular expressions; this is an extremely powerful feature, but too complex to describe here.
query-replace (bound to M-%). This command prompts you for the text to replace, and the text to replace it with, and then searches and replaces within the current buffer. query-replace is interactive: at each match, you are prompted to decide what to do; you have the following options:
query-replace
without performing this
replacement.
query-replace
.
There are also more replacement commands you should look into, including replace-string (simple unconditional replacement), replace-regexp and query-replace-regexp (which use regular expressions), and tags-query-replace, which replaces all identifiers in a collection of source code files.
query-replace and the other replacement commands are, by default, smart about case. For example, if you're replacing foo with bar and find Foo, Emacs replaces it with Bar; if you find FOO, Emacs replaces it with BAR, etc.
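Here is one way the case matching could work, sketched in Python. It covers only the three cases named above (lowercase, capitalized, all uppercase); the function names are mine, and the real rules in Emacs have more conditions than this.

```python
import re


def match_case(found: str, replacement: str) -> str:
    """Give the replacement the same case pattern as the text that was found."""
    if found.isupper():
        return replacement.upper()
    if found[:1].isupper():
        return replacement[:1].upper() + replacement[1:]
    return replacement


def smart_replace(text: str, old: str, new: str) -> str:
    """Replace every match of old, ignoring case, preserving each match's case."""
    return re.sub(
        re.escape(old),
        lambda m: match_case(m.group(0), new),
        text,
        flags=re.IGNORECASE,
    )


print(smart_replace("foo Foo FOO", "foo", "bar"))
```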
The region is the text between point and mark. Point is actually the Emacs term for what we've been calling the cursor up to now. The mark, on the other hand, is set with a special command C-@ (set-mark-command). This sets the mark exactly where point is, but now you can move point elsewhere and you have: the region.
Each buffer has a distinct point and mark, and therefore a distinct region. (It's also possible for there to be no mark in a buffer, and therefore no region.)
The region is the same regardless of whether point comes first in the buffer or mark does; it makes no difference, just do what's convenient.
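In code, the fact that the order of point and mark doesn't matter comes down to a min and a max. A minimal sketch with invented helper names, using 0-based positions and None to stand for a buffer with no mark:

```python
def region(point: int, mark):
    """Bounds of the region, or None if the buffer has no mark."""
    if mark is None:
        return None
    return min(point, mark), max(point, mark)


def region_text(buffer: str, point: int, mark):
    """The text between point and mark, whichever comes first."""
    bounds = region(point, mark)
    if bounds is None:
        return None
    start, end = bounds
    return buffer[start:end]


print(region_text("hello world", 11, 6))
print(region_text("hello world", 6, 11))
```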
The region is normally invisible (but see C-x C-x). You'll get used to this. However, if you're running Emacs under a windowing system, you can make the region visible by executing M-x transient-mark-mode.
Many commands that move point a significant distance (like M-< and C-s, for example) leave the mark set at the spot they moved from. You'll see "Mark set" in the echo area when this happens.
When using Emacs under a windowing system like X, the mouse can be used to sweep out the region, but many Emacsers find it faster to keep their hands on the keyboard and use the familiar motion commands.
There are some special commands that are specifically designed to set the region around some interesting text.
mark-word. Sets the region around the next word, or from point to the end of the current word, if you're in the middle of one.
mark-paragraph. Sets the region around the current paragraph.
mark-sexp. Sets the region around the same sexp that C-M-f would move to.
mark-defun. Sets the region around the current defun.
mark-page. Sets the region around the current page.
mark-whole-buffer. Sets the region around the entire buffer.
So now you know how to define the region: what can you do with it?
exchange-point-and-mark. Swaps mark and point. Repeated rapid execution of this command makes it easy to see the extent of the region.
kill-region. Kills the region. It goes on the kill ring, of course.
kill-ring-save. Saves the region to the kill ring without removing it from the buffer. This is exactly equivalent to typing C-w C-y.
indent-rigidly. Rigidly indents the region by as many characters (columns) as you provide as a numeric argument (default is 1 column).
downcase-region. Convert the entire region to lowercase. This command is disabled by default.
upcase-region. Convert the entire region to uppercase. This command is disabled by default.
fill-region. Fills, i.e., justifies with a ragged right margin, all the paragraphs within the region.