The idea behind `spam.el' is to have a control center for spam detection and filtering in Gnus. To that end, `spam.el' does two things: it filters incoming mail, and it analyzes mail known to be spam or ham. Ham is the name used throughout `spam.el' to indicate non-spam messages.
So, what happens when you load `spam.el'?
First of all, you must set the variable spam-install-hooks to t and install the spam.el hooks:
     (setq spam-install-hooks t)
     (spam-install-hooks-function)
This is done for you automatically if you load spam.el after one of the spam-use-* variables (explained later) is set. So you should load spam.el after you set one of the spam-use-* variables:
     (setq spam-use-bogofilter t)
     (require 'spam)
You get the following keyboard commands:
gnus-summary-mark-as-spam
Mark current article as spam, showing it with the `$' mark. Whenever you see a spam article, make sure to mark its summary line with M-d before leaving the group. This is done automatically for unread articles in spam groups.
spam-bogofilter-score
You must have Bogofilter installed for that command to work properly. See section 8.18.5.7 Bogofilter.
Also, when you load `spam.el', you will be able to customize its variables. Try customize-group on the `spam' variable group.
The concepts of ham processors and spam processors are very important. Ham processors and spam processors for a group can be set with the spam-process group parameter, or the gnus-spam-process-newsgroups variable. Ham processors take mail known to be non-spam (ham) and process it in some way so that later similar mail will also be considered non-spam. Spam processors take mail known to be spam and process it so similar spam will be detected later.
Gnus learns from the spam you get. You have to collect your spam in one or more spam groups, and set or customize the variable spam-junk-mailgroups as appropriate. You can also declare groups to contain spam by setting their group parameter spam-contents to gnus-group-spam-classification-spam, or by customizing the corresponding variable gnus-spam-newsgroup-contents. The spam-contents group parameter and the gnus-spam-newsgroup-contents variable can also be used to declare groups as ham groups if you set their classification to gnus-group-spam-classification-ham. If groups are not classified by means of spam-junk-mailgroups, spam-contents, or gnus-spam-newsgroup-contents, they are considered unclassified. All groups are unclassified by default.
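The classification described above might be configured as in the following sketch. The group names are hypothetical, and the entry shape used for gnus-spam-newsgroup-contents (a group regexp followed by a classification symbol) is an assumption; customize-variable shows the exact form your Gnus version expects.

```elisp
;; Sketch only: "nnml:spam", "nnml:junk" and "nnml:inbox" are hypothetical names.

;; Groups listed here are treated as spam groups.
(setq spam-junk-mailgroups '("nnml:spam" "nnml:junk"))

;; Assumed shape: (GROUP-REGEXP CLASSIFICATION) per entry.
(setq gnus-spam-newsgroup-contents
      '(("^nnml:junk"  gnus-group-spam-classification-spam)
        ("^nnml:inbox" gnus-group-spam-classification-ham)))
```

Any group matched by neither setting stays unclassified.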
In spam groups, all messages are considered to be spam by default: they get the `$' mark (gnus-spam-mark) when you enter the group. If you have seen a message, had it marked as spam, then unmarked it, it won't be marked as spam when you enter the group thereafter. You can disable that behavior, so all unread messages will get the `$' mark, if you set the spam-mark-only-unseen-as-spam parameter to nil. You should remove the `$' mark when you are in the group summary buffer for every message that is not spam after all. To remove the `$' mark, you can use M-u to "unread" the article, or d for declaring it read the non-spam way. When you leave a group, all spam-marked (`$') articles are sent to a spam processor which will study them as spam samples.
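For example, to have every unread message in a spam group receive the `$' mark on entry, including messages you have already seen and unmarked, the setting could look like this (a one-line sketch for your Gnus init file):

```elisp
;; Mark all unread articles in spam groups, not only the unseen ones.
(setq spam-mark-only-unseen-as-spam nil)
```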
Messages may also be deleted in various other ways. Unless the ham-marks group parameter is overridden, the marks `R' and `r' (default read or explicit delete), `X' and `K' (automatic or explicit kills), and `Y' (low scores) are all considered to be associated with articles which are not spam. This assumption might be false, in particular if you use kill files or score files as a means of detecting genuine spam; in that case you should adjust the ham-marks group parameter.
When you leave any group, regardless of its spam-contents classification, all spam-marked articles are sent to a spam processor, which will study these as spam samples. If you explicitly kill a lot, you might sometimes end up with articles marked `K' which you never saw, and which might accidentally contain spam. It is best to make sure that real spam is marked with `$', and nothing else.
When you leave a spam group, all spam-marked articles are marked as expired after processing with the spam processor. This is not done for unclassified or ham groups. Also, any ham articles in a spam group will be moved to a location determined by either the ham-process-destination group parameter or a match in the gnus-ham-process-destinations variable, which is a list of regular expressions matched with group names (it's easiest to customize this variable with customize-variable gnus-ham-process-destinations). The ultimate location is a group name. If the ham-process-destination parameter is not set, ham articles are left in place. If the spam-mark-ham-unread-before-move-from-spam-group parameter is set, the ham articles are marked as unread before being moved.
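A rough sketch of such a setup follows. The group names are hypothetical, and the entry shape (a group regexp followed by a destination group name) is an assumption, which is one more reason to prefer customize-variable when in doubt.

```elisp
;; Sketch only: "nnml:spam" and "nnml:inbox" are hypothetical group names.

;; Assumed shape: (GROUP-REGEXP DESTINATION-GROUP) per entry.
;; Ham found in groups matching "^nnml:spam" is moved to "nnml:inbox".
(setq gnus-ham-process-destinations
      '(("^nnml:spam" "nnml:inbox")))

;; Mark rescued ham as unread before it is moved.
(setq spam-mark-ham-unread-before-move-from-spam-group t)
```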
When you leave a ham group, all ham-marked articles are sent to a ham processor, which will study these as non-spam samples.
By default the variable spam-process-ham-in-spam-groups is nil. Set it to t if you want ham found in spam groups to be processed. Normally this is not done; you are expected instead to send your ham to a ham group and process it there.
By default the variable spam-process-ham-in-nonham-groups is nil. Set it to t if you want ham found in non-ham (spam or unclassified) groups to be processed. Normally this is not done; you are expected instead to send your ham to a ham group and process it there.
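If you would rather have ham processed wherever it is found, the two variables can be turned on as sketched below. Setting only one of them is equally valid, depending on which groups you want covered.

```elisp
;; Process ham found in spam groups.
(setq spam-process-ham-in-spam-groups t)

;; Process ham found in any non-ham (spam or unclassified) group.
(setq spam-process-ham-in-nonham-groups t)
```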
When you leave a ham or unclassified group, all spam articles are moved to a location determined by either the spam-process-destination group parameter or a match in the gnus-spam-process-destinations variable, which is a list of regular expressions matched with group names (it's easiest to customize this variable with customize-variable gnus-spam-process-destinations). The ultimate location is a group name. If the spam-process-destination parameter is not set, the spam articles are only expired.
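As an illustration, the variable might be set as follows. Both the group names and the entry shape (a group regexp followed by a destination group name) are assumptions; check the customize buffer for the form your version expects.

```elisp
;; Sketch only: the "nnml:mail." prefix and "nnml:spam" are hypothetical.

;; Assumed shape: (GROUP-REGEXP DESTINATION-GROUP) per entry.
;; Spam found in groups matching "^nnml:mail\\." is moved to "nnml:spam".
(setq gnus-spam-process-destinations
      '(("^nnml:mail\\." "nnml:spam")))
```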
To use the `spam.el' facilities for incoming mail filtering, you must add the following to your fancy split list nnmail-split-fancy or nnimap-split-fancy:
     (: spam-split)
Note that the fancy split may be called nnmail-split-fancy or nnimap-split-fancy, depending on whether you use the nnmail or nnimap back ends to retrieve your mail.
The spam-split function will process incoming mail and send the mail considered to be spam into the group name given by the variable spam-split-group. By default that group name is `spam', but you can customize spam-split-group.
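Putting these pieces together for the nnmail back end, a minimal sketch could look like the following. The default mailbox name "mail.misc" is a placeholder, and setting nnmail-split-methods is only needed if fancy splitting is not already enabled in your configuration.

```elisp
;; Sketch only: adjust group names to your own setup.
(setq spam-split-group "spam")                   ; where spam-split files spam

(setq nnmail-split-methods 'nnmail-split-fancy)  ; enable fancy splitting

(setq nnmail-split-fancy
      '(| (: spam-split)   ; run the spam checks first
          "mail.misc"))    ; default mailbox (hypothetical name)
```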
You can also give spam-split a parameter, e.g. 'spam-use-regex-headers. Why is this useful?
Take these split rules (with spam-use-regex-headers and spam-use-blackholes set):
     (setq nnimap-split-fancy
           '(| (any "ding" "ding")
               (: spam-split)
               ;; default mailbox
               "mail"))
Now, the problem is that you want all ding messages to make it to the ding folder. But that will let obvious spam (for example, spam detected by SpamAssassin or by spam-use-regex-headers) through when it is sent to the ding list. On the other hand, some messages to the ding list are from a mail server in the blackhole list, so the invocation of spam-split can't be before the ding rule.
You can let SpamAssassin headers supersede ding rules, but all other spam-split rules (including a second invocation of the regex-headers check) will be after the ding rule:
     (setq nnimap-split-fancy
           '(| (: spam-split 'spam-use-regex-headers)
               (any "ding" "ding")
               (: spam-split)
               ;; default mailbox
               "mail"))
Basically, this lets you invoke specific spam-split checks depending on your particular needs. You don't have to throw all mail into all the spam tests. Another reason why this is nice is that messages to mailing lists you have rules for don't have to have resource-intensive blackhole checks performed on them. You could also specify different spam checks for your nnmail split vs. your nnimap split. Go crazy.
You still have to have specific checks such as spam-use-regex-headers set to t, even if you specifically invoke spam-split with the check. The reason is that when loading `spam.el', some conditional loading is done depending on what spam-use-xyz variables you have set.
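In practice this means the relevant variables are set before `spam.el' is loaded, along these lines (a sketch using the two checks from the example above):

```elisp
;; Enable the checks first, so that loading spam.el can set up
;; whatever these checks depend on.
(setq spam-use-regex-headers t
      spam-use-blackholes t)

(require 'spam)
```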
Note for IMAP users
The boolean variable nnimap-split-download-body needs to be set, if you want to split based on the whole message instead of just the headers. By default, the nnimap back end will only retrieve the message headers. If you use spam-check-bogofilter, spam-check-ifile, or spam-check-stat (the splitters that can benefit from the full message body), you should set this variable. It is not set by default because it will slow IMAP down, and that is not an appropriate decision to make on behalf of the user.
See section 6.5.1 Splitting in IMAP.
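For an IMAP user who relies on one of the body-based checks, enabling body download could be as simple as the following sketch; expect splitting to be slower, as noted above.

```elisp
;; Fetch full message bodies during nnimap splitting,
;; at the cost of slower IMAP access.
(setq nnimap-split-download-body t)
```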
TODO: Currently, spam.el only supports insertion of articles into a back end. There is no way to tell spam.el that an article is no longer spam or ham.
TODO: spam.el needs to provide a uniform way of training all the statistical databases. Some have that functionality built-in, others don't.
The following are the methods you can use to control the behavior of spam-split and their corresponding spam and ham processors: