Adding new document types to be recognized by nndoc isn't difficult. You just have to whip up a definition of what the document looks like, write a predicate function to recognize that document type, and then hook into nndoc.
First, here's an example document type definition:
(mmdf
 (article-begin . "^\^A\^A\^A\^A\n")
 (body-end . "^\^A\^A\^A\^A\n"))
The definition is simply a unique name followed by a series of regexp pseudo-variable settings. Below are the possible variables--don't be daunted by the number of variables; most document types can be defined with very few settings:
first-article
nndoc will skip past all text until it finds something that matches this regexp. All text before this will be totally ignored.
article-begin
head-begin-function
head-begin
head-end
body-begin-function
body-begin
body-end-function
body-end
file-end
So, using these variables, nndoc is able to dissect a document file into a series of articles, each with a head and a body. However, a few more variables are needed since not all document types are all that news-like--variables needed to transform the head or the body into something that's palatable for Gnus:
prepare-body-function
article-transform-function
generate-head-function
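As a rough illustration of what a body-preparing function might look like, here is a minimal sketch. It is not nndoc's own implementation. The name my-nndoc-unquote-dashes is hypothetical, and the sketch assumes such a function is called with no arguments, in the buffer holding the article, with point at the start of the body.

```elisp
;; Hypothetical prepare-body-function (not part of nndoc).
;; Assumption: called with no arguments, in the buffer holding the
;; article, with point at the start of the body.
(defun my-nndoc-unquote-dashes ()
  "Turn a leading \"- -\" back into \"-\" on every line after point."
  (save-excursion
    (while (re-search-forward "^- -" nil t)
      ;; Replace the quoted prefix literally with a single dash.
      (replace-match "-" t t))))
```

A definition would refer to it with a setting such as (prepare-body-function . my-nndoc-unquote-dashes).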
Let's look at the most complicated example I can come up with--standard digests:
(standard-digest
 (first-article . ,(concat "^" (make-string 70 ?-) "\n\n+"))
 (article-begin . ,(concat "\n\n" (make-string 30 ?-) "\n\n+"))
 (prepare-body-function . nndoc-unquote-dashes)
 (body-end-function . nndoc-digest-body-end)
 (head-end . "^ ?$")
 (body-begin . "^ ?\n")
 (file-end . "^End of .*digest.*[0-9].*\n\\*\\*\\|^End of.*Digest *$")
 (subtype digest guess))
We see that all text before a 70-width line of dashes is ignored; all text after a line that starts with `End of' is also ignored; each article begins with a 30-width line of dashes; the line separating the head from the body may contain a single space; and the body is run through nndoc-unquote-dashes before being delivered.
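To see how the two dash regexps behave, the following sketch builds them the same way the definition does and matches them against a small made-up digest. The function my-digest-delimiter-positions and the variable my-sample-digest are hypothetical helpers for illustration only. Note that the commas in the definition mean it has to sit inside a backquoted form so that the concat calls are evaluated.

```elisp
;; Hypothetical helper (not part of nndoc): find where the
;; standard-digest delimiters first match in TEXT.
(defun my-digest-delimiter-positions (text)
  "Return (FIRST-ARTICLE-POS ARTICLE-BEGIN-POS) for the digest TEXT.
Each element is the index where the regexp first matches, or nil."
  (let ((first-article (concat "^" (make-string 70 ?-) "\n\n+"))
        (article-begin (concat "\n\n" (make-string 30 ?-) "\n\n+")))
    (list (string-match first-article text)
          (string-match article-begin text))))

;; A made-up digest: preamble, 70 dashes, one article, 30 dashes,
;; and a second article.
(defvar my-sample-digest
  (concat "Today's topics\n"
          (make-string 70 ?-) "\n\n"
          "From: a@example.org\n\nFirst body\n\n"
          (make-string 30 ?-) "\n\n"
          "From: b@example.org\n\nSecond body\n"))

(my-digest-delimiter-positions my-sample-digest)
```

The first position marks the 70-dash line that ends the ignored preamble; the second marks the blank line in front of the 30-dash separator between the two articles.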
To hook your own document definition into nndoc, use the nndoc-add-type function. It takes two parameters--the first is the definition itself and the second (optional) parameter says where in the document type definition alist to put this definition.
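A call might look like the sketch below. The percent-sep type and its regexps are invented for illustration: they describe a made-up format whose articles are separated by lines containing only "%%". The use of the symbol first as the position argument, and of nndoc-type-alist as the name of the alist, reflect the nndoc.el I am familiar with and should be treated as assumptions; check the docstring of nndoc-add-type in your version.

```elisp
(require 'nndoc)

;; Hypothetical document type: articles delimited by "%%" lines.
(defvar my-nndoc-percent-sep-definition
  '(percent-sep
    (article-begin . "^%%\n")
    (body-end . "^%%\n"))
  "Made-up nndoc document type definition, for illustration only.")

;; Assumption: the symbol `first' asks for the definition to be put
;; at the front of the alist so that it is checked before the others.
(nndoc-add-type my-nndoc-percent-sep-definition 'first)
```

Omitting the second argument leaves the placement up to nndoc-add-type.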
The alist is traversed sequentially, and nndoc-TYPE-type-p is called for a given type TYPE. So nndoc-mmdf-type-p is called to see whether a document is of mmdf type, and so on. These type predicates should return nil if the document is not of the correct type; t if it is of the correct type; and a number if the document might be of the correct type. A high number means high probability; a low number means low probability, with `0' being the lowest valid number.
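A predicate following this naming scheme could be sketched as below, for a made-up type called percent-sep whose articles are separated by lines containing only "%%". The type, the regexps, and the score of 10 are all hypothetical; the sketch also assumes the predicate is called with no arguments in the buffer holding the document.

```elisp
;; Hypothetical predicate for a made-up `percent-sep' document type.
;; Assumption: called with no arguments in the document's buffer.
(defun nndoc-percent-sep-type-p ()
  "Return t, a number, or nil depending on how well the buffer fits."
  (save-excursion
    (goto-char (point-min))
    (cond
     ;; The document opens with a separator line: certainly our type.
     ((looking-at "%%\n") t)
     ;; A separator line shows up later: only a weak guess.
     ((re-search-forward "^%%\n" nil t) 10)
     ;; No separator anywhere: not our type.
     (t nil))))
```

Returning a number rather than t leaves room for another predicate to claim the document with a higher score.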