By default, SpamAssassin is configured with the use_bayes and bayes_auto_learn options enabled.
use_bayes ( 0 | 1 ) (default: 1)
Whether to use the naive-Bayesian-style classifier built into SpamAssassin. This is a master on/off switch for all Bayes-related operations.
use_bayes_rules ( 0 | 1 ) (default: 1)
Whether to use rules using the naive-Bayesian-style classifier built into SpamAssassin. This allows you to disable the rules while leaving auto and manual learning enabled.
bayes_auto_learn ( 0 | 1 ) (default: 1)
Whether SpamAssassin should automatically feed high-scoring mails (or low-scoring mails, for non-spam) into its learning systems. The only learning system supported currently is a naive-Bayesian-style classifier.
Note that certain tests are ignored when determining whether a message should be trained upon:
- rules with tflags set to 'learn' (the Bayesian rules)
- rules with tflags set to 'userconf' (user white/black-listing rules, etc)
- rules with tflags set to 'noautolearn'
Also note that auto-training occurs using scores from either scoreset 0 or 1, depending on what scoreset is used during message check. It is likely that the message check and auto-train scores will be different.
So Bayes needs a database. By default the Bayes database path is "~/.spamassassin/bayes", and there should be several database files based on it, such as bayes_toks, bayes_seen, etc.
bayes_path /path/to/file (default: ~/.spamassassin/bayes)
Path for Bayesian probabilities databases. Several databases will be created, with this as the base, with _toks, _seen etc. appended to this filename; so the default setting results in files called ~/.spamassassin/bayes_seen, ~/.spamassassin/bayes_toks etc.
By default, each user has their own, in their ~/.spamassassin directory with mode 0700/0600, but for system-wide SpamAssassin use, you may want to reduce disk space usage by sharing this across all users. (However it should be noted that Bayesian filtering appears to be more effective with an individual database per user.)
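For a box-wide setup, the sharing described above might look roughly like this in local.cf. This is only a sketch: the directory /var/lib/spamassassin/bayes is an assumed location, not something from my box, and it must exist and be writable by whichever user runs spamd. Note that the last component of bayes_path is a filename prefix, not a directory.

```
# Assumed shared location; the files become bayes_toks, bayes_seen, etc.
bayes_path /var/lib/spamassassin/bayes/bayes
# Keep the x bits set, since the mode may also be used when creating directories.
bayes_file_mode 0770
```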
It appears that there is no Bayes database on my box. I ran locate bayes, and the only match is the ruleset config file in /usr/share/spamassassin.
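Since locate depends on an up-to-date index, a direct check of the likely directories can be more reliable. Below is a minimal shell sketch; list_bayes_files is a hypothetical helper name of mine, not part of SpamAssassin, and it simply looks for files starting with bayes_ in the directory given.

```shell
#!/bin/sh
# List Bayes database files (bayes_toks, bayes_seen, ...) in a directory.
# Prints one path per line; returns 1 if no such file exists.
list_bayes_files() {
    dir=$1
    found=1
    for f in "$dir"/bayes_*; do
        # If the glob matches nothing, the pattern stays literal: skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
        found=0
    done
    return "$found"
}
```

For example, list_bayes_files "$HOME/.spamassassin" or list_bayes_files /home/spamd, run as a user allowed to read those directories.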
When I used SpamAssassin before, I created the file /etc/sysconfig/spamassassin to start spamd with the following options:
SPAMDOPTIONS="-x -u spamd -H /home/spamd -d"
The Bayes databases were in /home/spamd.
The user preferences file should be in $HOME/.spamassassin/user_prefs, but in our case the preferences are stored in a MySQL database. Maybe the Bayes databases should also be stored in MySQL? Anyway…
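For reference, SpamAssassin can keep Bayes data in SQL. A sketch of the relevant local.cf settings follows. The database name, host, user name and password are placeholders I made up, and the Bayes tables would have to be created in that database first.

```
# Store Bayes tokens in MySQL instead of per-user files.
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
# DSN format: DBI:mysql:<database>:<host> (placeholder values)
bayes_sql_dsn      DBI:mysql:spamassassin:localhost
bayes_sql_username sa_user
bayes_sql_password changeme
```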
How did you set up SpamAssassin?
Where are the Bayes databases?
Is Bayes set up per user or box-wide?
How are these databases created?
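On the last question: as far as I know, the database files are created the first time something is learned, either by auto-learning or manually with sa-learn. Here is a small sketch of manual training. train_bayes and the SA_LEARN override are my own hypothetical names; the sa-learn options --spam, --ham and --dump magic are the standard ones.

```shell
#!/bin/sh
# Train Bayes from a spam directory and a ham directory, then show counters.
# Set SA_LEARN=echo for a dry run that only prints the arguments.
train_bayes() {
    spam_dir=$1
    ham_dir=$2
    sa_learn=${SA_LEARN:-sa-learn}
    "$sa_learn" --spam "$spam_dir" || return 1
    "$sa_learn" --ham "$ham_dir" || return 1
    # "--dump magic" prints the database counters (nspam, nham, ntokens, ...).
    "$sa_learn" --dump magic
}
```

It would be called as train_bayes /path/to/spam /path/to/ham, run as the user that owns the database (spamd in the setup above). With the default settings, the Bayes rules only start scoring once at least 200 spam and 200 ham messages have been learned.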