# db_log_exporter

**Periodically pulls logs** from MySQL / PostgreSQL database tables and writes them to text files in standard **syslog format** (an RFC 5424 variant).

---
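
## Features

| Feature | Description |
|------|------|
| **Multiple database engines** | MySQL and PostgreSQL run side by side without interfering with each other |
| **Resumable checkpoints** | Each source records the last ID it pulled, so a restart neither loses nor duplicates rows |
| **Standard format** | Emits RFC 5424 syslog lines that rsyslog, Filebeat, Promtail, and similar collectors can ingest directly |
| **Multiple sources** | One process manages several tables; each table can have its own database, polling interval, and output file |
| **Thread safety** | Each source runs in its own thread and pulls concurrently, independent of the others |
| **Flexible column mapping** | Map column names in the config when your table uses different names |
| **Dry-run** | `--dry-run` tests connections without writing files |
| **One-shot mode** | `--once` runs a single polling round and exits, which suits cron |
| **systemd integration** | Ships a `.service` file with start-on-boot support |
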
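
The resumable-checkpoint feature in the table above can be pictured with a small sketch. This README does not describe the on-disk checkpoint format, so the JSON layout, the `last_id` key, the `.json` extension, and the helper names `load_checkpoint` / `save_checkpoint` are all assumptions made for illustration. The point shown is the write-to-temp-then-rename pattern, which keeps a crash from leaving a half-written checkpoint behind.

```python
import json
import os
import tempfile

def load_checkpoint(path):
    """Return the last exported ID, or 0 when no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return int(json.load(f)["last_id"])
    except FileNotFoundError:
        return 0

def save_checkpoint(path, last_id):
    """Write the checkpoint atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"last_id": last_id}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic on POSIX when both paths share a filesystem

# One checkpoint file per source (hypothetical file name derived from the source name)
ckpt_dir = tempfile.mkdtemp()
ckpt = os.path.join(ckpt_dir, "access_log.json")
print(load_checkpoint(ckpt))  # no file yet, so this prints 0
save_checkpoint(ckpt, 1042)
print(load_checkpoint(ckpt))  # prints 1042
```
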
---

## Output format

One log entry per line, in the following format (an RFC 5424 variant):

```
<priority>1 2026-05-12T15:30:00.123456+08:00 hostname app_name[12345]: [trace=abc123] [span=def456] log message
```

| Field | Description |
|------|------|
| `<priority>` | syslog priority, e.g. `<6>`=INFO, `<4>`=WARN, `<3>`=ERROR |
| `version` | RFC 5424 version number, always `1` |
| `timestamp` | ISO 8601 time (with microseconds and time zone) |
| `hostname` | The hostname from the config file (defaults to the system hostname) |
| `app_name[pid]` | The configured app_name plus this process's PID |
| `[trace=...] [span=...]` | Structured data (trace_id / span_id, etc.) |
| `message` | Log body |

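
As a rough illustration of the layout above, the following sketch renders one row into a line of that shape. It is not the exporter's actual code: the function name `format_line`, the fallback to INFO for unknown levels, and the escaping of embedded line breaks are assumptions. The level-to-priority mapping uses only the three values listed in the table.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping, limited to the three priorities named in the field table.
PRIORITY = {"INFO": 6, "WARN": 4, "ERROR": 3}

def format_line(ts, level, hostname, app_name, pid, message, trace_id=None, span_id=None):
    """Render one row as a single syslog-style line in the layout shown above."""
    pri = PRIORITY.get(level.upper(), 6)  # unknown levels fall back to INFO (assumption)
    parts = [f"<{pri}>1 {ts.isoformat(timespec='microseconds')} {hostname} {app_name}[{pid}]:"]
    if trace_id:
        parts.append(f"[trace={trace_id}]")
    if span_id:
        parts.append(f"[span={span_id}]")
    # Keep one entry per line by escaping embedded line breaks (assumption)
    parts.append(str(message).replace("\r", "\\r").replace("\n", "\\n"))
    return " ".join(parts)

ts = datetime(2026, 5, 12, 15, 30, 0, 123456, tzinfo=timezone(timedelta(hours=8)))
line = format_line(ts, "info", "hostname", "app_name", 12345, "log message", "abc123", "def456")
print(line)
```

With these inputs the printed line matches the example in the format block above, including the `+08:00` offset and the microsecond field.
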
---

## Quick start

### 1. Install dependencies

```bash
# CentOS / RHEL / Fedora
sudo yum install -y python3 python3-pip
sudo pip3 install PyMySQL psycopg2-binary PyYAML

# Debian / Ubuntu
sudo apt-get install -y python3 python3-pip
sudo pip3 install PyMySQL psycopg2-binary PyYAML
```

### 2. Download the program

```bash
sudo mkdir -p /opt/db_log_exporter
sudo curl -L https://your-repo/db_log_exporter.py \
     -o /opt/db_log_exporter/db_log_exporter.py
sudo chmod +x /opt/db_log_exporter/db_log_exporter.py
```

### 3. Write the configuration

```bash
sudo mkdir -p /etc/db_log_exporter
sudo nano /etc/db_log_exporter/config.yaml
```

Use `config.yaml.example` (in the repository root) as a reference. The main fields to fill in:

```yaml
databases:
  mysql_prod:
    type: mysql
    host: 192.168.1.100
    port: 3306
    user: log_reader
    password: "your_password"
    database: app_logs

sources:
  - name: access_log
    database: mysql_prod
    table: access_log
    columns:
      id: id
      timestamp: created_at
      level: log_level
      message: message
```

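
Before starting the exporter, it can help to sanity-check that every source points at a defined database alias and maps the core columns. The sketch below works on the plain dictionary that a YAML loader would produce for the file above. The helper name `validate_config` is hypothetical, and treating `id`, `timestamp`, `level`, and `message` as required is an assumption; the program itself may validate differently.

```python
def validate_config(cfg):
    """Return a list of readable problems; an empty list means the config looks consistent."""
    problems = []
    databases = cfg.get("databases") or {}
    seen = set()
    for src in cfg.get("sources") or []:
        name = src.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"source {name!r}: duplicate name")
        seen.add(name)
        if src.get("database") not in databases:
            problems.append(f"source {name!r}: unknown database alias {src.get('database')!r}")
        columns = src.get("columns") or {}
        for key in ("id", "timestamp", "level", "message"):  # assumed to be required
            if key not in columns:
                problems.append(f"source {name!r}: missing column mapping {key!r}")
    return problems

# Same shape as the example config above, written as a Python dict
cfg = {
    "databases": {"mysql_prod": {"type": "mysql", "host": "192.168.1.100", "port": 3306}},
    "sources": [{
        "name": "access_log",
        "database": "mysql_prod",
        "table": "access_log",
        "columns": {"id": "id", "timestamp": "created_at",
                    "level": "log_level", "message": "message"},
    }],
}
print(validate_config(cfg))  # prints [] for the example config
```
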
### 4. Test the connection (dry-run)

```bash
python3 /opt/db_log_exporter/db_log_exporter.py \
    -c /etc/db_log_exporter/config.yaml \
    --dry-run
```

### 5. Ways to run

**Option A: systemd daemon (recommended)**

```bash
# Copy the service file
sudo cp db_log_exporter.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now db_log_exporter
sudo journalctl -u db_log_exporter -f   # follow the logs in real time
```

**Option B: cron job (one-shot mode)**

```bash
# crontab -e
# A crontab entry must stay on one line; cron does not support backslash continuation.
* * * * * python3 /opt/db_log_exporter/db_log_exporter.py -c /etc/db_log_exporter/config.yaml --once
```

**Option C: run directly (in the foreground)**

```bash
python3 /opt/db_log_exporter/db_log_exporter.py \
    -c /etc/db_log_exporter/config.yaml
```

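
The difference between the long-running modes (options A and C) and `--once` (option B) can be sketched as a per-source polling loop. This is an illustrative model only, assuming one thread per source as the feature list states; `run_source`, `run_all`, and the `poll_once` callback are hypothetical names, not the program's real functions.

```python
import threading

def run_source(name, poll_once, interval, stop, once=False):
    """Poll one source until `stop` is set; with once=True, run a single round."""
    while not stop.is_set():
        try:
            poll_once(name)
        except Exception as exc:  # keep the thread alive on transient failures
            print(f"[{name}] poll failed: {exc}")
        if once:
            break
        stop.wait(interval)  # sleeps, but wakes immediately when stop is set

def run_all(sources, poll_once, once=False):
    """Start one thread per source and wait for all of them to finish."""
    stop = threading.Event()
    threads = [
        threading.Thread(target=run_source, args=(name, poll_once, interval, stop, once), name=name)
        for name, interval in sources.items()
    ]
    for t in threads:
        t.start()
    try:
        for t in threads:
            t.join()
    except KeyboardInterrupt:  # Ctrl+C in the foreground: ask every thread to stop
        stop.set()
        for t in threads:
            t.join()

# One-shot round over two sources with different intervals (seconds)
polled = []
run_all({"access_log": 30, "error_log": 15}, polled.append, once=True)
print(sorted(polled))
```

Using `threading.Event.wait` instead of `time.sleep` lets a shutdown request interrupt the pause between rounds rather than waiting out the full interval.
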
---

## Configuration reference

See `config.yaml.example` for the full configuration. The core fields:

```yaml
global:
  output_dir: /var/log/db_exporter    # log output directory
  checkpoint_dir: /var/lib/...        # checkpoint file directory
  hostname: myserver                  # syslog hostname
  interval: 30                        # default polling interval (seconds)
  batch_size: 1000                    # default maximum rows per pull

databases:
  <alias>:
    type: mysql | postgresql
    host: ...
    port: ...
    user: ...
    password: ...
    database: ...                     # MySQL: database name
    dbname: ...                       # PostgreSQL: dbname
    charset: utf8mb4                  # MySQL only

sources:
  - name: <source name>               # unique identifier, used for the checkpoint file name
    database: <alias defined above>
    table: <table name>
    log_file: <output file name>      # relative to output_dir
    app_name: <syslog app name>
    interval: 15                      # overrides the global interval
    batch_size: 500                   # overrides the global batch size
    columns:                          # column name mapping
      id: id
      timestamp: created_at
      level: log_level
      message: message
      trace_id: trace_id              # optional
      span_id: span_id                # optional
```

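
To make the override and column-mapping rules above concrete, here is a minimal sketch of how a source entry could be turned into effective settings and an incremental query. It assumes the exporter selects rows with an ID greater than the last checkpoint, ordered by ID, which is one common way to implement "pull from the last ID"; the README does not show the real query. `effective_settings` and `build_query` are hypothetical helpers, and the fallback values 30 and 1000 simply mirror the example `global` block.

```python
def effective_settings(global_cfg, source):
    """Per-source interval and batch_size win over the global defaults."""
    return {
        "interval": source.get("interval", global_cfg.get("interval", 30)),
        "batch_size": source.get("batch_size", global_cfg.get("batch_size", 1000)),
    }

def build_query(source):
    """Build a SELECT that fetches rows after the last exported ID, oldest first."""
    cols = source["columns"]
    order = ["id", "timestamp", "level", "message", "trace_id", "span_id"]
    select = ", ".join(cols[key] for key in order if key in cols)
    # Table and column names come from the trusted config file and are not quoted here.
    # %s placeholders are accepted by both PyMySQL and psycopg2.
    return (f"SELECT {select} FROM {source['table']} "
            f"WHERE {cols['id']} > %s ORDER BY {cols['id']} LIMIT %s")

source = {
    "name": "access_log",
    "table": "access_log",
    "interval": 15,
    "columns": {"id": "id", "timestamp": "created_at",
                "level": "log_level", "message": "message"},
}
print(effective_settings({"interval": 30, "batch_size": 1000}, source))
print(build_query(source))
```

The query parameters would be the checkpointed ID and the batch size, in that order.
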
### Database user privileges (least privilege)

**MySQL:**

```sql
CREATE USER 'log_reader'@'%' IDENTIFIED BY 'password';
GRANT SELECT ON app_logs.access_log TO 'log_reader'@'%';
GRANT SELECT ON app_logs.error_log TO 'log_reader'@'%';
FLUSH PRIVILEGES;
```

**PostgreSQL:**

```sql
CREATE USER log_reader WITH PASSWORD 'password';
GRANT CONNECT ON DATABASE app_logs TO log_reader;
GRANT SELECT ON application_logs TO log_reader;
GRANT SELECT ON audit_log TO log_reader;
```

---

## Integrating with log collectors

### rsyslog (server-side ingestion)

```conf
# /etc/rsyslog.d/60-db-exporter.conf
module(load="imfile" PollingInterval="10")

input(type="imfile"
      File="/var/log/db_exporter/mysql_access.log"
      Tag="db:access:"
      Facility="local0")
```

### Filebeat (shipping to Elasticsearch)

```yaml
# filebeat.yml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/db_exporter/*.log
    fields:
      log_type: db_exporter
    fields_under_root: true
```

### Promtail (shipping to Loki)

```yaml
# promtail.yml
scrape_configs:
  - job_name: db_exporter
    static_configs:
      - targets: [localhost]
        labels:
          job: db_exporter
          __path__: /var/log/db_exporter/*.log
```

---
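
## Troubleshooting

| Problem | What to check |
|------|---------|
| Service fails to start | Run `journalctl -u db_log_exporter -e` to read the error log |
| Connection refused | Confirm the database accepts connections from this IP; check firewalls and security groups |
| Permission denied | Confirm the user running the service can write to `output_dir` and `checkpoint_dir` |
| Duplicate log lines | Delete the corresponding checkpoint file and restart; the program then pulls from the beginning |
| Garbled Chinese text | Confirm the database character set is `utf8mb4` (MySQL) or `UTF8` (PG) |
| Connection timeout | Add `connect_timeout: 10` to the entry under `databases` |
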
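
For the "connection refused" and "connection timeout" rows above, a plain TCP probe can separate network problems from database-level ones before you look at credentials. The sketch below uses only the standard library and is not part of the exporter; `can_connect` is a hypothetical helper name.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connection and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "port is reachable"
    except ConnectionRefusedError:
        return False, "connection refused: nothing is listening, or a firewall rejects the port"
    except socket.timeout:
        return False, "timed out: packets are probably dropped by a firewall or security group"
    except OSError as exc:
        return False, f"network error: {exc}"
```

Calling `can_connect("192.168.1.100", 3306)` with the host and port from the example config returns a pair such as `(True, "port is reachable")`. A successful probe only shows that the port is open; authentication and privileges still need the `--dry-run` check.
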
---

## Directory layout

```
db_log_exporter/
├── db_log_exporter.py          # main program (Python daemon)
├── config.yaml.example         # example configuration
├── requirements.txt            # Python dependencies
├── db_log_exporter.service     # systemd service file
├── setup.sh                    # automated install script
└── README.md                   # this file
```

---

## Command-line arguments

```
-c, --config    Path to the YAML config file (required)
--once          Run one polling round, then exit (suits cron)
--dry-run       Only test database connections; write no files
--log-level     Log level: DEBUG|INFO|WARNING|ERROR (default INFO)
--log-file      File for the exporter's own log output (default stdout)
```

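
The option list above maps naturally onto `argparse`. The following is a sketch of a parser with the same flags and defaults, useful if you wrap or test the command line; how the real program defines its parser is not shown in this README, and `build_parser` is a hypothetical name.

```python
import argparse

def build_parser():
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="db_log_exporter.py")
    parser.add_argument("-c", "--config", required=True, help="path to the YAML config file")
    parser.add_argument("--once", action="store_true", help="run one polling round, then exit")
    parser.add_argument("--dry-run", action="store_true", help="only test database connections")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    parser.add_argument("--log-file", default=None,
                        help="file for the exporter's own log output (default: stdout)")
    return parser

args = build_parser().parse_args(["-c", "/etc/db_log_exporter/config.yaml", "--once"])
print(args.config, args.once, args.dry_run, args.log_level)
```
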
---

## Security recommendations

1. **Do not store database passwords in plain text in the config.** Use environment variables or systemd secrets:
   ```bash
   # /etc/db_log_exporter/env
   DB_PASSWORD=your_password
   ```
   Then reference it in `config.yaml` as `${DB_PASSWORD}`.

2. **Run the service as a least-privilege user**, not root:
   ```bash
   useradd -r -s /sbin/nologin db_exporter
   chown -R db_exporter:db_exporter /var/log/db_exporter /var/lib/db_exporter
   # Set User=db_exporter in the service file
   ```

3. **Rotate log files regularly** so the disk does not fill up:
   ```bash
   # /etc/logrotate.d/db_log_exporter
   /var/log/db_exporter/*.log {
       daily
       rotate 7
       compress
       delaycompress
       missingok
       notifempty
       # Owner matches the service user from item 2 so the exporter can keep writing
       create 0644 db_exporter db_exporter
       sharedscripts
       postrotate
           systemctl reload db_log_exporter > /dev/null 2>&1 || true
       endscript
   }
   ```

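
The `${DB_PASSWORD}` reference from item 1 implies some substitution step when the config is loaded. The README does not show how the program performs it, so the sketch below is one possible approach: walk the loaded config and replace `${NAME}` with the value of the environment variable, failing loudly when it is unset. `expand_env` is a hypothetical helper. Under systemd, the variables in `/etc/db_log_exporter/env` would typically reach the process through an `EnvironmentFile=` line in the unit, assuming the shipped service file is set up that way.

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_env(value, env=None):
    """Recursively replace ${NAME} in strings; raise KeyError when NAME is unset."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def repl(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _VAR.sub(repl, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged

cfg = {"databases": {"mysql_prod": {"user": "log_reader",
                                    "password": "${DB_PASSWORD}",
                                    "port": 3306}}}
print(expand_env(cfg, {"DB_PASSWORD": "s3cret"}))
```

Raising on a missing variable, rather than substituting an empty string, avoids silently connecting with a blank password.
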