initial: db_log_exporter v1.0
# db_log_exporter
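
**Periodically pulls log rows** from MySQL / PostgreSQL tables and writes them to text files in standard **syslog format** (an RFC 5424 variant).

---

## Features

| Feature | Description |
|---------|-------------|
| **Multiple database engines** | MySQL and PostgreSQL sources run side by side without interfering with each other |
| **Checkpoint resume** | Each source records the last ID it pulled independently, so a restart neither loses nor duplicates rows |
| **Standard format** | Emits RFC 5424 syslog lines that rsyslog, Filebeat, Promtail, and similar collectors can ingest directly |
| **Multiple sources** | One process manages many tables; each table can have its own database, polling interval, and output file |
| **Thread safety** | Each source runs in its own thread and pulls concurrently, independent of the others |
| **Flexible column mapping** | Column names that differ from the expected ones can be mapped in the configuration |
| **Dry-run** | `--dry-run` tests the connections without writing any files |
| **One-shot mode** | `--once` runs a single polling round and exits, which suits cron |
| **systemd integration** | Ships a `.service` file and supports start on boot |
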
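The checkpoint-resume behaviour in the table can be pictured with a small sketch. This is not the exporter's actual code: the JSON layout, the `<source>.json` file name, and the helper names `load_checkpoint` / `save_checkpoint` are all assumptions made for illustration. The idea is to store the last exported ID per source and replace the file atomically so that a crash never leaves a half-written checkpoint.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(checkpoint_dir: str, source_name: str) -> int:
    """Return the last exported ID for a source, or 0 if nothing is recorded yet."""
    path = Path(checkpoint_dir) / f"{source_name}.json"  # hypothetical file layout
    try:
        return int(json.loads(path.read_text(encoding="utf-8"))["last_id"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        # Missing or unreadable checkpoint: start from the beginning.
        return 0


def save_checkpoint(checkpoint_dir: str, source_name: str, last_id: int) -> None:
    """Persist the last exported ID by writing a temp file and renaming it."""
    directory = Path(checkpoint_dir)
    directory.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"last_id": last_id}, fh)
        fh.flush()
        os.fsync(fh.fileno())
    # os.replace is atomic when source and target are on the same filesystem.
    os.replace(tmp_path, directory / f"{source_name}.json")
```

Saving the checkpoint only after a batch has been written to the output file is what would keep a restart from skipping rows.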

---

## Output format

One log entry per line, in the following format (an RFC 5424 variant):

```
<priority>1 2026-05-12T15:30:00.123456+08:00 hostname app_name[12345]: [trace=abc123] [span=def456] log message
```

| Field | Description |
|-------|-------------|
| `<priority>` | syslog priority: `<6>`=INFO, `<4>`=WARN, `<3>`=ERROR, and so on |
| `version` | RFC 5424 version number, always `1` |
| `timestamp` | ISO 8601 time, including microseconds and time zone offset |
| `hostname` | The `hostname` from the configuration file (defaults to the system hostname) |
| `app_name[pid]` | The configured `app_name` plus the PID of this process |
| `[trace=...] [span=...]` | Structured data (trace_id, span_id, and similar) |
| `message` | The log body |

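To make the field table concrete, here is a minimal sketch that assembles one line in the format shown above. The function name `format_syslog_line`, the `DEBUG` entry in the mapping, and the replacement of embedded newlines are assumptions for illustration; only the INFO/WARN/ERROR priorities and the field order come from this README.

```python
import os
import socket
from datetime import datetime, timedelta, timezone

# INFO/WARN/ERROR values are taken from the field table; DEBUG=7 is an
# assumption based on the standard syslog severity numbers.
PRIORITY = {"DEBUG": 7, "INFO": 6, "WARN": 4, "WARNING": 4, "ERROR": 3}


def format_syslog_line(ts, level, message, app_name,
                       hostname=None, trace_id=None, span_id=None, pid=None):
    """Render one database row as a single syslog-style line."""
    pri = PRIORITY.get(str(level).upper(), 6)  # unknown levels fall back to INFO
    host = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    parts = []
    if trace_id:
        parts.append(f"[trace={trace_id}]")
    if span_id:
        parts.append(f"[span={span_id}]")
    # Assumption: embedded newlines are flattened so one row stays one line.
    parts.append(str(message).replace("\n", " "))
    stamp = ts.isoformat(timespec="microseconds")  # ts should be timezone-aware
    return f"<{pri}>1 {stamp} {host} {app_name}[{pid}]: {' '.join(parts)}"


# Example input mirroring the sample line above.
sample_ts = datetime(2026, 5, 12, 15, 30, 0, 123456,
                     tzinfo=timezone(timedelta(hours=8)))
print(format_syslog_line(sample_ts, "INFO", "log message", "app_name",
                         hostname="hostname", trace_id="abc123",
                         span_id="def456", pid=12345))
```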

---

## Quick start

### 1. Install dependencies

```bash
# CentOS / RHEL / Fedora
sudo yum install -y python3 python3-pip
sudo pip3 install PyMySQL psycopg2-binary PyYAML

# Debian / Ubuntu
sudo apt-get install -y python3 python3-pip
sudo pip3 install PyMySQL psycopg2-binary PyYAML
```

### 2. Download the program

```bash
sudo mkdir -p /opt/db_log_exporter
sudo curl -L https://your-repo/db_log_exporter.py \
    -o /opt/db_log_exporter/db_log_exporter.py
sudo chmod +x /opt/db_log_exporter/db_log_exporter.py
```

### 3. Write the configuration

```bash
sudo mkdir -p /etc/db_log_exporter
sudo nano /etc/db_log_exporter/config.yaml
```

Use `config.yaml.example` (in the repository root) as a reference. The main settings to fill in are:

```yaml
databases:
  mysql_prod:
    type: mysql
    host: 192.168.1.100
    port: 3306
    user: log_reader
    password: "your_password"
    database: app_logs

sources:
  - name: access_log
    database: mysql_prod
    table: access_log
    columns:
      id: id
      timestamp: created_at
      level: log_level
      message: message
```

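Two relationships in this configuration are easy to get wrong: every source's `database` must name an alias defined under `databases`, and source names must be unique because they identify the checkpoint file. The sketch below checks both on an already-parsed config dictionary. `validate_config` is a hypothetical helper, not part of the exporter, and the exporter may perform different or additional checks.

```python
def validate_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    databases = cfg.get("databases") or {}
    seen_names = set()
    for index, source in enumerate(cfg.get("sources") or []):
        name = source.get("name") or f"sources[{index}]"
        if name in seen_names:
            problems.append(f"{name}: duplicate source name")
        seen_names.add(name)
        alias = source.get("database")
        if alias not in databases:
            problems.append(f"{name}: unknown database alias {alias!r}")
        if not source.get("table"):
            problems.append(f"{name}: missing table")
    return problems


# Example: the second source points at an alias that is not defined.
example = {
    "databases": {"mysql_prod": {"type": "mysql"}},
    "sources": [
        {"name": "access_log", "database": "mysql_prod", "table": "access_log"},
        {"name": "error_log", "database": "mysql_test", "table": "error_log"},
    ],
}
for problem in validate_config(example):
    print(problem)
```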

### 4. Test the connections (dry-run)

```bash
python3 /opt/db_log_exporter/db_log_exporter.py \
    -c /etc/db_log_exporter/config.yaml \
    --dry-run
```

### 5. Ways to run

**Option A: systemd service (recommended)**

```bash
# Copy the service file
sudo cp db_log_exporter.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now db_log_exporter
sudo journalctl -u db_log_exporter -f   # follow the service log
```

**Option B: cron job (one-shot mode)**

```bash
# crontab -e
# A crontab entry must stay on one line; cron does not support backslash continuations.
* * * * * python3 /opt/db_log_exporter/db_log_exporter.py -c /etc/db_log_exporter/config.yaml --once
```

**Option C: run directly (in the foreground)**

```bash
python3 /opt/db_log_exporter/db_log_exporter.py \
    -c /etc/db_log_exporter/config.yaml
```

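The difference between the long-running modes (A and C) and `--once` (B) comes down to how each source's polling loop ends. The following is a sketch under stated assumptions, not the exporter's real structure: `run_source`, `run_all`, and the `poll_once` callback are hypothetical names, and the fallback interval of 30 seconds simply mirrors the global default shown in the configuration reference.

```python
import threading


def run_source(name, interval, poll_once, stop_event, once=False):
    """Poll one source until stop_event is set; with once=True, poll a single round."""
    while not stop_event.is_set():
        try:
            poll_once(name)
        except Exception as exc:
            # One failing source should not take the other threads down.
            print(f"[{name}] poll failed: {exc}")
        if once:
            break
        # Event.wait acts as a sleep that ends early when a shutdown is requested.
        stop_event.wait(interval)


def run_all(sources, poll_once, stop_event, once=False):
    """Start one thread per source and return the started threads."""
    threads = [
        threading.Thread(
            target=run_source,
            args=(src["name"], src.get("interval", 30), poll_once, stop_event, once),
            name=src["name"],
            daemon=True,
        )
        for src in sources
    ]
    for thread in threads:
        thread.start()
    return threads
```

In daemon mode the caller would set `stop_event` from a signal handler and then join the threads; in one-shot mode joining is enough, because each thread exits after its first round.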

---

## Configuration reference

See `config.yaml.example` for the full set of options. The core fields are:

```yaml
global:
  output_dir: /var/log/db_exporter   # directory for exported log files
  checkpoint_dir: /var/lib/...       # directory for checkpoint files
  hostname: myserver                 # syslog hostname
  interval: 30                       # default polling interval (seconds)
  batch_size: 1000                   # default maximum rows read per poll

databases:
  <alias>:
    type: mysql | postgresql
    host: ...
    port: ...
    user: ...
    password: ...
    database: ...                    # MySQL: database name
    dbname: ...                      # PostgreSQL: dbname
    charset: utf8mb4                 # MySQL only

sources:
  - name: <source name>              # unique identifier, used for the checkpoint file name
    database: <alias defined above>
    table: <table name>
    log_file: <output file name>     # relative to output_dir
    app_name: <syslog program name>
    interval: 15                     # overrides the global interval
    batch_size: 500                  # overrides the global batch size
    columns:                         # column name mapping
      id: id
      timestamp: created_at
      level: log_level
      message: message
      trace_id: trace_id             # optional
      span_id: span_id               # optional
```

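The `columns` mapping, `batch_size`, and the recorded last ID combine naturally into one incremental query per poll. The sketch below shows one way such a query could be built; `build_query` is a hypothetical helper and the exporter's real SQL may differ. It relies on two facts: PyMySQL and psycopg2 both use `%s` as the parameter placeholder, and identifiers are quoted with backticks in MySQL and double quotes in PostgreSQL.

```python
LOGICAL_COLUMNS = ["id", "timestamp", "level", "message", "trace_id", "span_id"]


def build_query(db_type: str, table: str, columns: dict) -> str:
    """Build 'rows after the last ID, oldest first, at most batch_size' for one source."""
    quote = "`" if db_type == "mysql" else '"'

    def q(identifier: str) -> str:
        # Doubling the quote character escapes it in both dialects.
        return quote + identifier.replace(quote, quote * 2) + quote

    # Alias every mapped column to its logical name so later code is schema-agnostic.
    select_list = ", ".join(
        f"{q(columns[key])} AS {q(key)}" for key in LOGICAL_COLUMNS if key in columns
    )
    id_column = q(columns["id"])
    return (
        f"SELECT {select_list} FROM {q(table)} "
        f"WHERE {id_column} > %s ORDER BY {id_column} ASC LIMIT %s"
    )


mapping = {"id": "id", "timestamp": "created_at", "level": "log_level", "message": "message"}
print(build_query("mysql", "access_log", mapping))
```

The two placeholders would be filled with `(last_id, batch_size)` when the query is passed to `cursor.execute`. Note that this sketch quotes the table as a single identifier, so a schema-qualified name such as `public.audit_log` would need to be split first.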

### Database user privileges (least privilege)

**MySQL:**

```sql
CREATE USER 'log_reader'@'%' IDENTIFIED BY 'password';
GRANT SELECT ON app_logs.access_log TO 'log_reader'@'%';
GRANT SELECT ON app_logs.error_log TO 'log_reader'@'%';
FLUSH PRIVILEGES;
```

**PostgreSQL:**

```sql
CREATE USER log_reader WITH PASSWORD 'password';
GRANT CONNECT ON DATABASE app_logs TO log_reader;
GRANT SELECT ON application_logs TO log_reader;
GRANT SELECT ON audit_log TO log_reader;
```

---

## Integrating with log collectors

### rsyslog (receiving on the server)

```conf
# /etc/rsyslog.d/60-db-exporter.conf
module(load="imfile" PollingInterval="10")

input(type="imfile"
      File="/var/log/db_exporter/mysql_access.log"
      Tag="db:access:"
      Facility="local0")
```

### Filebeat (shipping to Elasticsearch)

```yaml
# filebeat.yml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/db_exporter/*.log
    fields:
      log_type: db_exporter
    fields_under_root: true
```

### Promtail (shipping to Loki)

```yaml
# promtail.yml
scrape_configs:
  - job_name: db_exporter
    static_configs:
      - targets: [localhost]
        labels:
          job: db_exporter
          __path__: /var/log/db_exporter/*.log
```

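Before wiring up a collector pipeline, it can help to confirm that exported lines split into the expected fields. The regular expression below is inferred from the sample line in the output format section; it is an illustrative sketch, not a parser shipped with this project, and `parse_line` is a hypothetical name. It assumes the trace and span tags, when present, appear in that order.

```python
import re

LINE_RE = re.compile(
    r"^<(?P<priority>\d+)>(?P<version>\d+) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) "
    r"(?P<app_name>[^\s\[]+)\[(?P<pid>\d+)\]: "
    r"(?:\[trace=(?P<trace_id>[^\]]*)\] )?"
    r"(?:\[span=(?P<span_id>[^\]]*)\] )?"
    r"(?P<message>.*)$"
)


def parse_line(line: str):
    """Split one exported line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


sample = ("<6>1 2026-05-12T15:30:00.123456+08:00 hostname app_name[12345]: "
          "[trace=abc123] [span=def456] log message")
print(parse_line(sample))
```

Fields that are absent from a line, such as `trace_id` on a row without tracing data, come back as `None`.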
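
---

## Troubleshooting

| Problem | How to investigate |
|---------|--------------------|
| Service fails to start | Run `journalctl -u db_log_exporter -e` to read the error log |
| Connection refused | Confirm the database accepts connections from this IP; check firewall and security group rules |
| Permission denied | Confirm the user running the service can write to `output_dir` and `checkpoint_dir` |
| Duplicate logs | Delete the corresponding checkpoint file and restart; the program then pulls from the beginning |
| Garbled Chinese text | Confirm the database character set is `utf8mb4` (MySQL) or `UTF8` (PostgreSQL) |
| Connection timeout | Add `connect_timeout: 10` to the entry under `databases` |
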
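The character set and timeout rows both concern how a `databases` entry is turned into driver connection arguments. The sketch below shows one plausible translation; `connect_kwargs` is a hypothetical helper and the 10 second fallback is an assumption. The keyword names themselves are real: `pymysql.connect` accepts `database`, `charset`, and `connect_timeout`, and `psycopg2.connect` accepts `dbname` and `connect_timeout`.

```python
def connect_kwargs(db_cfg: dict) -> dict:
    """Translate one `databases` entry into keyword arguments for the driver."""
    kwargs = {k: db_cfg[k] for k in ("host", "port", "user", "password") if k in db_cfg}
    # Assumption: fall back to 10 seconds when the config does not set a timeout.
    kwargs["connect_timeout"] = int(db_cfg.get("connect_timeout", 10))
    if db_cfg["type"] == "mysql":
        kwargs["database"] = db_cfg["database"]
        kwargs["charset"] = db_cfg.get("charset", "utf8mb4")
    elif db_cfg["type"] == "postgresql":
        kwargs["dbname"] = db_cfg["dbname"]
    else:
        raise ValueError(f"unsupported database type: {db_cfg['type']!r}")
    return kwargs


mysql_cfg = {
    "type": "mysql", "host": "192.168.1.100", "port": 3306,
    "user": "log_reader", "password": "your_password", "database": "app_logs",
}
print(connect_kwargs(mysql_cfg))
# The result would be passed on as pymysql.connect(**kwargs)
# or psycopg2.connect(**kwargs), depending on the type.
```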

---

## Directory layout

```
db_log_exporter/
├── db_log_exporter.py          # main program (Python daemon)
├── config.yaml.example         # example configuration file
├── requirements.txt            # Python dependencies
├── db_log_exporter.service     # systemd service file
├── setup.sh                    # automated install script
└── README.md                   # this file
```

---

## Command-line options

```
-c, --config      Path to the YAML configuration file (required)
--once            Run one polling round and exit (suits cron)
--dry-run         Only test the database connections; write no files
--log-level       Log level: DEBUG|INFO|WARNING|ERROR (default INFO)
--log-file        File for this program's own log output (default stdout)
```

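For readers who want to script around the tool, the options listed above map directly onto an `argparse` definition. This is a reconstruction from the option list, not the program's actual source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented command-line options."""
    parser = argparse.ArgumentParser(prog="db_log_exporter.py")
    parser.add_argument("-c", "--config", required=True,
                        help="path to the YAML configuration file")
    parser.add_argument("--once", action="store_true",
                        help="run one polling round and exit")
    parser.add_argument("--dry-run", action="store_true",
                        help="only test the database connections; write no files")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="log level for this program's own messages")
    parser.add_argument("--log-file", default=None,
                        help="file for this program's own log (default: stdout)")
    return parser


args = build_parser().parse_args(["-c", "/etc/db_log_exporter/config.yaml", "--once"])
print(args.config, args.once, args.dry_run, args.log_level)
```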

---

## Security recommendations

1. **Do not store database passwords in plain text in the configuration.** Use environment variables or systemd secrets:
   ```bash
   # /etc/db_log_exporter/env
   DB_PASSWORD=your_password
   ```
   Then reference the variable in `config.yaml` as `${DB_PASSWORD}`.

2. **Run the service as a least-privileged user**, not as root:
   ```bash
   useradd -r -s /sbin/nologin db_exporter
   chown -R db_exporter:db_exporter /var/log/db_exporter /var/lib/db_exporter
   # Set User=db_exporter in the service file
   ```

3. **Rotate log files regularly** so the disk does not fill up:
   ```bash
   # /etc/logrotate.d/db_log_exporter
   /var/log/db_exporter/*.log {
       daily
       rotate 7
       compress
       delaycompress
       missingok
       notifempty
       # The owner should match the service user from item 2;
       # use "root root" if the service runs as root.
       create 0644 db_exporter db_exporter
       sharedscripts
       postrotate
           systemctl reload db_log_exporter > /dev/null 2>&1 || true
       endscript
   }
   ```

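The `${DB_PASSWORD}` reference in item 1 implies that placeholders are substituted after the YAML file is loaded. A minimal sketch of such a substitution over an already-parsed config is shown below. `expand_env` is a hypothetical helper, and raising an error on an unset variable is a design choice made here, not documented behaviour of the exporter.

```python
import os
import re

_VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value):
    """Recursively replace ${NAME} in strings with the value of the environment variable."""
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    if isinstance(value, str):
        def replace(match):
            name = match.group(1)
            if name not in os.environ:
                # Failing loudly avoids connecting with an empty password.
                raise KeyError(f"environment variable {name} is not set")
            return os.environ[name]
        return _VAR_RE.sub(replace, value)
    return value  # numbers, booleans, None are left untouched


os.environ["DB_PASSWORD"] = "your_password"  # normally provided by the service environment
config = {"databases": {"mysql_prod": {"port": 3306, "password": "${DB_PASSWORD}"}}}
print(expand_env(config))
```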