Compare commits
10 Commits
0990cb9b66
...
c621228996
| SHA1 |
|---|
| c621228996 |
| d6def4acd0 |
| b5bee828ca |
| d163726756 |
| ba7f8c0456 |
| 69f883f0a5 |
| c9ef0ccca3 |
| 02bbaaa788 |
| 75ee065752 |
| 47918d49a1 |
3
.github/workflows/build-and-run.yml
vendored
@@ -8,7 +8,8 @@ on:
 jobs:
   build-and-run:
     runs-on: ubuntu-latest
-
+    permissions:
+      contents: write
     steps:
       - name: 1. Checkout repository
         uses: actions/checkout@v4
2
.github/workflows/docker-publish.yml
vendored
@@ -3,7 +3,7 @@ name: Publish Docker image
 on:
   push:
     branches:
-      - main
+      - dev
 
 jobs:
   publish:
239
README.md
@ -4,73 +4,77 @@ EndOfYear 点燃个人博客的年度辉煌!
|
|||||||
|
|
||||||
![EndOfYear](static/endofyear.jpg)
|
![EndOfYear](static/endofyear.jpg)
|
||||||
|
|
||||||
## 用法
|
## 使用方法
|
||||||
|
|
||||||
### 要求
|
### 要求
|
||||||
|
|
||||||
- RSS 源必须输出文章全部内容,否则数据分析不准确。
|
- **确保 RSS 源提供完整的文章内容**:为了保证数据分析的准确性,RSS 源需要输出文章的全部内容。
|
||||||
- Github 运行可能无法访问 RSS 源,请使用本地 Docker 运行。
|
- **在 GitHub 上运行**:由于 GitHub 运行环境可能无法访问某些 RSS 源,请考虑在本地 Docker 环境中运行。
|
||||||
- 如果生成年度报告,请结合博客实际情况设置 RSS 输出文章数量。
|
- **适当设置 RSS 文章数量**:如果您的目的是生成年度报告,请根据博客的实际情况调整 RSS 输出的文章数量。
|
||||||
|
|
||||||
### Github
|
### 在 GitHub 上的使用步骤
|
||||||
|
|
||||||
1. Fork 项目到个人仓库。
|
1. 将项目 Fork 到您的个人仓库。
|
||||||
2. 手动配置仓库的 Workflow permissions 设置为 **Read and write permissions**,否则无法写入 html 分支。
|
|
||||||
1. 导航到 **Settings**(设置)选项卡。
|
|
||||||
2. 在左侧导航栏中,点击 **Actions**(操作)。
|
|
||||||
3. 在 **General**(常规)页面下滑,找到 **Workflow permissions**(工作流权限)。
|
|
||||||
4. 在 **Workflow permissions** 中,选择 **Read and write permissions**(读写权限)。
|
|
||||||
5. 最后点击 **Save**(保存)。
|
|
||||||
3. 在仓库首页打开目录下的 `config.ini` 配置文件,点击右上角工具栏的 **🖋️(钢笔)** 图标,在线编辑文件。
|
|
||||||
- `web`:配置为 `false`。
|
|
||||||
- `rss`:配置为 RSS 地址。
|
|
||||||
|
|
||||||
```ini
|
2. 在仓库首页,找到并打开 `config.ini` 文件。点击右上角的 🖋️ 符号进行在线编辑。
|
||||||
[default]
|
|
||||||
web = false
|
|
||||||
|
|
||||||
[blog]
|
- `web` 字段:将其**设置为 `false` 以启用静态网站模式**(适用于 GitHub 运行)。
|
||||||
rss = https://blog.7wate.com/rss.xml
|
- `rss` 字段:填写您的 RSS 源地址,确保源地址提供全文输出。
|
||||||
data =
|
|
||||||
```
|
|
||||||
|
|
||||||
4. 点击右上角的 **Commit changes** 提交到 `main` 分支,会自动运行 Actions。
|
```ini
|
||||||
5. 等待 Actions 运行成功,将会部署静态网站文件至 `html` 分支。
|
[default]
|
||||||
6. 开启 GitHub 仓库的 Pages 功能,默认为根目录。
|
web = false
|
||||||
7. 访问个人网址,就可以看到啦~
|
|
||||||
|
|
||||||
### Docker
|
[blog]
|
||||||
|
rss = https://blog.7wate.com/rss.xml
|
||||||
|
```
|
||||||
|
|
||||||
1. 拉取 [endofyear](https://hub.docker.com/r/sevenwate/endofyear) 最新镜像。
|
3. 编辑完成后,点击页面右上角的 **Commit changes** 将更改提交到 `main` 分支。
|
||||||
|
|
||||||
```shell
|
4. 提交后,GitHub Actions 会自动运行并生成静态网站文件,最终推送至 `html` 分支。
|
||||||
docker pull sevenwate/endofyear:latest
|
|
||||||
```
|
|
||||||
|
|
||||||
2. 映射容器 7777 端口至宿主机端口,指定 `rss` 环境变量,然后运行 Docker。
|
5. 在 GitHub 仓库的 Settings 中开启 Pages 功能,并将源设置为 `html` 分支的根目录。
|
||||||
|
|
||||||
```shell
|
6. 稍后访问 GitHub Pages 分配的网址,即可看到生成的内容。
|
||||||
# 请将 https://blog.7wate.com/rss.xml 替换为自己的 RSS 地址。
|
|
||||||
docker run -p 7777:7777 --env rss=https://blog.7wate.com/rss.xml sevenwate/endofyear:latest
|
|
||||||
```
|
|
||||||
|
|
||||||
3. 访问网址 `localhost:7777`
|
### 使用 Docker
|
||||||
|
|
||||||
|
1. **拉取 Docker 镜像**:从 [endofyear](https://hub.docker.com/r/sevenwate/endofyear) Docker Hub 页面拉取最新镜像。
|
||||||
|
|
||||||
|
```shell
|
||||||
|
docker pull sevenwate/endofyear:latest
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **运行 Docker 容器**:映射容器的 7777 端口到宿主机的端口,并设置 `rss` 环境变量。
|
||||||
|
|
||||||
|
```shell
|
||||||
|
# 将 RSS 地址替换为您自己的。
|
||||||
|
docker run -p 7777:7777 --env rss=https://blog.7wate.com/rss.xml sevenwate/endofyear:latest
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **访问本地网站**:在浏览器中访问 `localhost:7777`,即可查看结果。
|
||||||
|
|
||||||
## Q&A
|
## Q&A
|
||||||
|
|
||||||
### Github Actions 运行失败
|
### Github Actions 运行失败
|
||||||
|
|
||||||
请查阅 Actions 的第六步输出的 Log 日志排错。
|
请首先检查 Actions 日志的第六步输出,这里包含了导致运行失败的详细错误信息。
|
||||||
|
|
||||||
### Docker 运行无法访问 Web 服务
|
### Docker 运行无法访问 Web 服务
|
||||||
|
|
||||||
1. 请检查**容器映射端口**至宿主机。
|
如果在使用 Docker 时无法访问 Web 服务,请按照以下步骤进行故障排除:
|
||||||
2. 请检查是否配置 **rss 环境变量**。
|
|
||||||
3. 请查看 Docker **运行日志**。
|
1. **检查端口映射**:确保您已正确设置容器的端口映射到宿主机。
|
||||||
|
2. **确认 rss 环境变量**:请检查是否已正确配置 `rss` 环境变量。
|
||||||
|
3. **查看 Docker 日志**:如果以上步骤均无法解决问题,请查看 Docker 容器的运行日志以获取更多信息。
|
||||||
|
|
||||||
### 博客数据分析不准确
|
### 博客数据分析不准确
|
||||||
|
|
||||||
目前会根据个人时间进一步迭代,可以点个 Watch 订阅进度。
|
目前提供的博客数据分析功能已经相对完善且准确。未来,我计划结合 AI 进一步优化分析效果,以提供更精准的数据维度。
|
||||||
|
|
||||||
|
### 主题不够丰富
|
||||||
|
|
||||||
|
由于个人时间有限,目前**我仅能承诺每年末前更新一款主题。**尽管如此,我仍然致力于为您的写作之旅带来愉悦和丰富的体验,并感谢您的理解和支持!
|
||||||
|
|
||||||
## 流程
|
## 流程
|
||||||
|
|
||||||
@ -78,50 +82,115 @@ EndOfYear 通过 RSS 获取博客文章数据,对文章数据进行统计、
|
|||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
sequenceDiagram
|
sequenceDiagram
|
||||||
actor User
|
participant User
|
||||||
participant Flask
|
participant Flask as Flask Server
|
||||||
participant Config
|
participant Config as Configuration
|
||||||
participant Generator
|
participant Generator as Data Generator
|
||||||
participant Scraper
|
participant Scraper as Data Scraper
|
||||||
participant Analyzer
|
participant Analyzer as Data Analyzer
|
||||||
|
|
||||||
|
User->>Flask: Access Service
|
||||||
|
Flask->>User: Redirect to painting theme
|
||||||
|
User->>Flask: Request painting theme
|
||||||
|
Flask->>Config: Invoke Data Generator
|
||||||
|
Config->>Generator: Run Data Scraper
|
||||||
|
Generator->>Scraper: Fetch RSS data
|
||||||
|
Scraper->>Analyzer: Analyze Data
|
||||||
|
Analyzer->>Scraper: Return Analyzed Data
|
||||||
|
Scraper->>Generator: Send Structured Data
|
||||||
|
Generator->>Flask: Return Data to Flask
|
||||||
|
Flask->>Flask: Render HTML Page with Data
|
||||||
|
Flask->>User: Return Rendered HTML Page
|
||||||
|
|
||||||
User ->> Flask: Access service
|
|
||||||
Flask ->> Config: Check cache
|
|
||||||
activate Config
|
|
||||||
alt Cache exists
|
|
||||||
Config -->> Flask: Return cached data
|
|
||||||
else Cache does not exist
|
|
||||||
Config ->> Generator: Run data generator
|
|
||||||
activate Generator
|
|
||||||
Generator ->> Scraper: Run data scraping
|
|
||||||
activate Scraper
|
|
||||||
Scraper -->> Generator: Return scraped data
|
|
||||||
deactivate Scraper
|
|
||||||
Generator ->> Analyzer: Run data analysis
|
|
||||||
activate Analyzer
|
|
||||||
Analyzer -->> Generator: Return analyzed data
|
|
||||||
deactivate Analyzer
|
|
||||||
Generator -->> Config: Return organized data
|
|
||||||
deactivate Generator
|
|
||||||
Config -->> Flask: Return data
|
|
||||||
end
|
|
||||||
Flask -->> User: Return HTML page
|
|
||||||
deactivate Config
|
|
||||||
```
|
```
|
||||||
|
|
||||||
1. 用户访问 Flask 服务。
|
1. 用户访问 Flask 服务。
|
||||||
2. Flask 检查缓存是否存在。
|
2. Flask 根路由跳转 painting 主题。
|
||||||
- 如果缓存存在,Flask直接返回缓存数据。
|
|
||||||
- 如果缓存不存在,继续下一步。
|
|
||||||
3. Config 模块运行数据生成器(Generator)。
|
3. Config 模块运行数据生成器(Generator)。
|
||||||
4. Generator 模块运行数据抓取器(Scraper)来获取RSS数据。
|
4. Generator 模块运行数据抓取器(Scraper)来获取RSS数据。
|
||||||
5. Scraper 将抓取的数据返回给 Generator。
|
5. Scraper 将抓取的数据结合(Analyzer)对数据进行分析。
|
||||||
6. Generator 运行数据分析器(Analyzer)对数据进行分析。
|
6. Analyzer 将分析后的数据返回给 Scraper。
|
||||||
7. Analyzer 将分析后的数据返回给 Generator。
|
7. Generator 整理(Scraper)结构化数据后将其返回给 Flask。
|
||||||
8. Generator 整理结构化数据后将其返回给 Flask,Config 模块。
|
8. Flask 使用(Generator)的数据渲染 HTML 页面。
|
||||||
9. Flask 使用整理后的数据渲染 HTML 页面。
|
9. Flask 返回渲染后的 HTML 页面给用户。
|
||||||
10. Flask 返回渲染后的 HTML 页面给用户。
|
|
||||||
|
|
||||||
|
## 主题开发
|
||||||
|
|
||||||
|
EndOfYear is written in Python and uses Flask with Jinja2 templates to render data. It currently provides four data models.
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
erDiagram
|
||||||
|
Site ||--o{ Generator : contains
|
||||||
|
Blog ||--o{ Generator : contains
|
||||||
|
Post ||--o{ Generator : contains
|
||||||
|
Custom ||--o{ Generator : contains
|
||||||
|
|
||||||
|
Site {
|
||||||
|
string service
|
||||||
|
string title
|
||||||
|
}
|
||||||
|
|
||||||
|
Blog {
|
||||||
|
string name
|
||||||
|
string link
|
||||||
|
int life
|
||||||
|
int article_count
|
||||||
|
int article_word_count
|
||||||
|
string top_post_keys
|
||||||
|
string category
|
||||||
|
}
|
||||||
|
|
||||||
|
Post {
|
||||||
|
string title
|
||||||
|
string content
|
||||||
|
string[] keys
|
||||||
|
string time
|
||||||
|
string date
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
Custom {
|
||||||
|
string yiyan
|
||||||
|
}
|
||||||
|
|
||||||
|
Generator {
|
||||||
|
Site site
|
||||||
|
Blog blog
|
||||||
|
Post special_post
|
||||||
|
Post sentiment_post
|
||||||
|
Post long_post
|
||||||
|
Post short_post
|
||||||
|
Custom custom
|
||||||
|
}
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
To develop a theme, use the Jinja2 template language together with the data below to build a custom theme.
|
||||||
|
|
||||||
|
| Data | Description |
|----------------|---------|
| site | Site data |
| blog | Blog data |
| special_post | Post published on a special date |
| sentiment_post | Post with the highest sentiment score |
| long_post | Longest post |
| short_post | Shortest post |
| custom | Custom data |
|
||||||
|
|
||||||
|
If you need additional data, extend the `custom` model, pass the values in `main.py`, and use them in the HTML template. A simple template example:
|
||||||
|
|
||||||
|
```html
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html lang="zh">
|
||||||
|
<head>
|
||||||
|
<meta charset="UTF-8">
|
||||||
|
<title>{{ site.title }}</title>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
亲爱的{{ blog.name }}
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
|
```
|
||||||
|
|
||||||
## 路线图
|
## 路线图
|
||||||
|
|
||||||
@ -129,15 +198,15 @@ EndOfYear 目前处于初始阶段,如果您有兴趣,可以为其做出贡
|
|||||||
|
|
||||||
### V1
|
### V1
|
||||||
|
|
||||||
- [ ] 对博客系统的数据源进行全面、规模性的测试。
|
- [x] 结合互联网公开博客的数据源对 EndOfYear 进行全面、规模性的测试。
|
||||||
- [ ] 进一步细化数据分析维度和数据颗粒度,精准描绘用户画像。
|
- [x] 默认主题进一步细化数据分析维度和数据颗粒度,精准描绘用户画像。
|
||||||
- [ ] 渲染数据的规范,约束主题开发,提高主题的兼容性。
|
- [x] EndOfYear 渲染数据的规范,约束主题开发,提高主题的兼容性。
|
||||||
- [ ] 剥离数据分析和主题,提供更好地适用方式。
|
- [x] 进一步丰富和完善主题。
|
||||||
|
|
||||||
### V2
|
### V2
|
||||||
|
|
||||||
- [ ] 进一步丰富和完善主题。
|
- [ ] Decouple themes and provide a better way to develop them.
|
||||||
- [ ] EndOfYear 项目展示首页,使用文档,主题开发等。
|
- [ ] EndOfYear 项目网站首页,使用文档,主题开发等。
|
||||||
- [ ] 实现轻量化的运行部署,一键运行。
|
- [ ] 实现轻量化的运行部署,一键运行。
|
||||||
- [ ] 探索以插件的方式附加到博客系统的方法。
|
- [ ] 探索以插件的方式附加到博客系统的方法。
|
||||||
|
|
||||||
|
@@ -3,4 +3,3 @@ web = true
 
 [blog]
 rss =
-data =
3117
data/stop_words.txt
66
main.py
@ -1,8 +1,11 @@
|
|||||||
from flask import Flask, render_template, redirect, url_for
|
from flask import Flask, render_template, redirect, url_for
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
|
from src import const
|
||||||
|
from src import models
|
||||||
|
from src import tools
|
||||||
from src.config import Config
|
from src.config import Config
|
||||||
from src.generator import build_data
|
from src.generator import Generator
|
||||||
|
|
||||||
app = Flask(__name__)
|
app = Flask(__name__)
|
||||||
logger.add("endofyear.log")
|
logger.add("endofyear.log")
|
||||||
@ -10,28 +13,63 @@ logger.add("endofyear.log")
|
|||||||
|
|
||||||
@app.route('/')
|
@app.route('/')
|
||||||
def home():
|
def home():
|
||||||
# 默认主题 painting
|
# 重定向 painting
|
||||||
return redirect(url_for('painting'))
|
return redirect(url_for('painting'))
|
||||||
|
|
||||||
|
|
||||||
@app.route('/painting')
|
@app.route('/painting')
|
||||||
def painting():
|
def painting():
|
||||||
if Config("config.ini").web_status:
|
# 读取配置文件
|
||||||
# web 服务
|
config = Config("config.ini")
|
||||||
# 如果数据存在,直接返回
|
|
||||||
if blog_data := Config("config.ini").blog_data:
|
# 站点数据
|
||||||
return render_template('painting.html', data=blog_data, web_status=1)
|
site = models.Site(
|
||||||
|
service=config.web_status,
|
||||||
|
title=const.SITE_NAME
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
# 自定义数据
|
||||||
|
custom = models.Custom(
|
||||||
|
yiyan=tools.get_yiyan()
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
# 初始化数据生成器
|
||||||
|
generator = Generator(config.rss_url)
|
||||||
|
logger.info(f"Site: {site}")
|
||||||
|
logger.info(f"Blog: {generator.blog()}")
|
||||||
|
logger.info(f"Special Post: {generator.special_post()}")
|
||||||
|
logger.info(f"Sentiment Post: {generator.sentiment_post()}")
|
||||||
|
logger.info(f"Long Post: {generator.long_post()}")
|
||||||
|
logger.info(f"Short Post: {generator.short_post()}")
|
||||||
|
|
||||||
|
# 服务模式
|
||||||
|
if config.web_status == const.SITE_SERVICE_STATIC:
|
||||||
|
# 静态网站模式
|
||||||
|
html_static_file = render_template('painting.html',
|
||||||
|
site=site,
|
||||||
|
blog=generator.blog(),
|
||||||
|
special_post=generator.special_post(),
|
||||||
|
sentiment_post=generator.sentiment_post(),
|
||||||
|
long_post=generator.long_post(),
|
||||||
|
short_post=generator.short_post(),
|
||||||
|
custom=custom
|
||||||
|
)
|
||||||
|
|
||||||
# 如果数据不存在,需要生成,并写入配置
|
|
||||||
return render_template('painting.html', data=build_data(), web_status=1)
|
|
||||||
else:
|
|
||||||
# Github 静态
|
|
||||||
# 数据需要生成,并写入静态文件
|
|
||||||
html_data = render_template('painting.html', data=build_data(), web_status=0)
|
|
||||||
with open("static/index.html", "w") as f:
|
with open("static/index.html", "w") as f:
|
||||||
f.write(html_data)
|
f.write(html_static_file)
|
||||||
|
|
||||||
return 'OK'
|
return 'OK'
|
||||||
|
else:
|
||||||
|
# web 模式
|
||||||
|
return render_template('painting.html',
|
||||||
|
site=site,
|
||||||
|
blog=generator.blog(),
|
||||||
|
special_post=generator.special_post(),
|
||||||
|
sentiment_post=generator.sentiment_post(),
|
||||||
|
long_post=generator.long_post(),
|
||||||
|
short_post=generator.short_post(),
|
||||||
|
custom=custom
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
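Review note on `painting()`: both the static branch and the web branch pass the same seven keyword arguments to `render_template`, and each `generator.*()` method is also called once more for logging. The sketch below shows one possible way to collect the template variables a single time. `build_context` and `StubGenerator` are hypothetical names that do not appear in the commit; the stub only stands in for `src.generator.Generator` so the example runs without Flask or an RSS feed.

```python
def build_context(site, custom, generator):
    """Collect the template variables once so every branch can reuse them."""
    return {
        "site": site,
        "blog": generator.blog(),
        "special_post": generator.special_post(),
        "sentiment_post": generator.sentiment_post(),
        "long_post": generator.long_post(),
        "short_post": generator.short_post(),
        "custom": custom,
    }


class StubGenerator:
    """Hypothetical stand-in for Generator; counts how often it is queried."""

    def __init__(self):
        self.calls = 0

    def _record(self, name):
        self.calls += 1
        return {"title": name}

    def blog(self):
        return self._record("blog")

    def special_post(self):
        return self._record("special_post")

    def sentiment_post(self):
        return self._record("sentiment_post")

    def long_post(self):
        return self._record("long_post")

    def short_post(self):
        return self._record("short_post")


stub = StubGenerator()
context = build_context(
    site={"service": 0, "title": "EndOfYear"},
    custom={"yiyan": "placeholder"},
    generator=stub,
)
```

With a dictionary like this, both branches could call `render_template('painting.html', **context)` and the log statements could read from `context` instead of querying the generator again.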
|
@@ -1,8 +1,9 @@
 beautifulsoup4==4.12.2
-blinker==1.6.3
+blinker==1.7.0
 certifi==2023.7.22
-charset-normalizer==3.3.1
+charset-normalizer==3.3.2
 click==8.1.7
+docopt==0.6.2
 feedparser==6.0.10
 Flask==3.0.0
 idna==3.4
145
src/analyzer.py
@ -1,5 +1,3 @@
|
|||||||
from typing import Any
|
|
||||||
|
|
||||||
import jieba.analyse
|
import jieba.analyse
|
||||||
import pytz
|
import pytz
|
||||||
from dateutil.parser import parse
|
from dateutil.parser import parse
|
||||||
@ -7,31 +5,52 @@ from loguru import logger
|
|||||||
from lunardate import LunarDate
|
from lunardate import LunarDate
|
||||||
from snownlp import SnowNLP
|
from snownlp import SnowNLP
|
||||||
|
|
||||||
|
from . import const
|
||||||
|
|
||||||
|
|
||||||
# 计算文本内容情感分数
|
# 计算文本内容情感分数
|
||||||
def analyze_sentiment(text):
|
def analyze_sentiment(keys):
|
||||||
"""
|
"""
|
||||||
博客文章情感分计算(有点问题,酌情使用)
|
博客文章情感分计算
|
||||||
:param text:文章文本
|
|
||||||
|
:param keys:文章关键字
|
||||||
:return:分数
|
:return:分数
|
||||||
"""
|
"""
|
||||||
s = SnowNLP(text)
|
score_lists = [SnowNLP(key).sentiments for key in keys]
|
||||||
return round(s.sentiments * 100)
|
all_score = sum(score_lists)
|
||||||
|
|
||||||
|
if len(score_lists) > 10:
|
||||||
|
max_score = max(score_lists)
|
||||||
|
min_score = min(score_lists)
|
||||||
|
average_score = (all_score - max_score - min_score) / (len(keys) - 2)
|
||||||
|
return int(average_score * 1000)
|
||||||
|
elif 10 > len(score_lists) > 6:
|
||||||
|
average_score = all_score / len(keys)
|
||||||
|
return int(average_score * 900)
|
||||||
|
elif 6 > len(score_lists) > 3:
|
||||||
|
average_score = all_score / len(keys)
|
||||||
|
return int(average_score * 800)
|
||||||
|
elif 3 > len(score_lists) > 0:
|
||||||
|
average_score = all_score / len(keys)
|
||||||
|
return int(average_score * 500)
|
||||||
|
else:
|
||||||
|
return 0
|
||||||
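Review note on the new `analyze_sentiment`: the chained comparisons `10 > n > 6`, `6 > n > 3` and `3 > n > 0` leave keyword counts of exactly 10, 6 and 3 unmatched, so those inputs fall through to `return 0`. A minimal sketch of gap-free bucketing follows. It takes the already computed scores as plain floats so that it runs without SnowNLP. The function name `bucketed_sentiment` is hypothetical, and assigning the boundary counts to the 7..10, 4..6 and 1..3 buckets is an assumption, since the commit does not say which side they belong to.

```python
def bucketed_sentiment(scores):
    """Scale the mean keyword sentiment by a factor that grows with sample size.

    `scores` is a list of floats in [0, 1], one per keyword.
    """
    n = len(scores)
    if n == 0:
        return 0

    total = sum(scores)
    if n > 10:
        # Drop the single highest and lowest score before averaging.
        trimmed_mean = (total - max(scores) - min(scores)) / (n - 2)
        return int(trimmed_mean * 1000)
    if n > 6:  # 7..10 keywords (assumed to include 10)
        return int(total / n * 900)
    if n > 3:  # 4..6 keywords (assumed to include 6)
        return int(total / n * 800)
    return int(total / n * 500)  # 1..3 keywords (assumed to include 3)
```

Ordering the checks from the largest bucket down means each count lands in exactly one branch, which avoids having to spell out both bounds.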
|
|
||||||
|
|
||||||
def classify_and_extract_keywords(text: str, topK: int, stopwords: str,
|
def extract_keywords(text,
|
||||||
tech_terms_file: str) -> tuple[None, list[Any]] | tuple[int, Any]:
|
topK,
|
||||||
|
stopwords):
|
||||||
"""
|
"""
|
||||||
博客文章关键字提取
|
文章关键字提取
|
||||||
:param text:文章文本
|
:param text:文章文本
|
||||||
:param topK:关键字数量,建议20个
|
:param topK:关键字数量
|
||||||
:param stopwords:停词文本,去掉无意义词组
|
:param stopwords:停词文本(去掉无意义词组)
|
||||||
:param tech_terms_file:专业词语,区分文章类目
|
|
||||||
:return:
|
:return:
|
||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
jieba.analyse.set_stop_words(stopwords)
|
jieba.analyse.set_stop_words(stopwords)
|
||||||
keywords = jieba.analyse.extract_tags(text, topK=topK)
|
keywords = jieba.analyse.extract_tags(text, topK=topK)
|
||||||
|
return keywords
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
logger.error(f"关键词提取出错:{e}")
|
logger.error(f"关键词提取出错:{e}")
|
||||||
return None, []
|
return None, []
|
||||||
@ -39,72 +58,52 @@ def classify_and_extract_keywords(text: str, topK: int, stopwords: str,
|
|||||||
logger.error(f"关键词提取出错:{e}")
|
logger.error(f"关键词提取出错:{e}")
|
||||||
return None, []
|
return None, []
|
||||||
|
|
||||||
|
|
||||||
|
def check_category(tech_terms_file, keywords):
|
||||||
|
"""
|
||||||
|
文章分类判断
|
||||||
|
:param keywords: 文章关键词
|
||||||
|
:param tech_terms_file: 分类词典文件
|
||||||
|
:return: 分类常量
|
||||||
|
"""
|
||||||
with open(tech_terms_file, 'r', encoding='utf-8') as f:
|
with open(tech_terms_file, 'r', encoding='utf-8') as f:
|
||||||
tech_terms_set = {line.strip().lower() for line in f}
|
tech_terms_set = {line.strip().lower() for line in f} # 读取分类词典文件,将其转化为小写并创建集合
|
||||||
|
|
||||||
for keyword in keywords:
|
for keyword in keywords:
|
||||||
if keyword.lower() in tech_terms_set:
|
if keyword.lower() in tech_terms_set: # 判断关键词是否在分类词典集合中
|
||||||
return 1, keywords
|
return const.BLOG_POST_CATEGORY_TECH # 若关键词存在,则返回技术类分类常量
|
||||||
|
|
||||||
return 2, keywords
|
return const.BLOG_POST_CATEGORY_LIFE # 若关键词不存在,则返回生活类分类常量
|
||||||
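Review note on `check_category`: the committed version opens and parses the terms file on every call. The sketch below separates loading from matching so that the set can be built once and reused. It is only an illustration under stated assumptions: `load_terms` is a hypothetical helper, the terms are passed in as an iterable of lines rather than a file path, and the two constants copy the values introduced in `src/const.py` by this commit.

```python
BLOG_POST_CATEGORY_LIFE = 1  # value from src/const.py in this commit
BLOG_POST_CATEGORY_TECH = 2  # value from src/const.py in this commit


def load_terms(lines):
    """Build a lower-cased term set, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def check_category(terms, keywords):
    """Return the tech constant if any keyword is a known term, else life."""
    if any(keyword.lower() in terms for keyword in keywords):
        return BLOG_POST_CATEGORY_TECH
    return BLOG_POST_CATEGORY_LIFE


terms = load_terms(["Python\n", "\n", "  Docker \n"])
```

Skipping blank lines keeps an empty string out of the set, so an empty keyword cannot accidentally classify a post as technical.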
|
|
||||||
|
|
||||||
def calculate_weight(time_str: str):
|
def calculate_weight(time_str: str) -> int:
|
||||||
"""
|
"""
|
||||||
博客文章特殊日期权重分数计算。
|
计算文章特殊日期的权重分数。
|
||||||
- 传统节假日 +10
|
- 传统节假日 +10
|
||||||
- 节假日 +7
|
- 节假日 +7
|
||||||
- 凌晨 +5
|
- 凌晨 +5
|
||||||
- 早上 +4
|
- 早上 +4
|
||||||
- 下午 +3
|
- 下午 +3
|
||||||
- 晚上 +2
|
- 晚上 +2
|
||||||
|
|
||||||
:param time_str: 时间字符串
|
:param time_str: 时间字符串
|
||||||
:return:总分数,特殊日期
|
:return: 总分数(整数)
|
||||||
"""
|
"""
|
||||||
dt = parse(time_str)
|
dt = parse(time_str)
|
||||||
dt = dt.astimezone(pytz.timezone('Asia/Shanghai'))
|
dt = dt.astimezone(pytz.timezone(const.TIME_ZONE))
|
||||||
|
|
||||||
weight = 0
|
weight = 0
|
||||||
date_str = ""
|
|
||||||
|
|
||||||
# 农历节日权重计算
|
|
||||||
LUNAR_HOLIDAYS = {
|
|
||||||
(1, 1): '春节',
|
|
||||||
(1, 15): '元宵节',
|
|
||||||
(2, 2): '龙抬头',
|
|
||||||
(5, 5): '端午节',
|
|
||||||
(7, 7): '七夕节',
|
|
||||||
(7, 15): '中元节',
|
|
||||||
(8, 15): '中秋节',
|
|
||||||
(9, 9): '重阳节',
|
|
||||||
(12, 8): '腊八节',
|
|
||||||
(12, 23): '小年',
|
|
||||||
(12, 30): '除夕'
|
|
||||||
}
|
|
||||||
|
|
||||||
|
# 计算农历节假日的权重
|
||||||
lunar_date = LunarDate.fromSolarDate(dt.year, dt.month, dt.day)
|
lunar_date = LunarDate.fromSolarDate(dt.year, dt.month, dt.day)
|
||||||
if (lunar_date.month, lunar_date.day) in LUNAR_HOLIDAYS:
|
if (lunar_date.month, lunar_date.day) in const.LUNAR_HOLIDAYS:
|
||||||
weight += 10
|
weight += 10
|
||||||
date_str = LUNAR_HOLIDAYS[(lunar_date.month, lunar_date.day)]
|
|
||||||
|
|
||||||
# 公历节日权重计算
|
# 计算公历节假日的权重
|
||||||
SOLAR_HOLIDAYS = {
|
if (dt.month, dt.day) in const.SOLAR_HOLIDAYS:
|
||||||
(1, 1): '元旦',
|
|
||||||
(2, 14): '情人节',
|
|
||||||
(3, 8): '国际妇女节',
|
|
||||||
(4, 4): '清明节',
|
|
||||||
(5, 1): '国际劳动节',
|
|
||||||
(10, 1): '国庆节',
|
|
||||||
(12, 13): '南京大屠杀纪念日',
|
|
||||||
(9, 18): '九一八事变纪念日',
|
|
||||||
(12, 7): '南京保卫战胜利纪念日',
|
|
||||||
(8, 15): '抗日战争胜利纪念日'
|
|
||||||
}
|
|
||||||
|
|
||||||
if (dt.month, dt.day) in SOLAR_HOLIDAYS:
|
|
||||||
weight += 7
|
weight += 7
|
||||||
date_str = SOLAR_HOLIDAYS[(dt.month, dt.day)]
|
|
||||||
|
|
||||||
|
# 计算时间节点的权重
|
||||||
if 22 <= dt.hour or dt.hour < 7:
|
if 22 <= dt.hour or dt.hour < 7:
|
||||||
weight += 5
|
weight += 5
|
||||||
elif 7 <= dt.hour < 12:
|
elif 7 <= dt.hour < 12:
|
||||||
@ -116,7 +115,25 @@ def calculate_weight(time_str: str):
|
|||||||
else:
|
else:
|
||||||
weight += 0
|
weight += 0
|
||||||
|
|
||||||
if not date_str:
|
return weight
|
||||||
date_str = f"{dt.month}月{dt.day}日"
|
|
||||||
|
|
||||||
return weight, date_str
|
|
||||||
|
def special_date_calculation(time_str):
|
||||||
|
"""
|
||||||
|
特殊日期计算。
|
||||||
|
:param time_str: 时间字符串
|
||||||
|
:return:总分数
|
||||||
|
"""
|
||||||
|
dt = parse(time_str)
|
||||||
|
dt = dt.astimezone(pytz.timezone(const.TIME_ZONE))
|
||||||
|
|
||||||
|
# 农历节假日计算
|
||||||
|
lunar_date = LunarDate.fromSolarDate(dt.year, dt.month, dt.day)
|
||||||
|
if (lunar_date.month, lunar_date.day) in const.LUNAR_HOLIDAYS:
|
||||||
|
return const.LUNAR_HOLIDAYS[(lunar_date.month, lunar_date.day)]
|
||||||
|
|
||||||
|
# 公历节假日计算
|
||||||
|
if (dt.month, dt.day) in const.SOLAR_HOLIDAYS:
|
||||||
|
return const.SOLAR_HOLIDAYS[(dt.month, dt.day)]
|
||||||
|
|
||||||
|
return f"{dt.month}月{dt.day}日"
|
||||||
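Review note on the split into `calculate_weight` and `special_date_calculation`: both functions now parse the timestamp, convert the time zone and look up the holiday tables separately, and the docstring of `special_date_calculation` still says `:return:总分数` although the function returns a date label. The sketch below illustrates the scoring rules from the docstring on the solar-calendar side only. Several parts are assumptions: a fixed UTC+8 offset replaces `pytz.timezone("Asia/Shanghai")` to stay within the standard library, the lunar lookup (which needs `lunardate`) is left out, the holiday table is a two-entry subset of `const.SOLAR_HOLIDAYS`, and the afternoon and evening hour boundaries are guesses because that part of the hunk is not shown in the diff.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Asia/Shanghai (assumption; the commit uses pytz).
SHANGHAI = timezone(timedelta(hours=8))

# Subset of const.SOLAR_HOLIDAYS, enough for the example.
SOLAR_HOLIDAYS = {
    (1, 1): "元旦",
    (10, 1): "国庆节",
}


def time_of_day_weight(hour):
    """Weight by local hour: late night 5, morning 4, afternoon 3, evening 2."""
    if hour >= 22 or hour < 7:
        return 5
    if 7 <= hour < 12:
        return 4
    if 12 <= hour < 18:  # boundary assumed, not visible in the diff
        return 3
    return 2  # 18..21, boundary assumed


def solar_weight_and_label(dt):
    """Return (weight, label) for an aware datetime, solar holidays only."""
    local = dt.astimezone(SHANGHAI)
    key = (local.month, local.day)
    weight = time_of_day_weight(local.hour)
    if key in SOLAR_HOLIDAYS:
        weight += 7
    label = SOLAR_HOLIDAYS.get(key, f"{local.month}月{local.day}日")
    return weight, label


national_day = solar_weight_and_label(datetime(2023, 9, 30, 17, 30, tzinfo=timezone.utc))
ordinary_day = solar_weight_and_label(datetime(2023, 3, 5, 2, 0, tzinfo=timezone.utc))
```

The first timestamp is 01:30 on 1 October in UTC+8, so it collects both the late-night and the holiday bonus. The second is 10:00 on an ordinary day and only gets the morning weight.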
|
@ -1,11 +1,11 @@
|
|||||||
import configparser
|
import configparser
|
||||||
import json
|
|
||||||
import os
|
import os
|
||||||
from urllib.parse import urlparse
|
from urllib.parse import urlparse
|
||||||
|
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
from src.tools import check_website_status
|
from . import const
|
||||||
|
from .tools import check_website_status
|
||||||
|
|
||||||
|
|
||||||
class Config:
|
class Config:
|
||||||
@ -31,8 +31,6 @@ class Config:
|
|||||||
logger.error(f"没有权限读取配置文件 {self.path}: {str(e)}")
|
logger.error(f"没有权限读取配置文件 {self.path}: {str(e)}")
|
||||||
raise
|
raise
|
||||||
|
|
||||||
logger.info(f"配置文件 {self.path} 加载成功")
|
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def rss_url(self):
|
def rss_url(self):
|
||||||
try:
|
try:
|
||||||
@ -71,30 +69,6 @@ class Config:
|
|||||||
|
|
||||||
return '.'.join(domain_parts[-2:])
|
return '.'.join(domain_parts[-2:])
|
||||||
|
|
||||||
@property
|
|
||||||
def blog_data(self):
|
|
||||||
try:
|
|
||||||
data = self.config.get('blog', 'data', fallback=None)
|
|
||||||
except configparser.NoSectionError:
|
|
||||||
logger.error('未找到 section 配置项,请检查拼写')
|
|
||||||
return None
|
|
||||||
|
|
||||||
if not data:
|
|
||||||
logger.error('data 配置值为空')
|
|
||||||
return None
|
|
||||||
|
|
||||||
return json.loads(data)
|
|
||||||
|
|
||||||
@blog_data.setter
|
|
||||||
def blog_data(self, value):
|
|
||||||
if not self.config.has_section('blog'):
|
|
||||||
self.config.add_section('blog')
|
|
||||||
|
|
||||||
self.config.set('blog', 'data', json.dumps(value))
|
|
||||||
|
|
||||||
with open(self.path, 'w') as configfile:
|
|
||||||
self.config.write(configfile)
|
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def web_status(self):
|
def web_status(self):
|
||||||
try:
|
try:
|
||||||
@ -105,10 +79,10 @@ class Config:
|
|||||||
|
|
||||||
if web_status is None:
|
if web_status is None:
|
||||||
logger.error('web 配置值为空')
|
logger.error('web 配置值为空')
|
||||||
return True
|
return const.SITE_SERVICE_WEB
|
||||||
|
|
||||||
if web_status == "True" or web_status == "true":
|
if web_status == "True" or web_status == "true" or web_status == "t" or web_status == "T":
|
||||||
return True
|
return const.SITE_SERVICE_WEB
|
||||||
|
|
||||||
if web_status == "False" or web_status == "false":
|
if web_status == "False" or web_status == "false" or web_status == "f" or web_status == "F":
|
||||||
return False
|
return const.SITE_SERVICE_STATIC
|
||||||
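Review note on `web_status`: the accepted spellings are now compared one literal at a time (`"True"`, `"true"`, `"t"`, `"T"`). A possible alternative is to normalise the string once and test set membership, as sketched below. Note that this accepts more spellings than the commit does (for example `TRUE`), that the `[default]` section name is taken from the README sample, and that treating unrecognised values as web mode is an assumption because that part of the property is outside the hunk. `parse_web_status` is a hypothetical helper name.

```python
import configparser

SITE_SERVICE_WEB = 1     # value from src/const.py in this commit
SITE_SERVICE_STATIC = 0  # value from src/const.py in this commit

TRUE_VALUES = {"true", "t"}
FALSE_VALUES = {"false", "f"}


def parse_web_status(raw):
    """Map the raw `web` option to a service-mode constant."""
    if raw is None:
        # Matches the commit: a missing value falls back to web mode.
        return SITE_SERVICE_WEB
    value = raw.strip().lower()
    if value in FALSE_VALUES:
        return SITE_SERVICE_STATIC
    if value in TRUE_VALUES:
        return SITE_SERVICE_WEB
    # Assumption: unrecognised values are treated as web mode.
    return SITE_SERVICE_WEB


SAMPLE_INI = """
[default]
web = false

[blog]
rss = https://blog.7wate.com/rss.xml
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)
mode = parse_web_status(parser.get("default", "web", fallback=None))
rss_url = parser.get("blog", "rss", fallback=None)
```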
|
50
src/const.py
Normal file
@ -0,0 +1,50 @@
|
|||||||
|
# 站点标题
|
||||||
|
SITE_NAME = "EndOfYear"
|
||||||
|
|
||||||
|
# 站点服务模式
|
||||||
|
SITE_SERVICE_WEB = 1
|
||||||
|
SITE_SERVICE_STATIC = 0
|
||||||
|
|
||||||
|
# 时区
|
||||||
|
TIME_ZONE = "Asia/Shanghai"
|
||||||
|
|
||||||
|
# 时间格式
|
||||||
|
FORMAT_TIME = "%Y-%m-%d %H:%M:%S"
|
||||||
|
|
||||||
|
# 博客文章分类-生活
|
||||||
|
BLOG_POST_CATEGORY_LIFE = 1
|
||||||
|
|
||||||
|
# 博客文章分类-技术
|
||||||
|
BLOG_POST_CATEGORY_TECH = 2
|
||||||
|
|
||||||
|
# 博客文章关键字数量
|
||||||
|
BLOG_MAX_KEYS = 7
|
||||||
|
|
||||||
|
# 农历节假日
|
||||||
|
LUNAR_HOLIDAYS = {
|
||||||
|
(1, 1): '春节',
|
||||||
|
(1, 15): '元宵节',
|
||||||
|
(2, 2): '龙抬头',
|
||||||
|
(5, 5): '端午节',
|
||||||
|
(7, 7): '七夕节',
|
||||||
|
(7, 15): '中元节',
|
||||||
|
(8, 15): '中秋节',
|
||||||
|
(9, 9): '重阳节',
|
||||||
|
(12, 8): '腊八节',
|
||||||
|
(12, 23): '小年',
|
||||||
|
(12, 30): '除夕'
|
||||||
|
}
|
||||||
|
|
||||||
|
# 公历节假日
|
||||||
|
SOLAR_HOLIDAYS = {
|
||||||
|
(1, 1): '元旦',
|
||||||
|
(2, 14): '情人节',
|
||||||
|
(3, 8): '妇女节',
|
||||||
|
(4, 4): '清明节',
|
||||||
|
(5, 1): '劳动节',
|
||||||
|
(10, 1): '国庆节',
|
||||||
|
(12, 13): '南京大屠杀纪念日',
|
||||||
|
(9, 18): '九一八事变纪念日',
|
||||||
|
(12, 7): '南京保卫战胜利纪念日',
|
||||||
|
(8, 15): '抗日战争胜利纪念日'
|
||||||
|
}
|
198
src/generator.py
@ -1,95 +1,119 @@
|
|||||||
from collections import Counter
|
from functools import lru_cache
|
||||||
|
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
from .analyzer import analyze_sentiment, calculate_weight, classify_and_extract_keywords
|
from . import models
|
||||||
from .config import Config
|
from . import scraper
|
||||||
from .scraper import Blog
|
|
||||||
|
|
||||||
|
|
||||||
def build_data():
|
@lru_cache(maxsize=None)
|
||||||
"""
|
class Generator:
|
||||||
目前只有一个主题,构建数据部分后期会再进行重构拆分
|
|
||||||
:return: 网页渲染数据
|
|
||||||
"""
|
|
||||||
# 读取配置
|
|
||||||
config = Config("config.ini")
|
|
||||||
|
|
||||||
# 创建博客对象
|
def __init__(self, rss):
|
||||||
try:
|
"""
|
||||||
my_blog = Blog(config.rss_url)
|
初始化Generator类
|
||||||
except Exception as e:
|
:param rss: RSS链接
|
||||||
logger.error(f"Feed 无法创建博客对象: {str(e)}")
|
"""
|
||||||
|
try:
|
||||||
|
self._my_blog = scraper.Blog(rss)
|
||||||
|
logger.debug(self._my_blog)
|
||||||
|
for i, post in enumerate(self._my_blog.post_lists, 1):
|
||||||
|
logger.info(f"Post #{i}:")
|
||||||
|
logger.info(post)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Generator 无法创建 Blog 对象: {str(e)}")
|
||||||
|
|
||||||
|
def blog(self):
|
||||||
|
"""
|
||||||
|
获取博客信息
|
||||||
|
:return: Blog字典
|
||||||
|
"""
|
||||||
|
return models.Blog(
|
||||||
|
name=self._my_blog.title,
|
||||||
|
link=self._my_blog.link,
|
||||||
|
life=self._my_blog.life,
|
||||||
|
article_count=self._my_blog.article_count,
|
||||||
|
article_word_count=self._my_blog.article_word_count,
|
||||||
|
top_post_keys=self._my_blog.keys,
|
||||||
|
category=self._my_blog.category
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
def special_post(self):
|
||||||
|
"""
|
||||||
|
获取特殊日期的文章
|
||||||
|
:return: Post字典
|
||||||
|
"""
|
||||||
|
max_item_special_date = self._get_post_with_max("special_date_score")
|
||||||
|
return models.Post(
|
||||||
|
title=max_item_special_date.title,
|
||||||
|
content=max_item_special_date.content,
|
||||||
|
keys=max_item_special_date.keys,
|
||||||
|
time=max_item_special_date.time,
|
||||||
|
date=max_item_special_date.date
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
def sentiment_post(self):
|
||||||
|
"""
|
||||||
|
获取情感最优文章
|
||||||
|
:return: Post字典
|
||||||
|
"""
|
||||||
|
max_item_sentiment = self._get_post_with_max("sentiment_score")
|
||||||
|
return models.Post(
|
||||||
|
title=max_item_sentiment.title,
|
||||||
|
content=max_item_sentiment.content,
|
||||||
|
keys=max_item_sentiment.keys,
|
||||||
|
time=max_item_sentiment.time,
|
||||||
|
date=max_item_sentiment.date
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
def long_post(self):
|
||||||
|
"""
|
||||||
|
获取最长文章数据
|
||||||
|
:return: Post字典
|
||||||
|
"""
|
||||||
|
max_item_long = self._get_post_with_max("word_count")
|
||||||
|
return models.Post(
|
||||||
|
title=max_item_long.title,
|
||||||
|
content=max_item_long.content,
|
||||||
|
keys=max_item_long.keys,
|
||||||
|
time=max_item_long.time,
|
||||||
|
date=max_item_long.date,
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
def short_post(self):
|
||||||
|
"""
|
||||||
|
获取最短文章数据
|
||||||
|
:return: Post字典
|
||||||
|
"""
|
||||||
|
max_item_short = self._get_post_with_min("word_count")
|
||||||
|
return models.Post(
|
||||||
|
title=max_item_short.title,
|
||||||
|
content=max_item_short.content,
|
||||||
|
keys=max_item_short.keys,
|
||||||
|
time=max_item_short.time,
|
||||||
|
date=max_item_short.date,
|
||||||
|
).to_dict()
|
||||||
|
|
||||||
|
def _get_post_with_max(self, score_attr):
|
||||||
|
"""
|
||||||
|
获取具有最大属性值的文章
|
||||||
|
:param score_attr: 属性
|
||||||
|
:return:
|
||||||
|
"""
|
||||||
|
max_score = max(getattr(post, score_attr) for post in self._my_blog.post_lists)
|
||||||
|
max_posts = [post for post in self._my_blog.post_lists if getattr(post, score_attr) == max_score]
|
||||||
|
if max_posts:
|
||||||
|
return max_posts[0]
|
||||||
return None
|
return None
|
||||||
|
|
||||||
logger.debug(my_blog)
|
def _get_post_with_min(self, score_attr):
|
||||||
|
"""
|
||||||
# 构建博客基本数据
|
获取具有最小属性值的文章
|
||||||
data = {
|
:param score_attr:
|
||||||
"blog_name": my_blog.title,
|
:return:
|
||||||
"blog_link": my_blog.link,
|
"""
|
||||||
"blog_article_count": my_blog.article_count,
|
min_score = min(getattr(post, score_attr) for post in self._my_blog.post_lists)
|
||||||
"blog_article_word_count": my_blog.article_word_count,
|
min_posts = [post for post in self._my_blog.post_lists if getattr(post, score_attr) == min_score]
|
||||||
}
|
if min_posts:
|
||||||
|
return min_posts[0]
|
||||||
if my_blog.life is None:
|
return None
|
||||||
data.update({
|
|
||||||
"blog_life": 0
|
|
||||||
})
|
|
||||||
else:
|
|
||||||
data.update({
|
|
||||||
"blog_life_year": my_blog.life // 365,
|
|
||||||
"blog_life_day": my_blog.life % 365,
|
|
||||||
})
|
|
||||||
|
|
||||||
# 博客文章处理
|
|
||||||
for i, post in enumerate(my_blog.post_lists(), 1):
|
|
||||||
# 情感分
|
|
||||||
post.score = analyze_sentiment(post.content)
|
|
||||||
# 分类, 关键字
|
|
||||||
post.category, post.keys = classify_and_extract_keywords(text=post.content, topK=21,
|
|
||||||
stopwords='data/stop_words.txt',
|
|
||||||
tech_terms_file='data/tech_terms.txt')
|
|
||||||
# 权重, 日子计算
|
|
||||||
post.weight, post.date = calculate_weight(post.time)
|
|
||||||
|
|
||||||
logger.info(f"Post #{i}:")
|
|
||||||
logger.info(post)
|
|
||||||
|
|
||||||
# 博客文章权重计算
|
|
||||||
weights = [post.weight for post in my_blog.post_lists()]
|
|
||||||
max_weight = max(weights)
|
|
||||||
max_item = [post for post in my_blog.post_lists() if post.weight == max_weight][0]
|
|
||||||
|
|
||||||
data.update({
|
|
||||||
"blog_title": max_item.title,
|
|
||||||
"blog_content": max_item.content[0:50],
|
|
||||||
"blog_content_date": max_item.date,
|
|
||||||
})
|
|
||||||
|
|
||||||
# 暂时只有一个主题
|
|
||||||
# 博客关键词计算 5 个
|
|
||||||
all_keys = []
|
|
||||||
for post in my_blog.post_lists():
|
|
||||||
all_keys.extend(post.keys)
|
|
||||||
|
|
||||||
keyword_counts = Counter(all_keys)
|
|
||||||
top_keywords = keyword_counts.most_common(5)
|
|
||||||
data.update({
|
|
||||||
"blog_top_keywords": top_keywords
|
|
||||||
})
|
|
||||||
|
|
||||||
# 博客分类计算
|
|
||||||
categories = [post.category for post in my_blog.post_lists()]
|
|
||||||
cat_counts = Counter(categories)
|
|
||||||
most_common_cat = cat_counts.most_common(1)[0][0]
|
|
||||||
|
|
||||||
data.update({
|
|
||||||
"blog_category": "技术" if most_common_cat == 1 else "生活"
|
|
||||||
})
|
|
||||||
|
|
||||||
# 输出
|
|
||||||
logger.debug(data)
|
|
||||||
# 写入 config.ini 避免重复计算
|
|
||||||
config.blog_data = data
|
|
||||||
return data
|
|
||||||
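Review note on `_get_post_with_max` and `_get_post_with_min`: each helper scans the post list twice, and calling `max()` or `min()` on an empty list raises `ValueError` before the `return None` fallback can be reached. A minimal sketch of a single-pass variant is shown below. The function names are hypothetical, and the `SimpleNamespace` objects merely stand in for the scraper's post objects.

```python
from operator import attrgetter
from types import SimpleNamespace


def post_with_max(posts, attr):
    """Return the first post with the largest `attr`, or None if there are none."""
    return max(posts, key=attrgetter(attr), default=None)


def post_with_min(posts, attr):
    """Return the first post with the smallest `attr`, or None if there are none."""
    return min(posts, key=attrgetter(attr), default=None)


posts = [
    SimpleNamespace(title="a", word_count=10),
    SimpleNamespace(title="b", word_count=30),
    SimpleNamespace(title="c", word_count=30),
    SimpleNamespace(title="d", word_count=5),
]
longest = post_with_max(posts, "word_count")
shortest = post_with_min(posts, "word_count")
```

On ties, `max` and `min` return the first matching element, which is the same post the committed helpers pick with `max_posts[0]` and `min_posts[0]`.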
|
80
src/models.py
Normal file
@ -0,0 +1,80 @@
from dataclasses import dataclass
from enum import Enum
from typing import List

@dataclass
class Site:
    """
    Site data model
    - service: service mode
    - title: site title
    """
    service: int
    title: str

    def to_dict(self) -> dict:
        """
        Convert the Site object to a dictionary
        """
        return {k: v if not isinstance(v, Enum) else v.value for k, v in vars(self).items()}

@dataclass
class Blog:
    """
    Blog data model
    - name: name
    - link: link
    - life: days since the domain was registered
    - article_count: total number of posts
    - article_word_count: total word count across all posts
    - top_post_keys: blog keywords
    - category: blog category
    """
    name: str
    link: str
    life: int
    article_count: int
    article_word_count: int
    top_post_keys: List[str]
    category: int

    def to_dict(self) -> dict:
        """
        Convert the Blog object to a dictionary
        """
        return {k: v if not isinstance(v, Enum) else v.value for k, v in vars(self).items()}

@dataclass
class Post:
    """
    Post data model
    - title: title
    - content: content
    - keys: list of keywords
    - date: date string
    """
    title: str
    content: str
    keys: List[str]
    time: str
    date: str

    def to_dict(self) -> dict:
        """
        Convert the Post object to a dictionary
        """
        return {k: v if not isinstance(v, Enum) else v.value for k, v in vars(self).items()}

@dataclass
class Custom:
    """
    Custom data model
    - yiyan: a Hitokoto (一言) quote
    """
    yiyan: str

    def to_dict(self) -> dict:
        """
        Convert the Custom object to a dictionary
        """
        return vars(self)
|
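Review note: three of the four models in src/models.py repeat the same Enum-aware `to_dict` body. The sketch below shows one way that logic could be shared. It is not part of this change: `DictMixin`, `Category`, and `Tagged` are hypothetical names used only for illustration, and the enum values are assumptions.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Category(Enum):
    # Hypothetical enum; the real category constants live in src/const.py.
    LIFE = 0
    TECH = 1


class DictMixin:
    """Hypothetical helper: a single to_dict() shared by every model."""

    def to_dict(self) -> dict:
        result = {}
        for f in fields(self):  # works because subclasses are dataclasses
            value = getattr(self, f.name)
            # Unwrap Enum members to their raw value, as the models above do.
            result[f.name] = value.value if isinstance(value, Enum) else value
        return result


@dataclass
class Site(DictMixin):
    service: int
    title: str


@dataclass
class Tagged(DictMixin):
    # Demo-only model that actually carries an Enum field.
    name: str
    category: Category


print(Site(service=1, title="EndOfYear").to_dict())
print(Tagged(name="demo", category=Category.TECH).to_dict())
```

With a mixin like this, each model would only declare its fields; whether that is worth the extra indirection for four small classes is a judgment call.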
219
src/scraper.py
@ -1,85 +1,139 @@
|
|||||||
|
import re
|
||||||
|
from collections import Counter
|
||||||
|
|
||||||
import feedparser
|
import feedparser
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
|
from . import analyzer
|
||||||
|
from . import const
|
||||||
from . import tools
|
from . import tools
|
||||||
|
|
||||||
|
|
||||||
class Blog:
|
class Blog:
|
||||||
def __init__(self, url):
|
def __init__(self, rss):
|
||||||
try:
|
try:
|
||||||
self.feed = feedparser.parse(url)
|
# 解析RSS feed
|
||||||
|
self._feed = feedparser.parse(rss)
|
||||||
|
# 解析feed中的所有文章
|
||||||
|
self._posts = [Post(entry) for entry in self._feed.entries]
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f'解析 RSS feed 时发生错误: {str(e)}')
|
logger.error(f'Feedparser 解析 RSS feed 时发生错误: {str(e)}')
|
||||||
raise
|
raise
|
||||||
self.posts = [Post(entry) for entry in self.feed.entries]
|
|
||||||
|
|
||||||
def _get_feed_field(self, field):
|
def _get_feed_field(self, field):
|
||||||
"""
|
if field_value := self._feed.feed.get(field):
|
||||||
从 RSS feed 中获取特定字段
|
return field_value
|
||||||
"""
|
logger.warning(f'Feedparser {field} 字段不存在!')
|
||||||
field_value = self.feed.feed.get(field)
|
return ""
|
||||||
if field_value is None:
|
|
||||||
logger.warning(f'{field} 字段不存在!')
|
|
||||||
return field_value
|
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def title(self):
|
def title(self):
|
||||||
return self._get_feed_field('title')
|
# 获取RSS feed的标题
|
||||||
|
return self._feed.feed.get('title')
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def link(self):
|
def link(self):
|
||||||
return self._get_feed_field('link')
|
# 获取RSS feed的链接
|
||||||
|
return self._feed.feed.get('link')
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def life(self):
|
def life(self):
|
||||||
domain = tools.get_domain(self.link)
|
# 获取RSS feed链接的域名存活时间
|
||||||
return tools.get_domain_life(domain)
|
return tools.get_domain_life(self.link)
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def article_count(self):
|
def article_count(self):
|
||||||
return len(self.posts)
|
# 获取文章数量
|
||||||
|
return len(self._posts) if self._posts else 0
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def article_word_count(self):
|
def article_word_count(self):
|
||||||
return sum(post.word_count for post in self.posts)
|
# 获取文章总字数
|
||||||
|
return sum(post.word_count for post in self._posts) if self._posts else 0
|
||||||
|
|
||||||
|
@property
|
||||||
|
def keys(self):
|
||||||
|
if self._posts:
|
||||||
|
# 提取所有关键字
|
||||||
|
all_keys = [key for post in self._posts for key in post.keys]
|
||||||
|
|
||||||
|
# 过滤出中文关键字
|
||||||
|
chinese_keys = [key for key in all_keys if re.search(r'[\u4e00-\u9fff]+', key)]
|
||||||
|
|
||||||
|
# 计算关键字出现的次数
|
||||||
|
keyword_counts = Counter(chinese_keys)
|
||||||
|
|
||||||
|
# 提取出现次数最多的关键字
|
||||||
|
top_keywords = keyword_counts.most_common(const.BLOG_MAX_KEYS)
|
||||||
|
|
||||||
|
return top_keywords
|
||||||
|
|
||||||
|
return []
|
||||||
|
|
||||||
|
@property
|
||||||
|
def category(self):
|
||||||
|
# 获取博客的分类
|
||||||
|
if self._posts:
|
||||||
|
# 如果博客有帖子
|
||||||
|
categories = [post.category for post in self._posts]
|
||||||
|
# 获取所有帖子的分类
|
||||||
|
cat_counts = Counter(categories)
|
||||||
|
# 统计每个分类的个数
|
||||||
|
most_common_cat = cat_counts.most_common(1)[0][0]
|
||||||
|
# 获取出现次数最多的分类
|
||||||
|
return most_common_cat
|
||||||
|
# 如果博客没有帖子
|
||||||
|
return const.BLOG_POST_CATEGORY_LIFE
|
||||||
|
|
||||||
|
@property
|
||||||
def post_lists(self):
|
def post_lists(self):
|
||||||
return self.posts
|
# 获取文章列表
|
||||||
|
return self._posts if self._posts else []
|
||||||
|
|
||||||
def __str__(self):
|
def __str__(self):
|
||||||
return f"Blog: {self.title}, Life:{self.life}, Count{self.article_count}. Word count:{self.article_word_count}"
|
return f"""
|
||||||
|
博客: {self.title}
|
||||||
|
链接: {self.link}
|
||||||
|
时间: {self.life} 天
|
||||||
|
文章: {self.article_count} 篇
|
||||||
|
字数: {self.article_word_count} 个
|
||||||
|
分类: {self.category}
|
||||||
|
关键字: {self.keys}
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
class Post:
|
class Post:
|
||||||
def __init__(self, entry):
|
def __init__(self, entry):
|
||||||
# 日期权重
|
|
||||||
self._weight = None
|
|
||||||
# 日子
|
|
||||||
self._date = None
|
|
||||||
# 情感分
|
|
||||||
self._score = None
|
|
||||||
# 关键字
|
|
||||||
self._keys = None
|
|
||||||
# 分类
|
|
||||||
self._category = None
|
|
||||||
self.entry = entry
|
self.entry = entry
|
||||||
|
# 文章内容
|
||||||
|
self._content = self._get_content()
|
||||||
|
# 文章时间
|
||||||
|
self._time = tools.format_datetime(self._get_entry_field('published'))
|
||||||
|
# 文章日期
|
||||||
|
self._date = analyzer.special_date_calculation(self._time)
|
||||||
|
# 特殊日期分
|
||||||
|
self._special_date_score = analyzer.calculate_weight(self._get_entry_field('published'))
|
||||||
|
# 关键字
|
||||||
|
self._keys = analyzer.extract_keywords(text=self._content,
|
||||||
|
topK=tools.get_multiple_of_100(self._content),
|
||||||
|
stopwords='data/stop_words.txt')
|
||||||
|
# 文章情感分
|
||||||
|
self._sentiment_score = analyzer.analyze_sentiment(self._keys)
|
||||||
|
# 分类
|
||||||
|
self._category = analyzer.check_category(tech_terms_file='data/tech_terms.txt', keywords=self._keys)
|
||||||
|
|
||||||
def _get_entry_field(self, field):
|
def _get_entry_field(self, field):
|
||||||
"""
|
return self.entry.get(field)
|
||||||
从 RSS entry 中获取特定字段
|
|
||||||
"""
|
|
||||||
field_value = self.entry.get(field)
|
|
||||||
if field_value is None:
|
|
||||||
pass
|
|
||||||
# logger.warning(f'{field} 字段不存在!')
|
|
||||||
return field_value
|
|
||||||
|
|
||||||
@property
|
def _get_content(self):
|
||||||
def title(self):
|
"""
|
||||||
return self._get_entry_field('title')
|
获取文章内容。
|
||||||
|
:return: 文章的描述或内容,根据以下规则:
|
||||||
@property
|
- 如果'content'字段存在,那么返回'content'字段的值。
|
||||||
def content(self):
|
- 如果'description'字段的长度小于128,并且'content'字段存在,那么返回'content'字段的值。
|
||||||
|
- 否则,返回'description'字段的值。
|
||||||
|
- 如果'description'和'content'字段都不存在,返回空字符串。
|
||||||
|
"""
|
||||||
description = self._get_entry_field('description')
|
description = self._get_entry_field('description')
|
||||||
content = self._get_entry_field('content')
|
content = self._get_entry_field('content')
|
||||||
if content:
|
if content:
|
||||||
@ -94,60 +148,61 @@ class Post:
|
|||||||
return description
|
return description
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def time(self):
|
def title(self):
|
||||||
return self._get_entry_field('published')
|
# 获取文章标题
|
||||||
|
return self._get_entry_field('title')
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def link(self):
|
def content(self):
|
||||||
return self._get_entry_field('link')
|
# 获取文章内容
|
||||||
|
return self._content
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def word_count(self):
|
def word_count(self):
|
||||||
|
# 获取文章字数
|
||||||
return len(self.content) if self.content else 0
|
return len(self.content) if self.content else 0
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def keys(self):
|
def time(self):
|
||||||
return self._keys
|
# 获取文章时间
|
||||||
|
return self._time
|
||||||
@keys.setter
|
|
||||||
def keys(self, value):
|
|
||||||
self._keys = value
|
|
||||||
|
|
||||||
@property
|
|
||||||
def score(self):
|
|
||||||
return self._score
|
|
||||||
|
|
||||||
@score.setter
|
|
||||||
def score(self, value):
|
|
||||||
self._score = value
|
|
||||||
|
|
||||||
@property
|
|
||||||
def category(self):
|
|
||||||
return self._category
|
|
||||||
|
|
||||||
@category.setter
|
|
||||||
def category(self, value):
|
|
||||||
self._category = value
|
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def date(self):
|
def date(self):
|
||||||
|
# 获取日期分
|
||||||
return self._date
|
return self._date
|
||||||
|
|
||||||
@date.setter
|
@property
|
||||||
def date(self, value):
|
def link(self):
|
||||||
self._date = value
|
# 获取文章链接
|
||||||
|
return self._get_entry_field('link')
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def weight(self):
|
def keys(self):
|
||||||
return self._weight
|
# 获取文章关键字
|
||||||
|
return self._keys
|
||||||
|
|
||||||
@weight.setter
|
@property
|
||||||
def weight(self, value):
|
def category(self):
|
||||||
self._weight = value
|
# 获取文章分类
|
||||||
|
return self._category
|
||||||
|
|
||||||
|
@property
|
||||||
|
def special_date_score(self):
|
||||||
|
# 获取特殊日期分
|
||||||
|
return self._special_date_score
|
||||||
|
|
||||||
|
@property
|
||||||
|
def sentiment_score(self):
|
||||||
|
# 获取文章情感分
|
||||||
|
return self._sentiment_score
|
||||||
|
|
||||||
def __str__(self):
|
def __str__(self):
|
||||||
return (f"Post title={self.title[:20]}..., "
|
return (f" 标题:{self.title}, "
|
||||||
f" content={self.content[:20]}..., "
|
f" 内容:{self.content[:20]}..., "
|
||||||
f" time={self.time}, "
|
f" 时间:{self.time}, "
|
||||||
f" link={self.link}, "
|
f" 链接:{self.link}, "
|
||||||
f" word_count={self.word_count}")
|
f" 日期分:{self.special_date_score}"
|
||||||
|
f" 情感分:{self.sentiment_score}"
|
||||||
|
f" 类目:{self.category}"
|
||||||
|
f" 关键字:{self.keys}")
|
||||||
|
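Review note: the new `Blog.keys` property flattens every post's keywords, keeps only those containing Chinese characters, and returns the most frequent ones. Below is a minimal standalone sketch of that pipeline. The function name `top_chinese_keywords` and the `max_keys` default are assumptions for illustration; the real limit comes from `const.BLOG_MAX_KEYS`, whose value is not shown in this diff.

```python
import re
from collections import Counter

# Same character range the diff uses to detect Chinese keywords.
CJK = re.compile(r'[\u4e00-\u9fff]+')


def top_chinese_keywords(posts_keys, max_keys=5):
    """Return the most common Chinese keywords as (keyword, count) tuples."""
    # Flatten the per-post keyword lists into one list.
    all_keys = [key for keys in posts_keys for key in keys]
    # Keep only keywords that contain at least one CJK character.
    chinese_keys = [key for key in all_keys if CJK.search(key)]
    return Counter(chinese_keys).most_common(max_keys)


sample = [["博客", "Python", "年度"], ["博客", "RSS"], ["年度", "博客"]]
print(top_chinese_keywords(sample, max_keys=2))  # [('博客', 3), ('年度', 2)]
```

`Counter.most_common` returns `(keyword, count)` tuples rather than bare strings, which is consistent with the template indexing `keyword[0]` when it renders the keyword page.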
65
src/tools.py
@ -1,10 +1,14 @@
|
|||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
from urllib.parse import urlparse
|
from urllib.parse import urlparse
|
||||||
|
|
||||||
|
import pytz
|
||||||
import requests
|
import requests
|
||||||
from bs4 import BeautifulSoup
|
from bs4 import BeautifulSoup
|
||||||
|
from dateutil.parser import parse
|
||||||
from loguru import logger
|
from loguru import logger
|
||||||
|
|
||||||
|
from . import const
|
||||||
|
|
||||||
|
|
||||||
def check_website_status(url):
|
def check_website_status(url):
|
||||||
"""
|
"""
|
||||||
@ -13,14 +17,14 @@ def check_website_status(url):
|
|||||||
:return:True 可以访问,False 不可以。
|
:return:True 可以访问,False 不可以。
|
||||||
"""
|
"""
|
||||||
try:
|
try:
|
||||||
response = requests.get(url, timeout=5) # Set timeout to 5 seconds
|
response = requests.get(url, timeout=30)  # Set timeout to 30 seconds
|
||||||
if response.status_code == 200:
|
if response.status_code == 200:
|
||||||
return True
|
return True
|
||||||
else:
|
else:
|
||||||
logger.error(f"{url} 网站无法访问,状态码:{response.status_code}")
|
logger.error(f"{url} 网站无法访问,状态码:{response.status_code}")
|
||||||
return False
|
return False
|
||||||
except requests.Timeout as e:
|
except requests.Timeout as e:
|
||||||
logger.error(f"{url} 请求超时,错误:{e}")
|
logger.error(f"{url} 请求超时 30 秒,错误:{e}")
|
||||||
return False
|
return False
|
||||||
except requests.ConnectionError as e:
|
except requests.ConnectionError as e:
|
||||||
logger.error(f"{url} 连接错误,错误:{e}")
|
logger.error(f"{url} 连接错误,错误:{e}")
|
||||||
@ -54,10 +58,10 @@ def get_domain_life(url):
|
|||||||
headers = {
|
headers = {
|
||||||
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
|
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
|
||||||
}
|
}
|
||||||
domain_url = f"https://rdap.verisign.com/com/v1/domain/{url}"
|
domain = get_domain(url)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
response = requests.get(domain_url, headers=headers)
|
response = requests.get(f"https://rdap.verisign.com/com/v1/domain/{domain}", headers=headers, timeout=30)
|
||||||
response.raise_for_status() # Raises stored HTTPError, if one occurred.
|
response.raise_for_status() # Raises stored HTTPError, if one occurred.
|
||||||
|
|
||||||
registration_date = response.json().get('events')[0].get('eventDate')
|
registration_date = response.json().get('events')[0].get('eventDate')
|
||||||
@ -87,7 +91,7 @@ def get_domain_life(url):
|
|||||||
except Exception as err:
|
except Exception as err:
|
||||||
logger.error(f"未预期的错误: {err}")
|
logger.error(f"未预期的错误: {err}")
|
||||||
|
|
||||||
return None
|
return 0
|
||||||
|
|
||||||
|
|
||||||
def remove_html_tags(text):
|
def remove_html_tags(text):
|
||||||
@ -97,3 +101,54 @@ def remove_html_tags(text):
|
|||||||
:return:文本
|
:return:文本
|
||||||
"""
|
"""
|
||||||
return BeautifulSoup(text, "html.parser").get_text()
|
return BeautifulSoup(text, "html.parser").get_text()


def get_yiyan():
    """
    Fetch a literary quote from Hitokoto (一言)
    :return: the quote text, or False if the request fails
    """
    try:
        response = requests.get("https://v1.hitokoto.cn/?c=d&min_length=16&max_length=20&encode=text",
                                timeout=30)  # Set timeout to 30 seconds
        if response.status_code == 200:
            return response.text
        else:
            logger.error(f"一言网站无法访问,状态码:{response.status_code}")
            return False
    except requests.Timeout as e:
        logger.error(f"一言请求超时 30 秒,错误:{e}")
        return False
    except requests.ConnectionError as e:
        logger.error(f"一言连接错误,错误:{e}")
        return False
    except requests.RequestException as e:
        logger.error(f"一言网站无法访问,错误:{e}")
        return False
    except Exception as e:
        logger.error(f"一言未知错误,错误:{e}")
        return False


def get_multiple_of_100(string):
    """
    Integer-divide the article length by 100 (minimum 1)
    :return: suggested number of keywords
    """
    length = len(string)
    multiple = length // 100
    if multiple < 1:
        multiple = 1
    return multiple


def format_datetime(dt_str):
    """
    Convert a time string to the configured time zone and format
    :param dt_str: time string
    :return: formatted time string
    """
    dt = parse(dt_str)
    tz = pytz.timezone(const.TIME_ZONE)
    formatted_dt = dt.astimezone(tz).strftime(const.FORMAT_TIME)
    return formatted_dt
|
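Review note: `format_datetime` is called with the RSS entry's `published` field, which `entry.get` returns as `None` when the feed omits it, and `dateutil`'s `parse` would likely raise on that input. The sketch below shows the same conversion with a guard for missing values. It swaps in the standard library (`email.utils.parsedate_to_datetime` and a fixed UTC+8 offset) instead of `dateutil` and `pytz`, so it only handles RFC 822 style dates. `ASSUMED_TZ`, `ASSUMED_FORMAT`, `format_rfc822`, and `keyword_budget` are hypothetical names; the real values of `const.TIME_ZONE` and `const.FORMAT_TIME` are not shown in this diff.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumptions standing in for const.TIME_ZONE and const.FORMAT_TIME.
ASSUMED_TZ = timezone(timedelta(hours=8))
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_rfc822(dt_str, tz=ASSUMED_TZ, fmt=ASSUMED_FORMAT):
    """Format an RFC 822 date string in the target time zone; '' if missing."""
    if not dt_str:
        return ""  # entry had no 'published' field
    dt = parsedate_to_datetime(dt_str)
    return dt.astimezone(tz).strftime(fmt)


def keyword_budget(text):
    """Compact equivalent of get_multiple_of_100: len // 100, at least 1."""
    return max(1, len(text) // 100)


print(format_rfc822("Sun, 31 Dec 2023 16:30:00 +0000"))  # 2024-01-01 00:30:00
print(keyword_budget("a" * 250))  # 2
```

Returning an empty string for a missing date is one possible policy; logging and skipping the post would be another.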
@ -3,6 +3,11 @@
|
|||||||
src: url('../font/LXGWWenKaiMonoGB-Bold.ttf') format('truetype');
|
src: url('../font/LXGWWenKaiMonoGB-Bold.ttf') format('truetype');
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@font-face {
|
||||||
|
font-family: 'dongzhu';
|
||||||
|
src: url('../font/dongzhu-Extralight.ttf') format('truetype');
|
||||||
|
}
|
||||||
|
|
||||||
@media screen and (max-width: 320px) {
|
@media screen and (max-width: 320px) {
|
||||||
html {
|
html {
|
||||||
font-size: 15px;
|
font-size: 15px;
|
||||||
@ -76,6 +81,48 @@ html {
|
|||||||
|
|
||||||
#tab1 {
|
#tab1 {
|
||||||
background-image: url('../img/page1.jpg');
|
background-image: url('../img/page1.jpg');
|
||||||
|
position: relative;
|
||||||
|
}
|
||||||
|
|
||||||
|
.popup {
|
||||||
|
position: absolute;
|
||||||
|
margin: 0 2px;
|
||||||
|
bottom: 0.5rem;
|
||||||
|
background: #fff;
|
||||||
|
padding: 1.5rem;
|
||||||
|
border: 1px solid #d1d1d1;
|
||||||
|
}
|
||||||
|
|
||||||
|
.popup > .notice > hr {
|
||||||
|
margin: 1rem 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.popup > .selection {
|
||||||
|
text-align: right;
|
||||||
|
}
|
||||||
|
|
||||||
|
.selection > button {
|
||||||
|
border: none;
|
||||||
|
padding: 0.75rem 1rem;
|
||||||
|
margin: 0.5rem 0 0 0.5rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.allow {
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
|
||||||
|
.deny {
|
||||||
|
background-color: #059862;
|
||||||
|
cursor: pointer;
|
||||||
|
color: #fff;
|
||||||
|
padding: 0.5rem 0;
|
||||||
|
opacity: 0.8;
|
||||||
|
}
|
||||||
|
|
||||||
|
.deny:hover,
|
||||||
|
.deny:focus {
|
||||||
|
opacity: 1;
|
||||||
|
cursor: pointer;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab2 {
|
#tab2 {
|
||||||
@ -84,8 +131,8 @@ html {
|
|||||||
|
|
||||||
.tab2-box {
|
.tab2-box {
|
||||||
position: fixed;
|
position: fixed;
|
||||||
top: 1rem;
|
top: 2rem;
|
||||||
padding: 1rem 1rem;
|
padding-left: 2rem;
|
||||||
font-size: 1.75rem;
|
font-size: 1.75rem;
|
||||||
color: #FFFFFFD9;
|
color: #FFFFFFD9;
|
||||||
}
|
}
|
||||||
@ -101,8 +148,8 @@ html {
|
|||||||
|
|
||||||
.tab3-box {
|
.tab3-box {
|
||||||
position: fixed;
|
position: fixed;
|
||||||
top: 1rem;
|
top: 2rem;
|
||||||
padding: 1rem 1rem;
|
padding-left: 2rem;
|
||||||
font-size: 1.75rem;
|
font-size: 1.75rem;
|
||||||
color: #ffa940;
|
color: #ffa940;
|
||||||
}
|
}
|
||||||
@ -117,20 +164,34 @@ html {
|
|||||||
}
|
}
|
||||||
|
|
||||||
.tab4-box {
|
.tab4-box {
|
||||||
writing-mode: vertical-lr;
|
position: relative;
|
||||||
position: fixed;
|
|
||||||
top: 2rem;
|
top: 2rem;
|
||||||
font-size: 1.75rem;
|
left: 2rem;
|
||||||
color: #FFFFFFD9;
|
line-height: 2rem;
|
||||||
|
color: #448288D9;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab4 > .tab4-box > p:last-child {
|
#tab4 > .tab4-box > p:nth-child(1) {
|
||||||
writing-mode: horizontal-tb;
|
width: 7rem;
|
||||||
margin: 0 0;
|
font-size: 2.5rem;
|
||||||
height: 30vh;
|
}
|
||||||
width: 13rem;
|
|
||||||
|
#tab4 > .tab4-box > p:nth-child(2) {
|
||||||
|
font-size: 1.5rem;
|
||||||
|
width: 16rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab4 > .tab4-box > p:nth-child(3) {
|
||||||
font-size: 1.25rem;
|
font-size: 1.25rem;
|
||||||
line-height: 1.75rem;
|
width: 14rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab4 > .tab4-box > p:nth-child(4) {
|
||||||
|
font-size: 2rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab4 > .tab4-box > p:nth-child(5) {
|
||||||
|
font-size: 2rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
@ -139,13 +200,59 @@ html {
|
|||||||
}
|
}
|
||||||
|
|
||||||
.tab5-box {
|
.tab5-box {
|
||||||
|
writing-mode: vertical-lr;
|
||||||
|
position: fixed;
|
||||||
|
top: 2rem;
|
||||||
|
font-size: 1.75rem;
|
||||||
|
color: #FFFFFFD9;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab5 > .tab5-box > p:last-child {
|
||||||
|
writing-mode: horizontal-tb;
|
||||||
|
margin: 0 0;
|
||||||
|
height: 30vh;
|
||||||
|
width: 13rem;
|
||||||
|
font-size: 1.25rem;
|
||||||
|
line-height: 1.75rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab6 {
|
||||||
|
background-image: url('../img/page6.jpg');
|
||||||
|
}
|
||||||
|
|
||||||
|
.tab6-box {
|
||||||
|
position: relative;
|
||||||
|
padding: 1rem;
|
||||||
|
text-align: center;
|
||||||
|
line-height: 2rem;
|
||||||
|
color: #613f38c7;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab6 > .tab6-box > p:nth-child(1) {
|
||||||
|
font-size: 2rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab6 > .tab6-box > p:nth-child(2) {
|
||||||
|
font-size: 1.25rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab6 > .tab6-box > p:nth-child(3) {
|
||||||
|
font-size: 1.5rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
#tab7 {
|
||||||
|
background-image: url('../img/page7.jpg');
|
||||||
|
}
|
||||||
|
|
||||||
|
.tab7-box {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: 2rem;
|
top: 2rem;
|
||||||
left: 0.75rem;
|
left: 0.75rem;
|
||||||
color: #597ef7;
|
color: #597ef7;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(1) {
|
#tab7 > .tab7-box > p:nth-child(1) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: -6rem;
|
top: -6rem;
|
||||||
left: 0.8rem;
|
left: 0.8rem;
|
||||||
@ -153,7 +260,7 @@ html {
|
|||||||
font-size: 1.75rem;
|
font-size: 1.75rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(2) {
|
#tab7 > .tab7-box > p:nth-child(2) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: 3.2rem;
|
top: 3.2rem;
|
||||||
left: 7.8rem;
|
left: 7.8rem;
|
||||||
@ -161,7 +268,7 @@ html {
|
|||||||
font-size: 3.5rem;
|
font-size: 3.5rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(3) {
|
#tab7 > .tab7-box > p:nth-child(3) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: 8rem;
|
top: 8rem;
|
||||||
left: 2rem;
|
left: 2rem;
|
||||||
@ -169,7 +276,7 @@ html {
|
|||||||
font-size: 2.5rem;
|
font-size: 2.5rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(4) {
|
#tab7 > .tab7-box > p:nth-child(4) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: -13rem;
|
top: -13rem;
|
||||||
left: 11.7rem;
|
left: 11.7rem;
|
||||||
@ -177,7 +284,7 @@ html {
|
|||||||
font-size: 1.75rem;
|
font-size: 1.75rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(5) {
|
#tab7 > .tab7-box > p:nth-child(5) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: -5.6rem;
|
top: -5.6rem;
|
||||||
left: 9.7rem;
|
left: 9.7rem;
|
||||||
@ -185,43 +292,59 @@ html {
|
|||||||
font-size: 2rem;
|
font-size: 2rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(6) {
|
#tab7 > .tab7-box > p:nth-child(6) {
|
||||||
|
position: relative;
|
||||||
|
top: 3rem;
|
||||||
|
left: 2rem;
|
||||||
|
font-size: 1.5rem;
|
||||||
|
line-height: 3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab7 > .tab7-box > p:nth-child(6) > small {
|
||||||
|
font-size: 3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab7 > .tab7-box > p:nth-child(7) {
|
||||||
position: relative;
|
position: relative;
|
||||||
top: 2rem;
|
top: 2rem;
|
||||||
left: 2rem;
|
left: 2rem;
|
||||||
font-size: 1.5rem;
|
font-size: 1.5rem;
|
||||||
|
line-height: 3rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(6) > small {
|
#tab7 > .tab7-box > p:nth-child(7) > small {
|
||||||
font-size: 3rem;
|
|
||||||
}
|
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(7) {
|
|
||||||
position: relative;
|
|
||||||
top: 2rem;
|
|
||||||
left: 2rem;
|
|
||||||
font-size: 1.5rem;
|
|
||||||
line-height: 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
#tab5 > .tab5-box > p:nth-child(7) > small {
|
|
||||||
font-size: 3rem;
|
font-size: 3rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
#tab6 {
|
#tab8 {
|
||||||
background-image: url('../img/page6.jpg');
|
background-image: url('../img/page8.jpg');
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab6 > .tab6-box {
|
#tab8 > .tab8-box {
|
||||||
position: fixed;
|
position: fixed;
|
||||||
bottom: 7%;
|
bottom: 7%;
|
||||||
padding-left: 2rem;
|
padding-left: 2rem;
|
||||||
font-size: 1.75rem;
|
font-size: 1.75rem;
|
||||||
color: #FFFFFFD9;
|
color: #FFFFFFD9;
|
||||||
}
|
}
|
||||||
|
|
||||||
#tab7 {
|
#tab9 {
|
||||||
background-image: url('../img/page7.jpg');
|
background-image: url('../img/page9.jpg');
|
||||||
|
}
|
||||||
|
|
||||||
|
#tab9 > .tab9-box {
|
||||||
|
padding: 2rem 2rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.tab9-box > p {
|
||||||
|
font-family: 'dongzhu', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial,
|
||||||
|
'Noto Sans', sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol',
|
||||||
|
'Noto Color Emoji';
|
||||||
|
font-size: 2.75rem;
|
||||||
|
line-height: 3.5rem;
|
||||||
|
writing-mode: vertical-rl;
|
||||||
|
color: #FFFFFFD9;
|
||||||
|
margin: 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
BIN
static/painting/font/dongzhu-Extralight.ttf
Normal file
Size before: 332 KiB, after: 258 KiB
Size before: 418 KiB, after: 332 KiB
Size before: 448 KiB, after: 507 KiB
Size before: 490 KiB, after: 418 KiB
BIN
static/painting/img/page8.jpg
Normal file
Size: 448 KiB
BIN
static/painting/img/page9.jpg
Normal file
Size: 490 KiB
BIN
static/painting/music/bgm.mp3
Normal file
@ -3,30 +3,58 @@
|
|||||||
<head>
|
<head>
|
||||||
<meta charset="UTF-8">
|
<meta charset="UTF-8">
|
||||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||||
<title>EndOfYear</title>
|
<title>{{ site.title }}</title>
|
||||||
{% if web_status == 1 %}
|
{% if site.service == 1 %}
|
||||||
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/normalize.css') }}">
|
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/normalize.css') }}">
|
||||||
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/animate.min.css') }}">
|
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/animate.min.css') }}">
|
||||||
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/painting.css') }}">
|
<link rel="stylesheet" href="{{ url_for('static', filename='painting/css/painting.css') }}">
|
||||||
{% else %}
|
{% else %}
|
||||||
<link rel="stylesheet" href="painting/css/normalize.css">
|
<link rel="stylesheet" href="painting/css/normalize.css">
|
||||||
<link rel="stylesheet" href="painting/css/animate.min.css">
|
<link rel="stylesheet" href="painting/css/animate.min.css">
|
||||||
<link rel="stylesheet" href="painting/css/painting.css">
|
<link rel="stylesheet" href="painting/css/painting.css">
|
||||||
{% endif %}
|
{% endif %}
|
||||||
<script async src="https://umami.7wate.org/script.js" data-website-id="635fec50-bc6c-4ac2-909a-e2a7403438be"></script>
|
<script async src="https://umami.7wate.org/script.js"
|
||||||
|
data-website-id="635fec50-bc6c-4ac2-909a-e2a7403438be"></script>
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<div class="container active animate__animated animate__fadeIn animate__slow" id="tab1"></div>
|
<div class="container active animate__animated animate__fadeIn animate__slow" id="tab1">
|
||||||
|
<audio id="bgm" loop>
|
||||||
|
{% if site.service == 1 %}
|
||||||
|
<source src="{{ url_for('static', filename='painting/music/bgm.mp3') }}" type="audio/mpeg">Your browser does
|
||||||
|
not support the audio element.
|
||||||
|
{% else %}
|
||||||
|
<source src="painting/music/bgm.mp3" type="audio/mpeg">Your browser does not support the audio element.
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
</audio>
|
||||||
|
<div class="popup">
|
||||||
|
<div class="notice">
|
||||||
|
<h4>温馨提示</h4>
|
||||||
|
<hr>
|
||||||
|
<p>EndOfYear 使用互联网公开的 RSS 数据源,并使用自建的 Umami 服务统计访问量,绝对不会主动获取个人隐私信息。🫣🫣🫣
|
||||||
|
<br>
|
||||||
|
<br>
|
||||||
|
开启方式:小手轻轻点 ~
|
||||||
|
<a href="https://github.com/7Wate/EndOfYear"></a>
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
<div class="selection">
|
||||||
|
<button value="true" class="allow">开启音乐</button>
|
||||||
|
<button value="false" class="deny">静音进入</button>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab2">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab2">
|
||||||
<div class="tab2-box">
|
<div class="tab2-box">
|
||||||
<p class="animate__animated animate__fadeIn animate__delay-1s">亲爱的{{ data.blog_name }}</p>
|
<p class="animate__animated animate__fadeIn animate__delay-1s">亲爱的{{ blog.name }}</p>
|
||||||
{% if data.blog_life == 0 %}
|
{% if blog.life == 0 %}
|
||||||
<p class="animate__animated animate__fadeIn animate__delay-2s">旧事如梦,一年已过</p>
|
<p class="animate__animated animate__fadeIn animate__delay-2s">旧事如梦,一年已过</p>
|
||||||
<p class="animate__animated animate__fadeIn animate__delay-2s">贰三年、感谢有你!</p>
|
<p class="animate__animated animate__fadeIn animate__delay-2s">贰三年、感谢有你!</p>
|
||||||
{% else %}
|
{% else %}
|
||||||
<p class="animate__animated animate__fadeIn animate__delay-2s">今天是我们相识的</p>
|
<p class="animate__animated animate__fadeIn animate__delay-2s">今天是我们相识的</p>
|
||||||
<p class="animate__animated animate__fadeIn animate__delay-3s">第 <small>{{ data.blog_life_year }}</small> 年
|
<p class="animate__animated animate__fadeIn animate__delay-3s">第 <small>{{ blog.life // 365 }}</small> 年
|
||||||
<small>{{ data.blog_life_day }}</small> 天</p>
|
<small>{{ blog.life % 365 }}</small> 天</p>
|
||||||
{% endif %}
|
{% endif %}
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
@ -34,47 +62,68 @@
|
|||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab3">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab3">
|
||||||
<div class="tab3-box">
|
<div class="tab3-box">
|
||||||
<p class="animate__animated animate__fadeInUp animate__delay-1s">这一年你写下了</p>
|
<p class="animate__animated animate__fadeInUp animate__delay-1s">这一年你写下了</p>
|
||||||
<p class="animate__animated animate__fadeInUp animate__delay-2s"><small>{{ data.blog_article_count }}</small>
|
<p class="animate__animated animate__fadeInUp animate__delay-2s"><small>{{ blog.article_count }}</small>
|
||||||
篇博文</p>
|
篇博文</p>
|
||||||
<p class="animate__animated animate__fadeInUp animate__delay-3s"><small>{{ data.blog_article_word_count
|
<p class="animate__animated animate__fadeInUp animate__delay-3s">
|
||||||
}}</small> 个文字</p>
|
<small>{{ blog.article_word_count }}</small> 个文字</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab4">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab4">
|
||||||
<div class="tab4-box">
|
<div class="tab4-box">
|
||||||
<p class="animate__animated animate__fadeInDown animate__delay-1s">{{ data.blog_content_date }}那天,你写下了</p>
|
<p class="animate__animated animate__fadeInLeft animate__delay-1s"> 其中 </p>
|
||||||
<p class="animate__animated animate__fadeInDown animate__delay-2s">{{ data.blog_title }}</p>
|
<p class="animate__animated animate__fadeInLeft animate__delay-2s"> {{ long_post.title }} </p>
|
||||||
<p class="animate__animated animate__fadeInDown animate__delay-3s">{{ data.blog_content }}<small>……</small>
|
<p class="animate__animated animate__fadeInLeft animate__delay-3s"> {{ long_post.content[:50] }}</p>
|
||||||
</p>
|
<p class="animate__animated animate__fadeIn animate__delay-4s"> 长似一江春水</p>
|
||||||
|
<p class="animate__animated animate__fadeIn animate__delay-5s"> 永不止息 ~ </p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab5">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab5">
|
||||||
<div class="tab5-box">
|
<div class="tab5-box">
|
||||||
{% for keyword in data.blog_top_keywords %}
|
<p class="animate__animated animate__fadeInDown animate__delay-1s">在{{ special_post.date }}那天,写下了</p>
|
||||||
<p>{{ keyword[0] }}</p>
|
<p class="animate__animated animate__fadeInDown animate__delay-2s">{{ special_post.title }}</p>
|
||||||
|
<p class="animate__animated animate__fadeInDown animate__delay-3s">{{ special_post.content[:50] }}<small>……</small>
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab6">
|
||||||
|
<div class="tab6-box">
|
||||||
|
<p class="animate__animated animate__fadeInUp animate__delay-1s">{{ sentiment_post.title }}</p>
|
||||||
|
<p class="animate__animated animate__fadeInUp animate__delay-2s">{{ sentiment_post.content[:50] }}</p>
|
||||||
|
<p class="animate__animated animate__fadeIn animate__delay-3s">如初春的暖阳,宁静而怡人</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab7">
|
||||||
|
<div class="tab7-box">
|
||||||
|
{% for keyword in blog.top_post_keys[0:5] %}
|
||||||
|
<p class="animate__animated animate__fadeIn animate__delay-1s">{{ keyword[0] }}</p>
|
||||||
{% endfor %}
|
{% endfor %}
|
||||||
<p class="animate__animated animate__fadeInDown animate__delay-1s">这些都是<small>你</small>的</p>
|
<p class="animate__animated animate__fadeInDown animate__delay-1s">这些都是<small>你</small>的</p>
|
||||||
<p class="animate__animated animate__fadeInDown animate__delay-2s">专属<small>关键词</small></p>
|
<p class="animate__animated animate__fadeInDown animate__delay-2s">专属<small>关键词</small></p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab6">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab8">
|
||||||
<div class="tab6-box">
|
<div class="tab8-box">
|
||||||
<p class="animate__animated animate__fadeInLeft animate__delay-1s">热爱{{ data.blog_category }}的你</p>
|
<p class="animate__animated animate__fadeInLeft animate__delay-1s">
|
||||||
|
热爱{% if blog.category == 1 %}生活{% else %}技术{% endif %}的你
|
||||||
|
</p>
|
||||||
<p class="animate__animated animate__fadeInLeft animate__delay-2s">一定要继续砥砺前行!</p>
|
<p class="animate__animated animate__fadeInLeft animate__delay-2s">一定要继续砥砺前行!</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="container animate__animated animate__fadeIn animate__slow" id="tab7">
|
<div class="container animate__animated animate__fadeIn animate__slow" id="tab9">
|
||||||
<div class="tab7-box">
|
<div class="tab9-box">
|
||||||
|
<p id="yiyan"> {{ custom.yiyan }}</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<script>
|
<script>
|
||||||
|
// Screen dimensions
|
||||||
console.log("Window inner size: ", window.innerWidth, "x", window.innerHeight);
|
console.log("Window inner size: ", window.innerWidth, "x", window.innerHeight);
|
||||||
console.log("Window outer size: ", window.outerWidth, "x", window.outerHeight);
|
console.log("Window outer size: ", window.outerWidth, "x", window.outerHeight);
|
||||||
console.log("Screen size: ", screen.width, "x", screen.height);
|
console.log("Screen size: ", screen.width, "x", screen.height);
|
||||||
console.log("Screen available size: ", screen.availWidth, "x", screen.availHeight);
|
console.log("Screen available size: ", screen.availWidth, "x", screen.availHeight);
|
||||||
var carousel = {
|
|
||||||
|
// Carousel switching
|
||||||
|
let carousel = {
|
||||||
currentIndex: 0,
|
currentIndex: 0,
|
||||||
tabs: [],
|
tabs: [],
|
||||||
|
|
||||||
@ -98,6 +147,30 @@
|
|||||||
window.onload = function () {
|
window.onload = function () {
|
||||||
carousel.init();
|
carousel.init();
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// Popup prompt
|
||||||
|
let popup = document.querySelector('.popup');
|
||||||
|
let allowButton = document.querySelector('.allow');
|
||||||
|
let denyButton = document.querySelector('.deny');
|
||||||
|
let audioElement = document.getElementById('bgm');
|
||||||
|
|
||||||
|
allowButton.addEventListener('click', function (event) {
|
||||||
|
event.stopPropagation();
|
||||||
|
audioElement.play(); // Play the background music
|
||||||
|
popup.style.display = 'none'; // Hide the popup
|
||||||
|
});
|
||||||
|
|
||||||
|
denyButton.addEventListener('click', function (event) {
|
||||||
|
event.stopPropagation();
|
||||||
|
popup.style.display = 'none'; // Hide the popup
|
||||||
|
});
|
||||||
|
|
||||||
|
// Closing quote (yiyan) shown at the end of the report
|
||||||
|
window.addEventListener('DOMContentLoaded', (event) => {
|
||||||
|
let textElement = document.getElementById("yiyan");
|
||||||
|
let text = textElement.innerHTML;
|
||||||
|
textElement.innerHTML = text.replace(/[\u3000-\u303F\uFF01-\uFF0F\uFF1A-\uFF1E\uFF3B-\uFF3F\uFF5B-\uFF60\uFFE0-\uFFE6\u00D2]/g, "<br>");
|
||||||
|
});
|
||||||
</script>
|
</script>
|
||||||
|
|
||||||
</body>
|
</body>
|
||||||
|
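The `DOMContentLoaded` handler added in this diff replaces full-width punctuation in the `#yiyan` text with `<br>`, so each clause of the closing quote renders on its own line. A minimal sketch of that replacement as a standalone function, assuming a hypothetical helper name `breakAtPunctuation` and constant name `CJK_PUNCTUATION` (neither is in the original) and reusing the character class from the diff unchanged:

```javascript
// Hypothetical helper (not part of the original template): the same
// replacement the DOMContentLoaded handler applies to #yiyan, pulled
// out as a pure function so it can be exercised without a DOM.
// The character class is copied verbatim from the diff.
const CJK_PUNCTUATION = /[\u3000-\u303F\uFF01-\uFF0F\uFF1A-\uFF1E\uFF3B-\uFF3F\uFF5B-\uFF60\uFFE0-\uFFE6\u00D2]/g;

function breakAtPunctuation(text) {
  // Every matched punctuation character becomes a line break.
  return text.replace(CJK_PUNCTUATION, "<br>");
}

// U+FF0C is the full-width comma, U+3002 the ideographic full stop.
console.log(breakAtPunctuation("春\uFF0C秋\u3002"));
```

ASCII punctuation falls outside the listed ranges and is left untouched, and a punctuation mark at the end of the text produces a trailing `<br>`.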