forked from nadyka/BgGPT-Bulgarian_news_summarizer
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathBgGPT_requirements.txt
55 lines (39 loc) · 1.38 KB
/
BgGPT_requirements.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
PERVASIVE REQT: Preserve your data at each step
Start with 1 news source;
1) Retrieve the 10 most recent news articles.
2) FOR LOOP:
Feed an article into BgGPT (Mistral/LM Studio)
Ask BgGPT to summarize the article
Save the summary locally in text or HTML
3) Present the 10 news articles in Streamlit GUI
4) Retrieve at 7:00h UTC+3 and again in 12 hours at 19:00h UTC+3
5) Display summaries in Streamlit (in Text boxes, horizontal dividers) (as they are available?)
Article Record:
Title
Hyperlink to source article
Keywords
6) Collapsible text boxes with article title at the top.
Article link in collapsible text)
Full article text under the summary?
How and where will you store the articles? (SQL db or files?
SQlite database?
Index by multiple dimensions
News_source_date_time
Store articles in files
1 folder per batch of 10 articles and summaries
Assign a unique identifier based on the summarization time and the source
######################################
MVP 1.1 Improve user interface
MVP 1.2
Email: to readers
############################
MVP 1.3
Explore using langchain document loaders for HTML with images.
#############################
MVP v2
add more data sources
MVP v3
count popularity of topics (as reflected in clickbait titles)
MVP 4 or 5
Semantic Analysis:
Could analyze full articles to identify if they are covering the same articles.