Workflow
Today's News Online (TNO) is a news aggregation system that takes in news sources of varying types and provides a single location for clients to access the data. It provides a variety of key services to senior executives throughout government, including near-live issue alerts, morning and evening summary reports, and in-depth media analysis of key government initiatives and issues.
The TNO solution processes have not yet been fully designed or implemented.
TNO aggregates news content from a variety of sources, then organizes and filters it to generate automated real-time reports and alerts. Subscribers have a single location where they can view all their news content.
All content is ingested into the solution through several different producers (listeners, requestors, readers, scanners, recorders, file shares, and editors). Producers are services that receive or fetch content from 3rd parties.
The content is then picked up and added to the Queue, which is implemented with Apache Kafka. The Queue gives all other systems, dependent and independent, a performant common point of interaction, which allows abstraction and separation of concerns. This supports horizontal and vertical scaling and makes it possible to separate other services geographically. The Queue also enables automated management of the lifecycle and lifespan of content. Audio and video content is never uploaded to the Queue; only its metadata is. The media itself is maintained and served by additional services that are physically closer to the content, which improves performance and reduces network bandwidth.
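
The metadata-only rule for audio and video can be sketched as follows. This is a minimal illustration, not the TNO schema: the `ClipMetadata` fields, the key format, and the `build_metadata_message` helper are all assumptions made for the example. The point it shows is that the record destined for the Queue carries a pointer to the media file, never the media bytes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass
class ClipMetadata:
    """Hypothetical metadata record; every field name is an assumption."""
    source: str
    title: str
    media_type: str        # e.g. "audio" or "video"
    file_path: str         # where the media lives on the file share
    duration_seconds: int


def build_metadata_message(clip: ClipMetadata) -> Tuple[bytes, bytes]:
    """Return a (key, value) pair suitable for a queue record.

    Only the metadata is serialized; the media file is never opened.
    """
    key = f"{clip.source}:{clip.file_path}".encode("utf-8")
    value = json.dumps(asdict(clip), sort_keys=True).encode("utf-8")
    return key, value


clip = ClipMetadata(
    source="radio-example",
    title="Morning news",
    media_type="audio",
    file_path="/share/radio/morning.mp3",
    duration_seconds=1800,
)
key, value = build_metadata_message(clip)
```

A consumer that needs the media would resolve `file_path` against the service that sits close to the content, which keeps large files off the Queue.
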
Once content is in the Queue, consumers automatically pick it up and begin transcription, natural language processing, and indexing of the content and its metadata. These processes and services enable searching, viewing, reporting, and analysis of the content and subscriber activities.
There are various ingestion services. Some are passively listening for pushed content (files uploaded to a share), while others are actively and constantly making requests to 3rd party sources for new content.
Editors are also able to manually add and publish content.
Each service is a Kafka Producer, which ensures all content events are pushed into the Queue.
Service | Description |
---|---|
Syndication | Service that pulls syndication content from 3rd party APIs |
API Listener | Open RESTful API that 3rd parties can push content to |
API Requester | Service that pulls content from 3rd party APIs |
Web Reader | Service that crawls websites for content |
Recorder - Stream | Service that records streamed video |
Recorder - TV | Service that records TV |
Recorder - Radio | Service that records radio |
File Share Listener | Service that listens for file uploads to file shares |
Editor App | Web application that provides editors ability to manually add content |
A Docker process will run continually based on its configuration. It starts with a local default configuration, then requests the latest configuration settings from the TNO DB. To ensure duplicate entries are not pushed to Kafka, it will maintain a reference in the TNO DB.
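
The start-up and de-duplication behaviour described above might look roughly like the sketch below. It is written under several assumptions: an in-memory SQLite database stands in for the TNO DB, a plain list stands in for the Kafka producer, and the table names, config keys, and function names are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple

# Local defaults the service can start with (keys are assumptions).
DEFAULT_CONFIG: Dict[str, str] = {"poll_seconds": "60", "topic": "content"}


def open_reference_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the TNO DB."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)")
    db.execute(
        "CREATE TABLE content_reference ("
        "source TEXT, uid TEXT, PRIMARY KEY (source, uid))"
    )
    return db


def load_config(db: sqlite3.Connection) -> Dict[str, str]:
    """Start from local defaults, then overlay the latest DB settings."""
    config = dict(DEFAULT_CONFIG)
    config.update(db.execute("SELECT name, value FROM config").fetchall())
    return config


def publish_once(
    db: sqlite3.Connection,
    sent: List[Tuple[str, str, str]],
    source: str,
    uid: str,
    payload: str,
) -> bool:
    """Publish payload unless (source, uid) already has a reference.

    `sent` stands in for a Kafka producer; a real service would send
    the record to a topic instead of appending to a list.
    """
    cursor = db.execute(
        "INSERT OR IGNORE INTO content_reference (source, uid) VALUES (?, ?)",
        (source, uid),
    )
    if cursor.rowcount == 0:
        return False  # reference already exists, so skip the duplicate
    sent.append((source, uid, payload))
    db.commit()
    return True
```

Calling `publish_once` twice with the same `source` and `uid` publishes once and returns `False` the second time. A production service would also need to decide what happens if the send fails after the reference is written; that is out of scope for this sketch.
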
Each activity enables content to move through the TNO solution so that it can be maintained, published, transcribed, parsed, searched, analyzed, archived, and, at end of life, purged.
Activity | Description |
---|---|
Store on File Share | Video and audio content is downloaded to file shares |
Upload to Media Service | Video and audio content is uploaded to cloud media service |
Queue | Kafka queue services to manage the distribution of content |
Transcribe | Kafka consumer process to extract text from video and audio content |
NLP Process | Kafka consumer process to perform natural language processing |
Index | Elasticsearch storage and indexing of content for the purpose of search |
Store Content | All metadata and content is stored within TNO for its licensed lifecycle |
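
To make the order of the Transcribe, NLP Process, and Index activities concrete, here is a simplified sketch. Everything in it is a placeholder: the transcript text is a dummy string, the "NLP" step is a naive keyword split, and a dictionary stands in for Elasticsearch. The three steps are chained in one function for brevity, whereas the table above describes Transcribe and NLP Process as separate Kafka consumers.

```python
from typing import Any, Dict, Iterable, List

Content = Dict[str, Any]


def transcribe(content: Content) -> Content:
    """Placeholder: attach text to audio/video that has no body yet."""
    if content.get("media_type") in ("audio", "video") and "body" not in content:
        return {**content, "body": "transcript placeholder"}
    return content


def nlp_process(content: Content) -> Content:
    """Placeholder NLP: keep lowercase words longer than four letters."""
    tokens = (w.strip(".,") for w in str(content.get("body", "")).lower().split())
    keywords = sorted({t for t in tokens if len(t) > 4})
    return {**content, "keywords": keywords}


def index(content: Content, search_index: Dict[str, List[str]]) -> None:
    """Placeholder index: map each keyword to the ids that contain it."""
    for keyword in content["keywords"]:
        search_index.setdefault(keyword, []).append(str(content["id"]))


def consume(messages: Iterable[Content], search_index: Dict[str, List[str]]) -> None:
    """Run every message through transcribe -> NLP -> index, in that order."""
    for message in messages:
        index(nlp_process(transcribe(message)), search_index)
```

Text content skips transcription because it already has a `body`, while audio and video gain one before the NLP step runs.
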
There are various fully automated services that generate reports and alerts and perform archival and purging activities.
Service | Description |
---|---|
Reports | Generate reports based on content metadata, schedule, and subscribers |
Alerts | Generate alerts based on content metadata, schedule, and subscribers |
Archive/Clean | Ensure content licensing and configured storage limits are adhered to |
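
As an illustration of the Archive/Clean service, the sketch below selects content whose licensed retention period has ended. The item fields (`id`, `published_on`, `license_days`) and the function name are assumptions for the example; how TNO actually records licence terms is not specified here.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List


def expired_ids(items: Iterable[Dict[str, Any]], now: datetime) -> List[str]:
    """Return ids of items whose licence period ended at or before `now`."""
    return [
        item["id"]
        for item in items
        if item["published_on"] + timedelta(days=item["license_days"]) <= now
    ]


now = datetime(2022, 3, 1, tzinfo=timezone.utc)
items = [
    {"id": "a", "published_on": datetime(2022, 1, 1, tzinfo=timezone.utc), "license_days": 30},
    {"id": "b", "published_on": datetime(2022, 2, 20, tzinfo=timezone.utc), "license_days": 30},
]
to_purge = expired_ids(items, now)
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check easy to test and to run against a scheduled cut-off time.
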
The primary output of TNO is an aggregated source of 3rd party content. Subscribers are able to search and view content, along with receiving automated reports and alerts. TNO will also be able to monitor content and analyze subscriber activities in order to make informed decisions.
Feature | Description |
---|---|
Search | Filter and find content that is relevant and timely. This requires parsing content and generating relevant and accurate metadata through transcription, natural language processing, and indexing |
Content | Subscribers can view, listen, and read 3rd party content |
Reports | Subscribers receive scheduled, automatically generated content based on filters |
Alerts | Subscribers receive scheduled, automatically generated content based on filters |
Monitor | Users can view and analyze content metadata and subscriber activities |
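
Filter-based delivery for reports and alerts could be approximated as below. This is a hedged sketch only: it assumes a filter is a list of keywords that must all appear in the content's metadata, and the `matching_subscribers` name and data shapes are hypothetical.

```python
from typing import Any, Dict, List


def matching_subscribers(
    content: Dict[str, Any], subscriptions: Dict[str, List[str]]
) -> List[str]:
    """Return subscribers whose filter keywords all appear in the content.

    An empty filter matches every piece of content.
    """
    keywords = set(content["keywords"])
    return sorted(
        name for name, wanted in subscriptions.items() if set(wanted) <= keywords
    )


subscriptions = {"alice": ["budget"], "bob": ["budget", "health"]}
content = {"id": "1", "keywords": ["announced", "budget"]}
recipients = matching_subscribers(content, subscriptions)
```

Here only `alice` matches, because `bob` also requires `health`, which this content does not carry.
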
Users can manage the TNO solution through the below features.
Feature | Description |
---|---|
User Management | Administrators can assist users and their accounts. Users can manage their own profile preferences |
Report Management | Administrators can manage global reports. Users can create, manage and subscribe to reports |
Alert Management | Administrators can manage global alerts. Users can create, manage and subscribe to alerts |
Subscription Management | Administrators can manage user subscriptions. Users can manage their own subscription |