per requested client fqdn/url metrics #91

Open
gberche-orange opened this issue Feb 21, 2024 · 3 comments

@gberche-orange

Thanks a lot for sharing this great work with the community and for maintaining it over time!

Describe the feature

I'm trying to get stats per requested FQDN (for CONNECT requests) or URL (for plain HTTP requests), such as:

  • number of client requests
  • number of client kbytes received
  • number of client kbytes transferred
  • cache hits per url

As far as I understand, the currently available metrics have no FQDN or URL labels.

I'm not yet very familiar with the stats, metrics, and reports that Squid offers and that could be used for this. My research in the documentation so far is summarized below.

Is there prior work on this topic?

Expected behavior

A new exporter flag that turns on a feature adding metric labels with the client-requested FQDN or URL.

To avoid a Prometheus cardinality explosion, the flag could select the top k FQDNs/URLs to surface as labels and group the long tail into an "other" category.
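
The top-k idea above can be sketched as follows. This is a minimal illustration, not something the exporter implements today: the function name `top_k_with_other`, the `other` label value, and the sample counts are all assumptions made up for the example.

```python
from collections import Counter


def top_k_with_other(counts, k, other_label="other"):
    """Keep the k largest entries and fold the rest into one bucket."""
    ranked = Counter(counts).most_common()
    result = dict(ranked[:k])
    # Sum everything that did not make the cut into a single label value.
    tail = sum(value for _, value in ranked[k:])
    if tail:
        result[other_label] = result.get(other_label, 0) + tail
    return result


# Illustrative per-FQDN request counts (made-up numbers).
requests_by_fqdn = {
    "example.com": 120,
    "cdn.example.net": 80,
    "api.example.org": 15,
    "rare.example.io": 2,
}
limited = top_k_with_other(requests_by_fqdn, k=2)
```

One caveat with this approach: the set of top-k names can change between scrapes, so a series for a given FQDN may appear, disappear, or fold into `other` over time, which would need to be accounted for when querying.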

Additional context

Research in the Squid documentation on available per-FQDN/URL stats and reports

A Cache Digest is a summary of the contents of an Internet Object Caching Server. It contains, in a compact (i.e. compressed) format, an indication of whether or not particular URLs are in the cache.

Enabling Cache Digests
If you wish to use Cache Digests (available in Squid version 2) you need to add a configure option, so that the relevant code is compiled in:
./configure --enable-cache-digests ...

the keys which are looked up in Cache Digests are actually formed by performing the MD5 [RFC 1321] digest function on the concatenation of:

  1. a numeric code for the HTTP method used, and
  2. the URL requested.
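
To make the key construction quoted above concrete, here is a rough sketch. The numeric method codes and the choice to encode the code as a single byte are assumptions for illustration; the quoted documentation does not state them, and the real values live in Squid's source.

```python
import hashlib

# Hypothetical numeric method codes, for illustration only; the real values
# are defined inside Squid and are not given in the quoted text.
METHOD_CODES = {"GET": 1, "POST": 2, "HEAD": 4}


def digest_key(method, url):
    """MD5 over the method code (assumed: one byte) followed by the URL."""
    md5 = hashlib.md5()
    md5.update(bytes([METHOD_CODES[method]]))
    md5.update(url.encode("utf-8"))
    return md5.digest()


key = digest_key("GET", "http://example.com/")
```

Because the key is a one-way hash, a digest built from such keys can answer "is this URL probably cached?" but cannot be walked to list URLs or FQDNs, which limits its use for per-URL metrics.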

Squid report content

This is an example from a default build of Squid-3.2. Remember the menu varies with available features.

index Cache Manager Interface public
menu Cache Manager Menu public
offline_toggle Toggle offline_mode setting hidden
shutdown Shut Down the Squid Process hidden
reconfigure Reconfigure Squid hidden
rotate Rotate Squid Logs hidden
pconn Persistent Connection Utilization Histograms public
mem Memory Utilization public
diskd DISKD Stats public
squidaio_counts Async IO Function Counters public
config Current Squid Configuration hidden
comm_epoll_incoming comm_incoming() stats public
ipcache IP Cache Stats and Contents public
fqdncache FQDN Cache Stats and Contents public
idns Internal DNS Statistics public
redirector URL Redirector Stats public
external_acl External ACL stats public
http_headers HTTP Header Statistics public
info General Runtime Information public
service_times Service Times (Percentiles) public
filedescriptors Process Filedescriptor Allocation public
objects All Cache Objects public
vm_objects In-Memory and In-Transit Objects public
io Server-side network read() size histograms public
counters Traffic and Resource Counters public
peer_select Peer Selection Algorithms public
digest_stats Cache Digest and ICP blob public
5min 5 Minute Average of Counters public
60min 60 Minute Average of Counters public
utilization Cache Utilization public
histograms Full Histogram Counts public
active_requests Client-side Active Requests public
username_cache Active Cached Usernames public
openfd_objects Objects with Swapout files open public
store_digest Store Digest public
store_log_tags Histogram of store.log tags public
storedir Store Directory Stats public
store_io Store IO Interface Stats public
store_check_cachable_stats storeCheckCachable() Stats public
refresh Refresh Algorithm Statistics public
forward Request Forwarding Statistics public
cbdata Callback Data Registry Contents public
events Event Queue public
client_list Cache Client List public
asndb AS Number Database public
carp CARP information public
userhash peer userhash information public
sourcehash peer sourcehash information public
server_list Peer Cache Statistics public
config Current Squid Configuration hidden
store_log_tags Histogram of store.log tags public

https://wiki.squid-cache.org/Features/CacheManager/Index

Cache Manager objects or reports

The following table details SMP support for each Cache Manager object or report. Unless noted otherwise, an aggregated statistics is either a sum, arithmetic mean, minimum, or maximum across all kids, as appropriate to represent the “whole Squid” view.

Name | Component | Aggregated? | Comments
menu | all | yes |
info | Number of clients accessing cache | yes, poorly | Coordinator sums up the number of clients reported by each kid, which is usually wrong because most active clients will use more than one worker, leading to exaggerated values. Note that even without SMP, this statistics is exaggerated because the count goes down when Squid cleans up the internal client table and not when the last client connection closes. SMP amplifies that effect.
 | UP Time | yes | The maximum uptime across all kids is reported
 | other | yes |
server_list | all | no, but can be | If you work on aggregating these stats, please keep in mind that kids may have a different set of peers. The to-Coordinator responses should include, for each peer, a peer name and not just its “index”
mem | all | no, but can be | If you work on aggregating these stats, please keep in mind that kids may have a different set of memory pools. The to-Coordinator responses should include, for each pool, a pool name and not just its “index”. Full stats may exceed typical UDS message size limits (16KB). If overflows are likely, it may be a good idea to create response messages so that overflowing items are not included (in the current sort order). Another alternative is to split mgr:mem into mgr:mem (with various aggregated totals) and mgr:pools (with non-aggregated per-pool details).
counters | sample_time | yes | The latest (maximum) sample time across all kids is reported
refresh | all | no, but can be |
idns | queue | no and should not be | The kids should probably report their own queues, especially since DNS query IDs are kid-specific.
 | other | no, but can be | If you work on aggregating these stats, please keep in mind that kids may have a different set of name servers. The to-Coordinator responses should include, for each name server, a server address and not just its “index”.
histograms | all | no, but can be | If you work on aggregating these stats, please keep typical UDS message size limits (16KB) in mind.
5min | sample_start_time | yes | The earliest (minimum) sample time across all kids is reported
 | sample_end_time | yes | The latest (maximum) sample time across all kids is reported.
 | median | yes, approximately | The arithmetic mean over kids medians is reported. This is not a true median. True median reporting is possible but would require adding code to exchange and aggregate raw histograms.
 | other | yes |
60min | all | | See 5min rows for component details.
utilization | all | no, but can be | If you work on aggregating these stats, please reuse or mimic mgr:5min/60min aggregation code.
other | all | varies | TBD. In general, statistics inside "by kidK {...}" blobs are not aggregated while all others are.
@boynux
Owner

boynux commented Mar 10, 2024

Hi @gberche-orange

The only way I know of to get per-URL information from Squid is to analyze its logs. This exporter, however, relies only on the Squid cache object, which does not provide any information about URLs.
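
As a rough illustration of that limitation, the cache manager `counters` report is a flat list of `name = value` lines. The sketch below parses such text; the sample values are invented and `parse_counters` is a hypothetical helper, not the exporter's actual code.

```python
# Illustrative excerpt in the "name = value" shape of the counters report
# (the numbers are made up).
SAMPLE = """\
sample_time = 1710000000.000000 (Sat, 09 Mar 2024 16:00:00 GMT)
client_http.requests = 1234
client_http.hits = 567
client_http.kbytes_in = 89
client_http.kbytes_out = 4321
"""


def parse_counters(text):
    """Parse 'name = value' lines into a dict of floats."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        fields = value.split()
        if not sep or not fields:
            continue
        try:
            # Only the first token is numeric; trailing text is ignored.
            counters[name.strip()] = float(fields[0])
        except ValueError:
            continue
    return counters


counters = parse_counters(SAMPLE)
```

Every entry is a single process-wide total, so there is no URL or FQDN dimension that could be turned into a label.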

I also briefly looked into the cache digest, but it seems to answer whether a given URL is cached rather than listing the URLs in the cache. That said, there might be a way to get this information from the Squid cache object, likely by modifying the cache object to include, for example, the top 100 origins or something like that, but that's a large project by itself.

If you have more concrete ideas to implement this feature in this project, I'll be happy to discuss options.

Cheers

@gberche-orange
Author

gberche-orange commented Mar 12, 2024

Thanks @boynux for your analysis.

I started searching for ways to automatically parse the Squid access logs and surface them as metrics. So far I have only found the following (unmaintained) repos:

Would it make sense for the squid-exporter to optionally parse the Squid access log, given an expected, controlled log format and a specified path?
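
A minimal sketch of what such parsing could look like, assuming Squid's default native `access.log` layout (timestamp, elapsed, client, result code/status, size, method, URL, ...). The helper names and the sample lines are made up for illustration, and the hit detection is deliberately simplistic.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Illustrative lines in the native access.log layout (made-up traffic).
SAMPLE = """\
1710000000.123     45 10.0.0.5 TCP_MISS/200 1500 GET http://example.com/index.html - HIER_DIRECT/93.184.216.34 text/html
1710000001.456     12 10.0.0.5 TCP_HIT/200 900 GET http://example.com/logo.png - HIER_NONE/- image/png
1710000002.789    310 10.0.0.6 TCP_TUNNEL/200 5200 CONNECT secure.example.org:443 - HIER_DIRECT/203.0.113.7 -
"""


def host_of(method, target):
    """CONNECT targets are host:port; other requests carry a full URL."""
    if method == "CONNECT":
        return target.rsplit(":", 1)[0].lower()
    return (urlsplit(target).hostname or "unknown").lower()


def aggregate(lines):
    """Count requests, bytes sent to clients, and cache hits per FQDN."""
    stats = defaultdict(lambda: {"requests": 0, "bytes": 0, "hits": 0})
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or not fields[4].isdigit():
            continue  # skip lines that do not match the assumed layout
        result_code, size, method, target = fields[3:7]
        entry = stats[host_of(method, target)]
        entry["requests"] += 1
        entry["bytes"] += int(size)
        # Simplification: any result tag containing HIT counts as a hit.
        if "HIT" in result_code.split("/")[0]:
            entry["hits"] += 1
    return dict(stats)


stats = aggregate(SAMPLE.splitlines())
```

The default layout only records the size delivered to the client, so the "kbytes received" figure from the original request would likely need a custom log format. Pinning the exporter to one controlled format, as suggested above, would keep the parser this small.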

@boynux
Owner

boynux commented Jun 22, 2024

Sorry for the delay. I think each of these has its own pros and cons. I need to find some time to put together a trade-off analysis and make a decision based on that.

The general problem with log analysers is the added complexity in the code base and the dependency on the host filesystem. One (very hand-wavy) way to solve this is to put a separate interface between one of the above services and the exporter, so that the exporter merely reads from that API and exposes the data to the Prometheus instance. I think that's going to be a sizable project by itself.
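
To give a feel for how thin the exporter side of that split could be, here is a sketch that turns a JSON payload from a hypothetical stats service into Prometheus text exposition lines. The payload shape, the metric name `squid_fqdn_requests_total`, and the `fqdn` label are all assumptions; no such API exists in this project.

```python
import json

# Hypothetical response body from a separate log-analysis service.
PAYLOAD = '{"example.com": {"requests": 2}, "secure.example.org": {"requests": 1}}'


def escape_label(value):
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(stats):
    """Render per-FQDN request counts in the text exposition format."""
    lines = ["# TYPE squid_fqdn_requests_total counter"]
    for fqdn in sorted(stats):
        lines.append(
            'squid_fqdn_requests_total{fqdn="%s"} %d'
            % (escape_label(fqdn), stats[fqdn]["requests"])
        )
    return "\n".join(lines) + "\n"


text = render(json.loads(PAYLOAD))
```

In this arrangement the exporter never touches the host filesystem; the log parsing and its complexity would live entirely in the other service.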
