-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use the LWN weekly publication date #2589
Conversation
Why not just store the datetime object on self in parse_index and return |
I have not been able to prove to myself that parse_index() will always be called prior to publication_date(). I have also not found any such guarantees in the documentation. If it is indeed something I can rely on, can you please point me to the right documentation and code? |
publication_date() is called only by create_opf() OPF creation happens |
Hi @kovidgoyal , Implemented in accordance with your suggestion. Because the date now gets parsed, I had to temporarily switch the locale to en_US to prevent breakage in some countries. |
Changing global variables in a multithreaded context is pretty dangerous, instead just parse the date manually, it should be trivial to do in 3 lines of python for a known date format. Something like: def parse_date_with_custom_mapping(date_string):
month_map = {
"January": 1, "February": 2, "March": 3, "April": 4,
"May": 5, "June": 6, "July": 7, "August": 8,
"September": 9, "October": 10, "November": 11, "December": 12
}
month_name, day_year = date_string.split()
day, year = map(int, day_year.split(','))
return datetime(year, month_map[month_name], day) |
Thanks for pointing this out! I rarely build multi-threaded Python applications. I mostly opt for asyncio workqueues, which tend to provide demonstrably better throughput (the only exception being workloads that require little to no synchronization between threads). However, building a custom date parser to avoid a thread safety issue is bad for maintainability? The software landscape is littered with the corpses of those that implement bespoke date / time infrastructure. I will rather solve the thread safety issue and use standard infrastructure for the parsing. |
This is a pretty trivial string to parse but if you dont want to write |
That is exactly what I did. I just switched to dateutil.parser.parse() which is both location independent, thread-safe, and already used by |
Bulk conversion (unfortunately) is a frequent thing when catching up after a holiday or crontab breakage. It is really annoying figuring out the LWN weekly edition reading order if they all have the same title based on the conversion date. Falls back to the current date if the date cannot be parsed. Uses dateutil.parser.parse() for thread-safety.
Borrows a trick from foreignaffairs.recipe to set the publication date in the title as opposed to the conversion date. Bulk conversion (unfortunately) is a frequent thing when catching up after a holiday or crontab breakage. It is really annoying figuring out the LWN weekly edition reading order if they all have the same title based on the conversion date.
The correct way to implement this would have been to override
publication_date()
, but this is not really set-up to be able to derive the publication date from the content. The workaround in foreignaffairs.recipe overridestimefmt
with the publication date string to get around this. I opted for the same approach.