Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_repr and rich breaks logging #655

Open
Eldeeqq opened this issue Sep 25, 2024 · 14 comments
Open

to_repr and rich breaks logging #655

Eldeeqq opened this issue Sep 25, 2024 · 14 comments

Comments

@Eldeeqq
Copy link

Eldeeqq commented Sep 25, 2024

Structlog version: 24.4.0

⭐ First of all - love this project, huge fan, good job.

Hello,

Maybe this is just bad setup on our end, but it was tricky to figure out so I decided to raise it.

We use structlog for logging data from our service in json format, and then uploading it into DataDog. After updating structlog to version 24.4.0 our logging started to break each time we logged an exception ( structlog.get_logger().exception(...))

I was able to trace the root cause to this PR: #627, where the implementation for to_repr changed to:

def to_repr(
    obj: Any,
    max_length: int | None = None,
    max_string: int | None = None,
    use_rich: bool = True, # source of the problem
) -> str:

and I was able to hotfix the the problem on our end by creating our own dict_tracebacks and setting use_rich explicitly.

custom_dict_tracebacks =  ExceptionRenderer(ExceptionDictTransformer(use_rich=False))
Stack Trace
File "/usr/local/lib/python3.12/site-packages/structlog/_base.py", line 165, in _process_event

event_dict = proc(self._logger, method_name, event_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/structlog/processors.py", line 412, in __call__

event_dict["exception"] = self.format_exception(
^^^^^^^^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/structlog/tracebacks.py", line 415, in __call__

trace = extract(
^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/structlog/tracebacks.py", line 276, in extract

key: to_repr(
^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/structlog/tracebacks.py", line 147, in to_repr

obj_repr = rich.pretty.traverse(
^^^^^^^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/rich/pretty.py", line 853, in traverse

node = _traverse(_object, root=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.12/site-packages/rich/pretty.py", line 767, in _traverse

child_node = _traverse(getattr(obj, field.name), depth=depth + 1)
^^^^^^^^^^^^^^^^^^^^^^^^

AttributeError: 'MemoryObjectItemReceiver' object has no attribute 'item'

Since we don't install rich on the server, I'm not really sure how it was able to import it in the first place (my guess is transitive dependency), or why it failed.

But would it make sense to somehow configure the use_rich flag globally?

@hackermandh
Copy link

Do you install bandit? Because that has rich as a dependency (just because they have a one-liner where they wanted to show a loading bar 🥲)

@Eldeeqq
Copy link
Author

Eldeeqq commented Nov 28, 2024

I checked which modules are installed on the server and neither bandit or rich are being installed. Might've been different when I created this issue 🙁

@hynek
Copy link
Owner

hynek commented Nov 30, 2024

⭐ First of all - love this project, huge fan, good job.

Thank you! 💛

We use structlog for logging data from our service in json format, and then uploading it into DataDog. After updating structlog to version 24.4.0 our logging started to break each time we logged an exception ( structlog.get_logger().exception(...))

What you mean here, is that you expect single-line JSON lines and Rich formating breaks them into multiple ones? (is this expected @sscherfke?)

I was able to trace the root cause to this PR: #627, where the implementation for to_repr changed to:

def to_repr(
    obj: Any,
    max_length: int | None = None,
    max_string: int | None = None,
    use_rich: bool = True, # source of the problem
) -> str:

and I was able to hotfix the the problem on our end by creating our own dict_tracebacks and setting use_rich explicitly.

custom_dict_tracebacks =  ExceptionRenderer(ExceptionDictTransformer(use_rich=False))

I don't think that's a hot fix – I think you're using configuration as it's intended? 😅

Since we don't install rich on the server, I'm not really sure how it was able to import it in the first place (my guess is transitive dependency), or why it failed.

Not to be lecturing, but you should use recursive lock files to make this deterministic. 🤓

But would it make sense to somehow configure the use_rich flag globally?

We try to avoid having anything global except the configuration and you were able to work around the problem using regular config, so I'm gonna say everything is working as intended?

Depending on what Stefan says, we can have a look at multi-line log lines or whatever is happening there, but adding a global variable to avoid configuration is not our MO. :)

@Eldeeqq
Copy link
Author

Eldeeqq commented Dec 2, 2024

We use structlog for logging data from our service in json format, and then uploading it into DataDog. After updating structlog to version 24.4.0 our logging started to break each time we logged an exception ( structlog.get_logger().exception(...))

What you mean here, is that you expect single-line JSON lines and Rich formating breaks them into multiple ones? (is this expected @sscherfke?)

Not entirely. Yes, we expect inline JSON. But whenever we logged an exception, another exception was raised while logging it (I'm not sure about the exact trace content since our logs get removed after some time 🙁). This second exception was raised from the to_repr function and wasn't formatted the same way as the others. I suspect this happened because we didn't have sys.excepthook set up properly at the time.

I double-checked the dependencies on the server, and rich is indeed not installed. My hunch is that structlog tried to use rich despite being unable to import it.

I should have explained that better, sorry.

I was able to trace the root cause to this PR: #627, where the implementation for to_repr changed to:

def to_repr(
    obj: Any,
    max_length: int | None = None,
    max_string: int | None = None,
    use_rich: bool = True, # source of the problem
) -> str:

and I was able to hotfix the the problem on our end by creating our own dict_tracebacks and setting use_rich explicitly.

custom_dict_tracebacks =  ExceptionRenderer(ExceptionDictTransformer(use_rich=False))

I don't think that's a hot fix – I think you're using configuration as it's intended? 😅

Yeah, I guess I meant hotfixing the logging within our service by configuring structlog differently 😁

Since we don't install rich on the server, I'm not really sure how it was able to import it in the first place (my guess is transitive dependency), or why it failed.

Not to be lecturing, but you should use recursive lock files to make this deterministic. 🤓

I agree. I'm not OG dev on that project and I'm currently pushing for using uv/poetry.

But would it make sense to somehow configure the use_rich flag globally?

We try to avoid having anything global except the configuration and you were able to work around the problem using regular config, so I'm gonna say everything is working as intended?

I understand. Yes, I managed to configure it correctly. However, as I mentioned at the start - I think the problem was caused by structlog attempting to use rich when it wasn't installed. I haven't investigated thoroughly, but there might be a missing dependency check for rich in structlog.

Depending on what Stefan says, we can have a look at multi-line log lines or whatever is happening there, but adding a global variable to avoid configuration is not our MO. :)

Sure thing, thanks for your time!

@sscherfke
Copy link
Contributor

rich should only be used for generating a nicer repr when it is actually available. There shouldn't be any errors caused by rich not being installed.

I'll take a look at it. A (cleaned) traceback from your logs would be really helpful though. Maybe you can reactivate the old behavior on a test system until you get the desired exception?

@Eldeeqq
Copy link
Author

Eldeeqq commented Dec 2, 2024

Sure thing! I'll try to reproduce the bug and share the trace :)

@sscherfke
Copy link
Contributor

sscherfke commented Dec 6, 2024

This looks like #678 – Rich does not handle exceptions raised during traceback rendering. This applies to the ConsoleRenderer as well.

Something like this might be a fix:

diff --git a/src/structlog/tracebacks.py b/src/structlog/tracebacks.py
index 2331b2f..2607fc3 100644
--- a/src/structlog/tracebacks.py
+++ b/src/structlog/tracebacks.py
@@ -142,29 +142,32 @@ def to_repr(
        implementation.
     """
     if use_rich and rich is not None:
-        # Let rich render the repr if it is available.
-        # It produces much better results for containers and dataclasses/attrs.
-        obj_repr = rich.pretty.traverse(
-            obj, max_length=max_length, max_string=max_string
-        ).render()
-    else:
-        # Generate a (truncated) repr if rich is not available.
-        # Handle str/bytes differently to get better results for truncated
-        # representations.  Also catch all errors, similarly to "safe_str()".
-        try:
-            if isinstance(obj, (str, bytes)):
-                if max_string is not None and len(obj) > max_string:
-                    truncated = len(obj) - max_string
-                    obj_repr = f"{obj[:max_string]!r}+{truncated}"
-                else:
-                    obj_repr = repr(obj)
-            else:
-                obj_repr = repr(obj)
-                if max_string is not None and len(obj_repr) > max_string:
-                    truncated = len(obj_repr) - max_string
-                    obj_repr = f"{obj_repr[:max_string]!r}+{truncated}"
-        except Exception as error:  # noqa: BLE001
-            obj_repr = f"<repr-error {str(error)!r}>"
+        try:
+            # Let rich render the repr if it is available.
+            # It produces much better results for containers and dataclasses/attrs.
+            return rich.pretty.traverse(
+                obj, max_length=max_length, max_string=max_string
+            ).render()
+        except Exception:
+            pass
+    # Generate a (truncated) repr if rich is not available.
+    # Handle str/bytes differently to get better results for truncated
+    # representations.  Also catch all errors, similarly to "safe_str()".
+    try:
+        if isinstance(obj, (str, bytes)):
+            if max_string is not None and len(obj) > max_string:
+                truncated = len(obj) - max_string
+                obj_repr = f"{obj[:max_string]!r}+{truncated}"
+            else:
+                obj_repr = repr(obj)
+        else:
+            obj_repr = repr(obj)
+            if max_string is not None and len(obj_repr) > max_string:
+                truncated = len(obj_repr) - max_string
+                obj_repr = f"{obj_repr[:max_string]!r}+{truncated}"
+    except Exception as error:  # noqa: BLE001
+        obj_repr = f"<repr-error {str(error)!r}>"
 
     return obj_repr

@hynek
Copy link
Owner

hynek commented Jan 4, 2025

So, the idea is to render it ourselves if Rich explodes?

@sscherfke
Copy link
Contributor

Yes, similar to the safe_str() function: https://github.com/hynek/structlog/blob/main/src/structlog/tracebacks.py#L104-L109

@hynek
Copy link
Owner

hynek commented Jan 4, 2025

So safe_repr just announced? 😅

as you might’ve noticed I’m fighting myself currently thru the backlog. Given we’ve had boneless than two (maybe three) reports in this vein, I thinking would be good to avoid it for 25.1.0.

@sscherfke
Copy link
Contributor

sscherfke commented Jan 4, 2025

The custom to_repr hook that is used when Rich is not available already has a similar safeguard, but it is not used in case Rich is used (and fails): https://github.com/hynek/structlog/blob/main/src/structlog/tracebacks.py#L169

@sscherfke
Copy link
Contributor

Looked at the code again. I think that we can fix this relatively easy for the dict transformer (for json logging). But fixing it for the console renderer is not that easy, because we basically do a print(rich.traceback.Traceback.from_exception(exc)), so there is no way to add a safe guard around their to_repr().

The probability that JSON logging is used in someones prod env is relatively high so imho we should fix this use case. If any local object has a broken __repr__() implementation, this would break the exception log which would make debugging exceptions in a prod system rather hard. ;-)

@hynek
Copy link
Owner

hynek commented Jan 5, 2025

Looked at the code again. I think that we can fix this relatively easy for the dict transformer (for json logging). But fixing it for the console renderer is not that easy, because we basically do a print(rich.traceback.Traceback.from_exception(exc)), so there is no way to add a safe guard around their to_repr().

Falling back to a simple formatter in dev would be better than a crashing I guess?

@sscherfke
Copy link
Contributor

Tried to write a regression test for this and realized that rich already has a safe guard for erroneous __repr__() implementations.

And the actual error occurs when rich traverses the objects graph. In this case, it tries to iterate over the fields of a dataclass: https://github.com/Textualize/rich/blob/v13.9.4/rich/pretty.py#L788

For some reasons (that I didn't look up), the MemoryObjectItemReceiver instance has no attribute item even though the fields() function returned a field with that name.

So it is definitely a rich issue and we can't really to anything about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants