-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Model additional flow steps for the lxml framework #18185
base: main
Are you sure you want to change the base?
Conversation
652a9ca
to
b523be2
Compare
b523be2
to
5c8ef28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 2 out of 5 changed files in this pull request and generated no suggestions.
Files not reviewed (3)
- python/ql/lib/semmle/python/frameworks/Lxml.qll: Language not supported
- python/ql/test/library-tests/frameworks/lxml/InlineTaintTest.expected: Language not supported
- python/ql/test/library-tests/frameworks/lxml/InlineTaintTest.ql: Language not supported
Comments skipped due to low confidence (16)
python/ql/test/library-tests/frameworks/lxml/taint_test.py:50
- The taint tracking is missing through the call to
list
. Ensure the test case covers this scenario to verify taint propagation.
list(elem.findall("b"))[0].text, # $ MISSING:tainted # Type tracking is not followed through call to `list`
python/ql/test/library-tests/frameworks/lxml/taint_test.py:54
- The taint tracking is missing through the call to
next
onelem.getiterator()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem.getiterator()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:62
- The taint tracking is missing through the call to
next
onelem.iter()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem.iter()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:64
- The taint tracking is missing through the call to
next
onelem[0].iterancestors()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem[0].iterancestors()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:66
- The taint tracking is missing through the call to
next
onelem.iterchildren()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem.iterchildren()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:68
- The taint tracking is missing through the call to
next
onelem.iterdescendants()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem.iterdescendants()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:70
- The taint tracking is missing through the call to
next
onelem.iterfind("b")
. Ensure the test case covers this scenario to verify taint propagation.
next(elem.iterfind("b")).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:72
- The taint tracking is missing through the call to
next
onelem[0].itersiblings()
. Ensure the test case covers this scenario to verify taint propagation.
next(elem[0].itersiblings()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:82
- The taint tracking is missing for
ch.text
in the for loop. Ensure the test case covers this scenario to verify taint propagation.
ch.text # $ MISSING: tainted # API node getASubscript() appears to not consider things like for loops
python/ql/test/library-tests/frameworks/lxml/taint_test.py:93
- The taint tracking is missing through the call to
next
ontree.getiterator()
. Ensure the test case covers this scenario to verify taint propagation.
next(tree.getiterator()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:95
- The taint tracking is missing through the call to
next
ontree.iter()
. Ensure the test case covers this scenario to verify taint propagation.
next(tree.iter()).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:97
- The taint tracking is missing through the call to
next
ontree.iterfind("b")
. Ensure the test case covers this scenario to verify taint propagation.
next(tree.iterfind("b")).text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:109
- The taint tracking is missing for
elem2.text
from the tuple return value. Ensure the test case covers this scenario to verify taint propagation.
elem2.text, # $ MISSING:tainted # Type is not tracked to the tuple return value
python/ql/test/library-tests/frameworks/lxml/taint_test.py:110
- The taint tracking is missing for
elem3.text
from the tuple return value. Ensure the test case covers this scenario to verify taint propagation.
elem3.text, # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:119
- The taint tracking is missing for
tree2.getroot()
. Ensure the test case covers this scenario to verify taint propagation.
tree2.getroot() # $ MISSING:tainted
python/ql/test/library-tests/frameworks/lxml/taint_test.py:126
- The taint tracking is missing for
el.text
in the iterparse loop. Ensure the test case covers this scenario to verify taint propagation.
el.text, # $ MISSING:tainted
Tip: Copilot only keeps its highest confidence comments to reduce noise and keep you focused. Learn more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice modeling, and great use of the API graph. I like the tests also, especially the ones about what type tracking can and cannot do. I think we should document this more clearly. Currently, there are some comments to that effect in the type tracking tests.
override string getFormat() { result = "XML" } | ||
} | ||
// TODO: ElementTree.write can write to a file-like object; should that be a flow step? | ||
// It also can accept a filepath which could be a path injection sink. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can add that as a sink directly by extending the right concept? (Probably out of scope for this PR)
elem.get("at"), # $ tainted | ||
elem.getchildren()[0].text, # $ tainted | ||
elem.getiterator(), # $ tainted | ||
next(elem.getiterator()).text, # $ MISSING:tainted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next
is currently modeled as a flow summary, but not a "simple" one, so we cannot type track through it (it uses type tracking to identify the calls).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything we could expect to be able to type track through these cases of methods returning iterators?
(list(X)[0]
doesn't work; and neither do for loops as one of the other tests here shows)
|
||
ensure_tainted( | ||
func2(tree), # $ tainted | ||
func2(tree).text, # $ MISSING:tainted - type tracking not tracked through flow preserving calls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is surprising, should that be func2(tree).getroot().text
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should; fixed. However the flow is still missing.
func2(tree).text, # $ MISSING:tainted - type tracking not tracked through flow preserving calls | ||
func3(tree).text, # $ MISSING:tainted - this includes if there is a type hint annotation on the return | ||
typing.cast(ET.ElementTree, tree), # $ tainted | ||
typing.cast(ET.ElementTree, tree).text, # $ MISSING:tainted - this includes for flow summary models |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will only work for "simple" summaries; those that do not use type tracking to identify calls. (And, again, is .text
the right test here? I do not see tree.text
being tainted.)
Co-authored-by: yoff <[email protected]>
Models additional flow steps for
Element
andElementTree
objects from thelxml.etree
module, to track taint through parsed XML elements.