Shared: Add library for filepath normalization #14500

Merged 4 commits on Oct 23, 2023
@@ -0,0 +1,10 @@
| | . |
| ./a/b/c/../d | a/b/d |
| / | / |
| /a/b/../c | /a/c |
| /a/b/c/../../d/ | /a/d |
| a/.. | . |
| a/b/../c | a/c |
| a/b//////c/./d/../e//d// | a/b/c/e/d |
| a/b/c | a/b/c |
| a/b/c/../../../../d/e/../f | ../d/f |
Contributor

Might be good to add a similar test but with an absolute path (e.g. `/a/b/c/../../../../d/e/../f` should resolve to `/d/f`).
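
As a point of comparison outside CodeQL, the expected values in the table above agree with POSIX-style lexical normalization. The sketch below uses Python's standard `posixpath.normpath` purely as a stand-in reference (an assumption for illustration; this PR neither uses nor mentions it) to check the ten listed cases and the absolute-path case suggested in this comment. The names `TABLE`, `SUGGESTED` and `mismatches` are hypothetical.

```python
import posixpath

# Input -> expected pairs copied from the test table in this PR.
TABLE = {
    "": ".",
    "./a/b/c/../d": "a/b/d",
    "/": "/",
    "/a/b/../c": "/a/c",
    "/a/b/c/../../d/": "/a/d",
    "a/..": ".",
    "a/b/../c": "a/c",
    "a/b//////c/./d/../e//d//": "a/b/c/e/d",
    "a/b/c": "a/b/c",
    "a/b/c/../../../../d/e/../f": "../d/f",
}

# The extra absolute-path case proposed in the review comment.
SUGGESTED = {"/a/b/c/../../../../d/e/../f": "/d/f"}


def mismatches(cases):
    """Return the inputs whose posixpath.normpath result differs from the expected value."""
    return [p for p, want in cases.items() if posixpath.normpath(p) != want]


if __name__ == "__main__":
    print(mismatches(TABLE))
    print(mismatches(SUGGESTED))
```

Under POSIX rules a `..` at the root of an absolute path is a no-op, which is why the reference yields `/d/f` for the suggested input. This says nothing about what the QL library itself returns for that input.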

14 changes: 14 additions & 0 deletions csharp/ql/test/library-tests/utils/FilepathNormalizeTest.ql
@@ -0,0 +1,14 @@
import codeql.util.FilePath

class FilepathTest extends NormalizableFilepath {
  FilepathTest() {
    this =
      [
        "a/b/c", "a/b/../c", "a/..", "/a/b/../c", "a/b/c/../../../../d/e/../f", "", "/",
        "./a/b/c/../d", "a/b//////c/./d/../e//d//", "/a/b/c/../../d/"
      ]
  }
}

from FilepathTest s
select s, s.getNormalizedPath()
4 changes: 4 additions & 0 deletions shared/util/change-notes/2023-10-13-filepath-normalization.md
@@ -0,0 +1,4 @@
---
category: feature
---
* Added `FilePath` API for normalizing filepaths.
88 changes: 88 additions & 0 deletions shared/util/codeql/util/FilePath.qll
@@ -0,0 +1,88 @@
/** Provides a utility for normalizing filepaths. */

/**
 * A filepath that should be normalized.
 *
 * Extend to provide additional strings that should be normalized as filepaths.
 */
abstract class NormalizableFilepath extends string {
  bindingset[this]
  NormalizableFilepath() { any() }

  /** Gets the `i`th path component of this string. */
  private string getComponent(int i) { result = this.splitAt("/", i) }

  /** Gets the number of path components of this string. */
  private int getNumComponents() { result = strictcount(int i | exists(this.getComponent(i))) }

  /** In the normalized path starting from component `i`, counts the number of `..` segments that path starts with. */
  private int dotdotCountFrom(int i) {
    result = 0 and i = this.getNumComponents()
    or
    exists(string c | c = this.getComponent(i) |
      if c = ""
      then result = this.dotdotCountFrom(i + 1)
      else
        if c = "."
        then result = this.dotdotCountFrom(i + 1)
        else
          if c = ".."
          then result = this.dotdotCountFrom(i + 1) + 1
          else result = (this.dotdotCountFrom(i + 1) - 1).maximum(0)
    )
  }

  /** In the normalized path up to (excluding) component `i`, counts the number of non-`..` segments that path ends with. */
  private int segmentCountUntil(int i) {
    result = 0 and i = 0
    or
    exists(string c | c = this.getComponent(i - 1) |
      if c = ""
      then result = this.segmentCountUntil(i - 1)
      else
        if c = "."
        then result = this.segmentCountUntil(i - 1)
        else
          if c = ".."
          then result = (this.segmentCountUntil(i - 1) - 1).maximum(0)
          else result = this.segmentCountUntil(i - 1) + 1
    )
  }

  /** Gets the `i`th component if that component should be included in the normalized path. */
  private string part(int i) {
    result = this.getComponent(i) and
    result != "." and
    if result = ""
    then i = 0
    else (
      result != ".." and
      0 = this.dotdotCountFrom(i + 1)
      or
      result = ".." and
      0 = this.segmentCountUntil(i)
    )
  }

  /**
   * Gets the normalized filepath for this string.
   *
   * Normalizes `..` paths, `.` paths, and multiple `/`s as much as possible, but does not normalize case, resolve symlinks, or make relative paths absolute.
Contributor

Should this also note that only local paths are handled (correctly)? E.g. the UNC path `//server/public/../private` should resolve to `//server/public/private` (you can't `..` out of a share for security reasons), but AFAICT this would resolve it to `/server/private`.
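
To illustrate the distinction drawn in this comment without relying on CodeQL, the sketch below contrasts two Python standard-library normalizers: `ntpath.normpath` (Windows rules, which treat `//server/public` as a UNC share root) and `posixpath.normpath` (POSIX rules, with `/` as the only separator). Both are stand-ins chosen for illustration and are not part of this PR; the helper `as_forward_slashes` is hypothetical.

```python
import ntpath
import posixpath

UNC = "//server/public/../private"


def as_forward_slashes(p: str) -> str:
    """Render a Windows-style result with `/` so the two results are easy to compare."""
    return p.replace("\\", "/")


# Windows rules: `..` cannot climb above the share root `//server/public`.
windows_view = as_forward_slashes(ntpath.normpath(UNC))

# POSIX rules: `..` simply cancels the preceding component `public`.
# Note: posixpath keeps exactly two leading slashes, whereas the QL doc comment
# says repeated `/`s are collapsed, so only the `..` handling is comparable here.
posix_view = posixpath.normpath(UNC)

if __name__ == "__main__":
    print(windows_view)
    print(posix_view)
```

The point of the comparison is that a purely lexical, slash-only normalizer has no notion of a share root, so `public` is treated as an ordinary directory that `..` can cancel.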

   *
   * The normalized path will be absolute (begin with `/`) if and only if the original path is.
   *
   * The normalized path will not have a trailing `/`.
   *
   * Only `/` is treated as a path separator.
Contributor

Does this mean that, e.g., the absolute path `C:\foo\bar\baz\../quux` would get normalised to `quux`?

Contributor

Or maybe this warrants a stronger caution that DOS/Windows paths in general are unsupported, as AFAICT `C:/..` and even `C:foo/..` incorrectly resolve to `.` despite using `/` separators.

Contributor Author

Seems reasonable to address these cases in the documentation. However, as this work was semi-blocking for #14343, and these cases are not relevant for this use case, I have decided to merge anyway. I will make these doc improvements in a follow-up PR.
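
The drive-letter cases raised in this thread can be illustrated outside CodeQL with a small comparison. The sketch below uses Python's `posixpath.normpath` as a stand-in for any normalizer that treats only `/` as a separator, and `ntpath.normpath` as a Windows-aware reference. Both are assumptions for illustration, not part of this PR, and the helper `compare` is hypothetical.

```python
import ntpath
import posixpath

# Paths mentioned in the review thread.
CASES = ["C:/..", "C:foo/..", r"C:\foo\bar\baz\../quux"]


def compare(path: str) -> tuple:
    """Return (slash-only POSIX result, Windows-aware result) for one path."""
    return posixpath.normpath(path), ntpath.normpath(path)


if __name__ == "__main__":
    for p in CASES:
        print(p, compare(p))
```

With slash-only rules, `C:` and `C:foo` are ordinary components that a following `..` cancels, giving `.`, while `C:\foo\bar\baz\..` is a single component that is not equal to `..`, so the backslash path comes back unchanged. The Windows-aware reference instead keeps the drive prefix in all three cases.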

   */
  string getNormalizedPath() {
    exists(string norm | norm = concat(string s, int i | s = this.part(i) | s, "/" order by i) |
      if norm != ""
      then result = norm
      else
        if this.matches("/%")
        then result = "/"
        else result = "."
    )
  }
}
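
To make the recursion in `dotdotCountFrom`, `segmentCountUntil` and `part` easier to follow, here is a minimal Python model of the predicates above. It is a sketch, not the library: the names `normalize`, `dotdot_count_from`, `segment_count_until` and `keep` are hypothetical, and it assumes that Python's `str.split("/")` yields the same components as QL's `splitAt("/", i)`.

```python
from functools import lru_cache


def normalize(path: str) -> str:
    """Python model of the getNormalizedPath() logic shown in this diff."""
    comps = path.split("/")  # assumed to mirror QL's splitAt("/", i)
    n = len(comps)

    @lru_cache(maxsize=None)
    def dotdot_count_from(i: int) -> int:
        # Leading `..` segments of the normalized suffix starting at component i.
        if i == n:
            return 0
        c, rest = comps[i], dotdot_count_from(i + 1)
        if c in ("", "."):
            return rest
        if c == "..":
            return rest + 1
        return max(rest - 1, 0)  # a named segment absorbs one following `..`

    @lru_cache(maxsize=None)
    def segment_count_until(i: int) -> int:
        # Trailing non-`..` segments of the normalized prefix before component i.
        if i == 0:
            return 0
        c, prev = comps[i - 1], segment_count_until(i - 1)
        if c in ("", "."):
            return prev
        if c == "..":
            return max(prev - 1, 0)  # a `..` cancels one preceding named segment
        return prev + 1

    def keep(i: int) -> bool:
        # Mirrors `part(i)`: should component i appear in the output?
        c = comps[i]
        if c == ".":
            return False
        if c == "":
            return i == 0  # only a leading empty component (absolute path) survives
        if c == "..":
            return segment_count_until(i) == 0
        return dotdot_count_from(i + 1) == 0

    norm = "/".join(comps[i] for i in range(n) if keep(i))
    if norm != "":
        return norm
    return "/" if path.startswith("/") else "."


if __name__ == "__main__":
    for p in ["a/b/../c", "/a/b/c/../../d/", "a/b/c/../../../../d/e/../f", ""]:
        print(repr(p), "->", repr(normalize(p)))
```

The model reproduces all ten rows of the expected-output table. For the absolute-path case raised in review, `/a/b/c/../../../../d/e/../f`, it returns `/../d/f` rather than `/d/f`, because a `..` is kept whenever no named segment precedes it, regardless of whether the path is absolute. That is a property of this sketch and has not been checked against the QL implementation.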