> POPL 2025 Paper #239 Reviews and Comments
> ===========================================================================
> Paper #239 Unifying compositional verification and certified compilation
> with a three-dimensional refinement algebra
We are grateful for the reviewers' hard work and valuable feedback.
Since cross-cutting concerns include the motivation and framing of our contributions,
the relationship of our work to other relevant research, and the status of our
implementation, we will discuss these issues first. We then include a proposed
(...)
we address each reviewer's remaining questions and observations individually.
> Review #239A
> ===========================================================================
>
> Overall merit
> -------------
> 3. Weak accept
>
> Reviewer expertise
> ------------------
> 4. Expert
>
> Paper summary
> -------------
> This paper systematizes some recent results in modular verification of software, producing a unified framework for proving interactive systems with modules written in different languages and compiled by verified compilers. The central ideas are very much in the vein of interaction trees, but there are a lot of details to get right.
Regarding the comparison with interaction trees,
we hope the discussion under §II above was helpful.

> (...)
>
> Comments for authors
> --------------------
> This sort of unified theory is critical to application at scale of the most rigorous formal methods, and I'm impressed at the thoroughness of the design effort reported in the paper. One strange thing is that the paper doesn't include an evaluation section; we don't get a clear description of which verifications have been carried out, which is handy to validate design choices.
We intend §6
to demonstrate the suitability of our framework
for a range of practical verification tasks,
and to serve as an evaluation of sorts,
but we acknowledge the implementation should be discussed in more detail.
We hope the clarifications under §III above were helpful.

> The previous papers that this project built on were notable for significant mechanized case studies of software artifacts of inherent interest, and the lack of such validation here gives the work less appeal. I also think the paper oversells the novelty of addressing this particular combination of problems.
We have tried to articulate in the two paragraphs at ll. 85--97 that
existing work does address this combination of problems
(certainly the Bedrock work you reference should be mentioned there),
and to explain why we believe a more systematic approach
would be beneficial,
but we will clarify our contributions and better frame our work
per the discussion under §I above.

> # Crisply formulating what new problem is being solved
>
> (...)
> 2. This paper and the ones it builds most closely on work with CompCert and roughly full-ISO-standard C, while Bedrock2 fixes a simple C subset with a simple semantics (e.g., assembly-style flat memory model).
> 3. The formalism supports a flexible notion of refinement even for infinitely reactive modules, which may allow well-structured proof of more interestingly interactive systems than in Bedrock2, which has only been shown to support interactive systems as event loops where each event handler terminates.
> 4. It seems that some higher-level results about multiple programs piped together UNIX-style can be proved, though it is unclear, from the writing of the paper, if those mechanized proofs have been completed. (An example like Fig. 1's is easily verified modularly [down to program binaries with their separate specifications] with the techniques that the cited papers demonstrated, though I don't think they demonstrated reasoning about groups of processes running simultaneously. I would not be surprised either way if an extra layer to reason about process compositions turned out to be simple or to require serious new research.)
Thank you for this detailed comparison.
We will incorporate these points into
the relevant parts of the *Introduction* and *Related Work* sections,
and hope you found our above discussion of the PLDI'24 Erbsen et al. work useful as well.

> Though actually, real UNIX process compositions involve concurrent executions that synchronize through byte streams, while this paper seems to study a restricted notion of piping that looks like method calls across processes, where only one process runs instructions at a time. In that sense, Fig. 1 may be misleading when it comes to what class of systems can be verified with the new ideas. Presumably generalization to truly concurrent process execution is nontrivial future work.
Yes, this is true,
and we will clarify that we do not claim
to provide a complete and accurate model of Unix processes.
Rather, our goal with Fig. 1 is to illustrate some of the issues that come up
when the horizon of verification is pushed beyond the boundary
of a fixed language or model,
with an example that remains simple and that
most readers would find familiar and easy to understand.

> So, overall, I think the framework presented here may represent a strong contribution, but the conditions of its novelty and payoff over other ideas should be reformulated, in my opinion.
>
(...)
They are very flexible in this regard and can handle:
* *relational* ($n$-to-$m$) abstraction relationships between
source- and target-level moves (questions and answers),
which are needed in particular to formulate
CompCertO simulation conventions;
* relationships that evolve through time and take into account
the history of the computation;
* different degrees of state encapsulation between the source
(...)
and escape the need for full-blown dual nondeterminism.
> L1180: "languauges"
>
> L1202: "consist solely events": missing "of"
Thank you for noting those;
we will be sure to fix them.

> Review #239B
> ===========================================================================
>
> Overall merit
> -------------
> 3. Weak accept
>
> Reviewer expertise
> ------------------
> 3. Knowledgeable
>
> Paper summary
> -------------
> This paper presents a refinement system that allows the combination of various proofs at different levels of abstraction. The system uses an almost game semantics like approach with traces over questions and answers. The program receives questions from its context and can provide an answer or a question in response. If it questions the context, then the context must provide an answer before the program continues. This is like the modelling of higher-order functions in game semantics.
>
> The system provides operations to compose components by
> * routing questions from one component to another component, and
> * by allowing two components to handle the questions.
>
> It also provides refinement on components by relating actions of one component to actions of another component, and by allowing internal state access to be hidden.
>
> The overall system seems quite compelling, however, I found the paper hard to follow in places, and do not feel confident that I fully understood the results. I would love to read an improved version of this paper.
> (...)
>
> Comments for authors
> --------------------
> (...)
Thank you for articulating this and for your suggestion on how to address it.
We agree this would improve the presentation significantly
and would adopt this approach in the revised version.

> # Related work
>
> (...)
We acknowledge that the discussion of related work and the broader context
is insufficient and we would need to improve this aspect in the final version.

We hope our commentary in §II above was helpful
and would be sure to discuss all three models you mention.

> # Minor comments
>
> Line 275: _strategy_ it is not clear what this is at this point. You don't provide enough information here to get an intuition of what a strategy is. I think you need to give more semantic definition at this point. I was wondering is it a state machine, a set of traces/trees of actions, a game?
>
You are right; we will include a high-level overview of basic game semantics notions at this point.
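
To give the flavor of that overview: a strategy in our setting is a
prefix-closed set of traces (plays), rather than a state machine or a tree,
although transition systems are one convenient way of denoting such sets.
The sketch below is in Lean rather than our actual Coq definitions, and it
elides the alternation discipline imposed on real plays:

```lean
-- An effect signature: a set of questions, each with its set of answers.
structure Sig where
  question : Type
  answer   : question → Type

-- The moves of a strategy from E to F: it fields questions of F and may
-- ask questions of E along the way; each question can later be answered.
inductive Move (E F : Sig) where
  | oq : F.question → Move E F                     -- environment asks
  | pq : E.question → Move E F                     -- strategy asks
  | oa : (q : E.question) → E.answer q → Move E F  -- environment answers
  | pa : (q : F.question) → F.answer q → Move E F  -- strategy answers

-- A strategy is a prefix-closed set of finite plays:
-- if a trace belongs to the strategy, so does every prefix of it.
def Strategy (E F : Sig) :=
  { σ : List (Move E F) → Prop // ∀ s t, σ (s ++ t) → σ s }
```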

> Fig 5. This ring buffer can overflow. You should say why this doesn't matter.
>
Right. The available space is sufficient to implement a queue with capacity at most $N$.
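We will make the argument explicit in the text.
In outline, using our own hypothetical names rather than the code of Fig. 5:
cells are indexed modulo $N$, and an enqueue can overwrite a pending element
only when $N$ elements are already pending, a situation which a queue of
capacity at most $N$ never creates. A minimal sketch:

```lean
-- Hypothetical counter-based ring buffer (our names, not those of Fig. 5).
-- `data` is assumed to hold exactly N cells (N > 0), e.g. built with
-- `Array.mkArray N x₀`; `front`/`back` count dequeues/enqueues so far.
structure RingBuf (α : Type) (N : Nat) where
  data  : Array α
  front : Nat
  back  : Nat

-- Enqueue writes at position `back % N`.
def RingBuf.enq {α : Type} {N : Nat} (b : RingBuf α N) (x : α) : RingBuf α N :=
  { b with data := b.data.set! (b.back % N) x, back := b.back + 1 }

-- Dequeue reads at position `front % N`.
def RingBuf.deq {α : Type} {N : Nat} [Inhabited α] (b : RingBuf α N) :
    α × RingBuf α N :=
  (b.data.get! (b.front % N), { b with front := b.front + 1 })

-- Pending elements occupy the cells `front % N, ..., (back - 1) % N`.
-- `enq` can overwrite a pending element only when `back - front = N`
-- already holds, and a client maintaining `back - front ≤ N` (a queue of
-- capacity at most N) never calls `enq` in that situation.
```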

> Line 389 "$E_1$ and $E_1$"
>
> Line 417: $\iota$ what is this? Is it an injection into a sum type?
>
Yes! We will clarify.
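Roughly, the clarification would read: for a sum of effect signatures
$E_1 \oplus E_2$,

$$\iota_i : E_i \to E_1 \oplus E_2 \qquad (i = 1, 2)$$

denote the canonical injections, tagging each question with the summand
it originates from.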

> Line 567 $\underline{m}n\setminus \sigma$ is this all the traces in $\sigma$ which begin $\underline{m}n$ with that part removed from the trace? You need to explain this here.
>
Yes, sorry for this oversight. We will explain.
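For reference, the intended reading is exactly the one you describe:

$$\underline{m}\,n \setminus \sigma \;=\; \{\, s \mid \underline{m}\,n\,s \in \sigma \,\},$$

the set of suffixes $s$ of the plays of $\sigma$ that begin with the moves
$\underline{m}$ and $n$. We will state this explicitly.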

> Line 579: $\underline{m}_1 \in \sigma$ means there exists a starting with $\underline{m}_1$. This works because the strategies are prefix closed?
>
Correct.

> Line 579 $\underline{m}_1$ I think really you should have $\underline{m_1}$ as you have really defined underlining as part of the syntax, so you are applying it to the $m_1$ identifier.
>
This is a good point. We will do that.

> Line 617: "until pays of" should be "plays"
>
> Line 644 $q_2 \setminus \tau$ I guess it is okay for this to be the empty set, as this will never match one of the $epsilon$ cases?
>
Yes! If $\tau$ does not trigger the move $q_2$, this will be empty and cause the simulation to fail.

> Line 693: $\iota$ assuming this is some form of injection into a sum type, why is it only required on the first sum type. Why are their no injections on $m_2$, $n_1$ and $n_2$?
>
Sorry, we mixed up this definition with the tupling of
$R_1 : E \leftrightarrow F_1$ and
$R_2 : E \leftrightarrow F_2$ into
$\langle R_1, R_2 \rangle : E \leftrightarrow F_1 \oplus F_2$.
We will sort things out.

> Line 705: "in the Appendix A", drop either "the" or "A".
>
> Line 718: $@$ is this just syntax for pairing here?
>
In $m@u$, yes! We will clarify.

> Line 900: $Y^{s_1}$: here $s_1$ is from $S$, but I was expecting the superscript to be from $A$ based on the type?
>
We will clarify by describing $Y$ in the list above (l. 898) as
"a family $(Y^s)_{s \in S}$ of relations $Y^s \subseteq A^\bullet \times S$".

> Line 989: "mange"
>
> Line 989: "memory shared passed"
Thank you for the remaining issues you found;
we will be sure to correct them.

> Review #239C
> ===========================================================================
>
> Overall merit
> -------------
> 3. Weak accept
>
> Reviewer expertise
> ------------------
> 3. Knowledgeable
>
> Paper summary
> -------------
> This paper proposes a new compositional semantics framework, which allows us to reason about multi-language systems, where each language has its own (operational) semantics and may or may not be compiled to another language, and where programs in various languages interact with each other. The problem is that although we know how to reason about one language in isolation, or about a compiler from one language to another, or even about interactions between abstract processes, there is no universal formalism that allows us to do all these uniformly. In that vein, the paper introduces a generic semantic model based on refinement squares, which combines horizontal, vertical and spatial composition. The model is shown to be versatile enough to capture / embed various semantics and verification proofs in the context of the CompCert certified compilation framework. The work has been formalized in Coq (but the reviewers cannot see it).
>
> To describe interfaces between components, the paper uses effect signatures, which are essentially sets E of questions with predefined sets of answers, ar: E -> Set, which can be regarded as black-boxes that one can interact with through a given interface.
>
> To describe the behavior of program components, strategies are used, where a strategy L connects two effect signatures, say L : E -->> F. The intuition is that a strategy abstractly captures what a program component does by telling how to model operations / questions of F using operations of E. Strategies can be naturally composed.
>
> To capture the fact that a component can be viewed at different levels of abstraction by other components in the system, a new notion of refinement convention between effect signatures is introduced, which generalizes an existing notion of simulation convention used in CompCertO.
>
> The three ingredients above lead to a natural notion of refinement square / property / proof, which says that two strategies at two different levels of abstraction are strongly connected, in that no matter if one of the strategies is applied first followed by the other refinement abstraction, or viceversa, we get the same behavior.
>
> Now refinement squares can combine horizontally along refinement edges, or vertically along strategy edges, or spatially by adding structural / state framing, and this way have a uniform way to capture various formal verification properties of interest and compose them.
>
> The work is formalized in Coq, which supposedly means that all the theorems about CompCert, CompCertO, CompCertX, and CLight claimed in the paper are mechanically proved. If I understand it correctly, one particular aspect of this work is that the existing operational semantics of the involved languages does not need to be modified. If this is indeed true, then this is a major benefit.
> (...)
>
> Comments for authors
> --------------------
> The paper was pleasant to read initially, but it became heavier and heavier later on. That is expected normally, but what made it heavier to digest, I believe, is the lack of motivation once the technical details started. The example in Fig 1 is very simple and can be handled many different ways using existing solutions. There may not be any one particular approach that handles the example completely, but combinations of approaches and tools can certainly work together to prove it, even modulo acceptable assumptions. Then the discussion jumped to CompCertO and CLight and claimed that the same approach applies to those. This creates a gap for the reader who does not know those semantics and their challenges. For example, I am not convinced that their simulation convention was not good enough and required to be generalized into a refinement convention.
>
(We hope the discussion of CompCertO under §II above was useful in this regard.)

> Once the gap happened, then it was harder and harder to build a mental model for the refinement squares that stayed for the rest of the paper. Since there are so many different ways to construct such refinement squares, one wonders what is the benefit of enforcing them first place. Why not let formal verification engineers formalize the various levels of abstraction in their adhoc way, on a case by case basis. For the example in Fig 1, then one can certainly formalize the semantics of read/write buffers and external interaction, etc., and then still prove the desired property somehow compositionally, using the existing semantics of x86 and C and the guarantees offered by CompCert. Yes, it could be a bit more tedious, but it gives the verification engineer full freedom of choice for abstractions, without having to obey the required formalism with refinement squares, refinement conventions, etc.
>
>
> I give the paper a positive score because I believe this line of work is important and generalizations like those proposed are important. I have reservations, though, because: 1. the Coq code was not available, so we don't know how complex the integration with CompCert was and how non-intrusive wrt the existing operational semantics; and 2. other applications outside of C were not attempted. Also, at a very high level, I am not convinced that formal verification is not more broadly used because of missing types of compositionality that the paper resolves; lack of automation and heavy formalisms trump compositionality, I believe.
Thank you for articulating this.
We will do our best to improve the presentation and
hope that our discussion in the main part of this response
addresses some of your objections and concerns.

> Finally, I have a curiosity question: does the framing property really hold for the actual C language? Functions like malloc have different behaviors depending on how much memory is available, no?
The CompCert memory model does not enforce any availability constraints
(...)
