Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add openvpn experiment spec #293

Merged
merged 5 commits into from
Jun 6, 2024

Conversation

ainghazal
Copy link
Contributor

@ainghazal ainghazal commented Mar 7, 2024

Checklist

Description

Add initial spec for the openvpn experiment.
This proposal supersedes a previous version (#255).
ICMP Pings and URLGrabber over the tunnel are not included, this experiment just does OpenVPN handshakes at the moment.

@ainghazal ainghazal force-pushed the openvpn-spec-2024 branch from 880df94 to ef370bc Compare March 7, 2024 21:39
@ainghazal
Copy link
Contributor Author

For reference: https://explorer.ooni.org/m/20240307214346.366779_MX_openvpn_7df887fc7cd3836b

"success_any": true,
"inputs": [
"openvpn://51.15.187.53:1194/udp/?provider=riseup",
"openvpn://51.15.187.53:1194/tcp/?provider=riseup"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it perhaps not possible to promote these inputs keys to a top level input key and write inside of it a single entry?

How about instead of having a list that contains in it both tcp and udp, you write in the input key the address and provider name and specify which protocols are enabled by setting an option?

For example this could look like:

{
  "input": "openvpn://51.15.187.53:1194/?provider=riseup",
  "options": [
      "UDPEnabled=true",
      "TCPEnabled=true"
  ]
}

This way we can maintain the semantic of input specifies the target of the measurement and integrate easily with the existing tooling, such as OONI Explorer, allowing users to search for things, while at the same allowing to parse which capability has been tested via the experiment by optionally parsing the options field.

That's consistent, for example, with what the URLGetter test does in setting the HTTP3Enabled=true option (see: https://explorer.ooni.org/m/20240321113140.297285_VE_urlgetter_23152e2fc6114c42)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To facilitate querying by domain in explorer, we also discussed changing the input format to look like:

openvpn://riseup.corp/?address=51.15.187.53:1194

Copy link
Contributor Author

@ainghazal ainghazal Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been trying to implement the separation of endpoint and transport.

I think this also makes sense because it will make probably very quick to see patterns like whether a given IP is blocked on all known ports (and therefore plausible that it's the IP that's blocked).

However, unless I'm mistaken, the current semantics for Options are parameters that come from command line (or oonirun descriptors), and are set globally for a single run, not something that can be parametrized for a list of inputs.

My current implementation follows more or less the loadOrQueryBackend implementation in engine.InputLoader: https://github.com/ooni/probe-cli/blob/master/internal/engine/inputloader.go#L302 and so it fetches a list of URL-like "inputs".

On the client implementation, I can convert these lists of inputs into an intermediate endpoint representation, but I think there'll be some friction in trying to modify the options for each input (I might be wrong).

If for indexing/querying purposes it doesn't need to be an Option, I could treat it like a per-endpoint parameter, in a similar way that it's done for Tor Targets: https://github.com/ooni/probe-cli/blob/master/internal/model/ooapi.go#L127 and then treat it as an internal "parameter" specific to the openvpn (or wg) experiment.

Conceptually, it would make sense to treat the IP address as the primary index for the endpoint, and pass tuples of (port, transport) as different variations to try for a single input:

api_input: {
  input: "openvpn://riseup.corp/?address=51.15.187.53",
  params: [
    {port: 1194, transport: tcp},
    {port: 1194, transport: udp},
    {port: 443, transport: tcp},
    {port: 443, transport: udp},
}

In this way, and for a single IP endpoint input, the experiment would attempt 4 handhshakes, per port-transport combination (and they'd be indexed by the transaction ID in the respective test key entries).

If this facilitates analysis, let's go for it.

For the case of riseup there're at least 4 or 5 ports available per IP. Unsure about other future providers.

It might be also possible to augment the current semantics of Options in the client to build them on a per-measurement basis. A simple way into this could be to have the experiment re-write options for each measurement Run (according to the combinations that we actually see make sense from the API response).

@hellais: is there something particular for the explorer/analysis end of things on having a parameter indexed as an Option, or could an experiment-specific parameter field also do?

Copy link
Contributor Author

@ainghazal ainghazal Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW:

Is it perhaps not possible to promote these inputs keys to a top level input key and write inside of it a single entry?

I just noticed that this review was based on an outdated spec. I know we're discussing spec and not implementation here, but for the record what the current state of the experiment does is this:

  1. Query the new endpoint in /v2/ooniprobe/vpn-config/riseup
  2. Session/InputLoader fetches two main bits from the API:
    • A per-provider map of configuration options (including credentials), which is cached at the session level.
    • A list of known endpoints to test.
  3. Engine will run a SINGLE measurement PER INPUT, so with the current state of things in the integration branch there's ONE SINGLE MEASUREMENT per port and transport.

What we're discussing here is to aggregate different transports (and potentially, ports) into a single measurement.

Copy link
Contributor Author

@ainghazal ainghazal Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, I do see some advantage to sticking to a single measurement measuring a single "target", i.e., a given port/transport combination. 🤔

For one, it flattens the observation, and it might make way cheaper to identify cases where we get evidence that we've got a longitudinal failure that might be attributable to the provider, and not network interference.

It depends a lot on how the provider handles this, but it might be safe to assume processes listening on different ports are independent. For booleans it's not likely it makes a difference (blocked/not), but for other metrics the service properties are better isolated to the process level.

"failure": "",
"success": true
},
"handshake_start_time": "2024-03-07T21:52:09.351623009+01:00",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp here should be formatted as UTC timestamp and using Z to indicate it's in the UTC timezone.

For example:

2024-03-07T22:52:09.351623009Z

Copy link
Member

@hellais hellais Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also actually maybe a bit redundant to have also a handshake_start_time given that you are already recording this information, I assume, as part of the t value which is a monotonic clock that expresses the offset to the overall measurement_start_time.

Edit: I understand that the t0, t values are computed as offsets compared to handshake_start_time, but I do wonder if instead you could make them relative to measurement_start_time as that would make it more compact and easier to reason about within a particular testing session. ooni/data pipeline is capable of converting t and t0 values as absolute time, however it uses as reference the measurement_start_time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed as per suggestion.

},
"handshake_start_time": "2024-03-07T21:52:10.051371358+01:00",
"t0": 0.222041219,
"t": 0.938619976,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is a real measurement, but assuming it is, it looks to me like there might be some issue (or maybe some misunderstanding on my end) of the values of t0, t and bootstrap_time. I would expect bootstrap_time to be equal to t - t0, while that is not the case in this particular measurement entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, this was wrong. changed the semantics to t-t0 in the reviewed version.

@bassosimone bassosimone changed the title add openvpn experiment spec proposal (2024 version) feat: add openvpn experiment spec Jun 6, 2024
Copy link
Contributor

@bassosimone bassosimone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💥

@bassosimone bassosimone merged commit 5946b96 into ooni:master Jun 6, 2024
bassosimone added a commit to ooni/probe-cli that referenced this pull request Jun 6, 2024
# Description

First iteration of a new openvpn experiment. This takes a set of
endpoints and uses minivpn to attempt a handshake with each of the
configured endpoints.

A new API endpoint has been added to the backend that is able to
distribute valid configuration parameters for an OpenVPN connection,
including credentials.

# Checklist

- [x] I have read the [contribution
guidelines](https://github.com/ooni/probe-cli/blob/master/CONTRIBUTING.md)
- [x] reference issue for this pull request:
ooni/probe#2688
- [x] if you changed anything related to how experiments work and you
need to reflect these changes in the ooni/spec repository, please link
to the related ooni/spec pull request:
ooni/spec#293
- [x] if you changed code inside an experiment, make sure you bump its
version number

---------

Co-authored-by: Simone Basso <[email protected]>
@ainghazal ainghazal mentioned this pull request Jun 6, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants