Enhancement request for touch events in XCUITest driver #496

Open
krishtoautomate opened this issue Apr 26, 2021 · 43 comments

Comments

@krishtoautomate

krishtoautomate commented Apr 26, 2021

I have seen many videos showing that iOS devices support connecting a wired or wireless mouse.

Can WebDriverAgent simulate similar mouse events? That would allow smoother touch events compared to the current touch actions.

Reference: https://ios.gadgethacks.com/how-to/use-wireless-usb-mouse-your-iphone-ios-13-0198390/

@KazuCocoa
Member

I don't know about it. It might be possible if you could find private APIs to achieve it.
(Probably the preferences listed in the linked article can only be changed manually because of security limitations, as with Bluetooth.)

@krishtoautomate
Author

It is also possible with a USB connection.

https://ios.gadgethacks.com/how-to/use-wireless-usb-mouse-your-iphone-ios-13-0198390/#:~:text=The%20feature%20in%20iOS%2013,%22Devices%22%20under%20Pointer%20Devices.

So it would work if we found an API that responds to mouse clicks from a USB device.

@krishtoautomate
Author

Vysor already implemented it, but they used Bluetooth instead of a USB connection.

@krishtoautomate
Author

Reference : https://bluetooth-keyboard.com/support/

@nanoscopic

I investigated this previously and the mouse support via USB is relative only. It does not support absolute style coordinates such as with a digitizing tablet. As a result it is not useful for clicking positions absolutely, which is what Appium needs.
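
For anyone unsure about the relative versus absolute distinction, here is a rough Python sketch. It is only an illustration: the `Pointer` type, both helper functions, and the `gain` factor are hypothetical, and the gain is a stand-in for whatever pointer speed or acceleration the device applies, not a model of what iOS actually does.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    """On-screen pointer position as the device tracks it."""
    x: float
    y: float


def apply_absolute(pointer: Pointer, x: float, y: float) -> Pointer:
    # Absolute report (digitizer style): the report carries the target itself,
    # so the previous pointer position does not matter.
    return Pointer(x, y)


def apply_relative(pointer: Pointer, dx: float, dy: float, gain: float) -> Pointer:
    # Relative report (mouse style): only a delta is sent, and the device
    # scales it by a gain the sender cannot observe (hypothetical model).
    return Pointer(pointer.x + dx * gain, pointer.y + dy * gain)


target = (200.0, 400.0)
start = Pointer(50.0, 50.0)

# Absolute: lands on the target regardless of where the pointer started.
print(apply_absolute(start, *target))

# Relative: the sender computes a delta assuming gain 1.0, but the device
# applies 1.5, so the pointer overshoots the target.
dx, dy = target[0] - start.x, target[1] - start.y
print(apply_relative(start, dx, dy, gain=1.5))
```

With relative reports the sender also has to know the starting position, which it cannot read back from the device, so errors accumulate across moves.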

@krishtoautomate What does the AssistiveTouch example code you placed here do?

@krishtoautomate
Author

krishtoautomate commented Jul 30, 2021

Is it possible to connect via Bluetooth?
I'm not sure I understand absolute versus relative here. Can you please explain?

@nanoscopic

Others have told me they were able to make synthetic bluetooth devices easily. I haven't seen an open source example of it yet though. Also, I haven't researched it further as it is problematic to use wireless methods in a device farm. Many devices in close proximity tend to cause havoc with anything wireless.

If you take a look at the tweaked mouse interactions I have in my fork of WDA, you can see that I've simply routed calls more directly to event synthesis without all the conditional logic of current WDA. Doing this makes mouse calls take effect within less than a fifth of a second. This should be sufficient for most needs.

You can see the code for a coordinate based tap here: https://github.com/nanoscopic/WebDriverAgent/blob/master/WebDriverAgentLib/Routing/NNGServer.m#L221
That in turn calls this: https://github.com/nanoscopic/WebDriverAgent/blob/master/WebDriverAgentLib/Categories/XCUIDevice%2BCFHelpers.m#L18

For tapping on an element, one should be able to directly call the official XCUITest method [element click]. You can see the documentation from Apple here: https://developer.apple.com/documentation/xctest/xcuielement

WDA is doing all sorts of conditional logic to support older versions of the XCUITest API. This results in much slower actions. If you want speed, you should follow the current updated XCUITest API as closely as possible and avoid all the hacky workarounds of WDA.

In my opinion, WDA should be rewritten to be a very thin passthrough wrapper around official current XCUITest API as much as possible. Some actions require using the private API, which is unavoidable, but ones that don't should follow the standard.

Instead of building all the logic that translates "WebDriver" calls into the official API into WDA itself, the agent running on device should only be a thin wrapper to allow dynamic calls. Translation from WebDriver protocol to "WDA API" should be done off device as a separate translation layer.

@KazuCocoa
Member

one should be able to directly call the official XCUITest method [element click]

Is this https://developer.apple.com/documentation/xctest/xcuielement/1500316-click which is available since iOS 15+ and macOS?

closely as possible and avoid all the hacky workarounds of WDA.

Yes. We were able to remove some hacky code thanks to XCTest(UI)'s improvements. (I forget the concrete places, but some workarounds were still needed to handle elements/actions.) We hope we can eventually remove them so that WDA becomes a thinner wrapper directly around XCTest.

@nanoscopic

No, I mistyped. The equivalent to what WDA is doing is tap: https://developer.apple.com/documentation/xctest/xcuielement/1618666-tap
Available for iOS 9+

It isn't necessary to demand iOS 15+ to make some major improvements.

@KazuCocoa
Member

KazuCocoa commented Jul 31, 2021

Oh, I see. So, like

?
(just searched in this repository)

@nanoscopic

More to the point, the line above is incorrect according to Apple documentation.

if (SYSTEM_VERSION_GREATER_THAN_OR_EQUAL_TO(@"13.0")) {
[self tap];

The line above is limiting tap to only be used for iOS 13+, when the docs show it works for iOS 9.2+

This is a crucial code path in WDA, with no explanation of why the elementary method isn't being used except on newer iOS.

@KazuCocoa
Member

KazuCocoa commented Jul 31, 2021

mm, yea. The code blame pointed to 3df8bad, but the comment was removed and the removal was not in the blame history...

So, it has an issue in the Xcode 11 and iOS 12-and-below environment. If we only supported newer versions, we could simply rely on tap. The iOS 15 release will probably be a good time to bump the minimum supported versions in Appium and remove hacky code like this. (We should be careful with the removal, though.)

@nanoscopic

What is the point of supporting Xcode previous to current released non-beta version? Shouldn't this be changed now? It seems to me it should have been changed once it was known it works in Xcode 12 on iOS 9.2+, and Xcode 12 was out of beta.

How many other calls have hacky workarounds that aren't needed in current Xcode?

@KazuCocoa
Member

KazuCocoa commented Jul 31, 2021

So far, we have a policy to support two major Xcode versions (excluding betas).
We got reports that some combinations of an older Xcode with older iPhone/iPad simulators worked, while a newer Xcode with the same old iOS did not for some actions. (It depended on the app under test, so probably not in every environment.) So we have kept the two-major-version policy for now so that users can try different combinations to avoid issues that depend on Xcode/iOS versions.
(Afaik, this has not been the case for real iPhone devices so far.)

Potentially after the Appium 2.0 release (perhaps this year), we can change the policy more aggressively while gathering user feedback, since xcuitest-driver will be kind of independent from a central Appium server. It means users can install an older xcuitest-driver if they want without changing the Appium version itself.

I don't remember everything right now. Mykola may know more about them than I do, but we should take a look around when we drop older Xcode/iOS versions.

@KazuCocoa
Member

As a baseline, I agree with your opinion that we should drop older Xcode versions (support only the latest) and older iOS versions (ideally supporting the latest 2 major iOS versions). But I would also like to respect, as far as possible, users who support somewhat older iOS versions, since I know that end-users (I mean people who use the app under test) still run versions older than the latest 2 major versions (I have experienced this situation). I hope they also get tested apps on their devices.

@KazuCocoa
Member

KazuCocoa commented Jul 31, 2021

Btw, if you would like to build a specialized thin app for your use case with the WDA architecture (client outside iOS <--> a server like WDA on a device <--> XCTest API etc.) that is incompatible with the WebDriver protocol/Appium/WebDriverAgent, I would suggest building it from scratch. A few friends of mine did so for their use case, which did not follow the WebDriver protocol/Appium/WebDriverAgent. It restricted the running environment.

@nanoscopic

The problem I have is that, to my knowledge, only a single Xctest can be run on a device at a time. As a result, if I want to support device farm remote control and Appium support simultaneously, a single something ( or something supporting modules/plugins ) needs to be the one Xctest that is running.

Xctest functions must be called from the main thread of that Xctest also, so one cannot simply have different threads accepting different calls.

As a result, my suggested architecture for a cooperative future is that the core thing running Xctest functions just be a thin wrapper to the Xctest functions ( both private and public ), and it only listen via NNG ( inproc transport ). Separate threads ( modular ) should then be used to support both WebDriver support and whatever else anyone needs or desires.

Those separate threads can then use modern architecture, plugins, updated HTTP, etc etc.

I'm still unclear what supporting an older release of xcode besides the current non-beta release accomplishes. I do know that some companies just want to stick with a version they know works, but generally one is forced to upgrade every release to be able to support the latest devices anyway.

I agree supporting the latest 2 major versions of iOS is reasonable; and I was actually saying that I also think supporting 2 previous major versions is reasonable, due to many actual users being on older iOS versions. They should, of course, upgrade, but they aren't forced to and don't have things like corporate policy to ensure they do.

At this point I don't see a purpose in supporting devices prior to iPhone 6s, but I do still have users wanting to use iPhone 6.

As for simulator interactivity, at least some of the actions can be changed to use the new methods researched and created into idb. They interact directly with reverse engineered simulator internals instead of using the official xctest functions ( where possible at least ).

I am still somewhat concerned with the use of private API functions, within my code, within WDA, within almost everything that wishes to reasonably automate iOS devices. Nobody speaks about it, but it is obvious that reverse engineering Apple code is necessary to develop and use such functions reasonably. As a result that goes against the Apple terms of service. I'd rather follow Apple policy; for my own benefit; as I require a developer account.

I've been posting publicly for Apple to reach out to me for years now with no response. I've messaged multiple people at Apple with no response. I think some concerted effort should be made by the community to demand some real answers from Apple. Apple should cooperate with the community working to support their devices instead of ignoring us. The best path forward would be for those working on Apple automation within Apple to cooperate with us to build streamlined and efficient paths for building open source solutions for automating their devices and software. It is extremely painful to build all of these things with zero support from Apple.

From what I can tell, it looks like there is some code support for remotely calling Xctest functions, but I've never been able to figure it out. If such a thing exists, it would be great if Apple could share how it works so that it isn't necessary to build undistributable developer only things such as WDA.

@KazuCocoa
Member

I appreciate your explanation, thank you.
I see your device farm challenge.

only a single Xctest can be run on a device at a time

Do you mean the testmanagerd thing? Then, yes.

For example, when you launch two WDAs as below:

tidevice wdaproxy -B com.kazucocoa1.WebDriverAgentRunner.xctrunner -p 8100
tidevice wdaproxy -B com.kazucocoa2.WebDriverAgentRunner.xctrunner -p 8101

Then you can attach an Appium session to kazucocoa1. You can also issue some commands to kazucocoa2 via curl (for example). Commands via XCUIDevice.sharedDevice (e.g. tap a coordinate) can run while keeping the Appium session to kazucocoa1. But commands via testmanagerd run one by one on the device (and they run on the main thread). If "a single xctest" means this case, then yes. (I just toyed with the above case on my local device.)

@nanoscopic

Curious. I never actually tried running two xctests at once through the alternative means. I assumed somehow testmanagerd doesn't support it. Thank you for trying it out and verifying it works. That significantly simplifies my path for the immediate future, as I don't have to merge my use case and WDA just yet.

I don't use Xcode to start tests, as it is terrible. Generally the path forward will be to use go-ios, as I am cooperating currently with its author, Daniel Paulus. If possible I would like also to cooperate with Appium and avoid duplicate effort / share research / plans to move forward and advance iOS device automation. The method tidevice uses to start tests does not match the normal "official" method used by Xcode. It does generally work but as I've said it is best to follow whatever Apple/Xcode does where possible to retain compatibility with future changes...

@KazuCocoa
Member

https://github.com/danielpaulus/go-ios This? Yeah, I took a look at this project before. This was awesome!
(This conversation reminded me to write a page like "launch WDA and use webDriverAgentUrl caps" in appium.io and links like this go-ios, libimobiledevice etc)

it is best to follow whatever Apple/Xcode does where possible to retain compatibility with future changes...

Yes. I generally (strongly) agree with this point.
We're also very happy to share knowledge to move iOS automation forward for many people.

Sorry my post drifted away from the main touch events topic, but I really appreciate it, again.

@krishtoautomate
Author

https://github.com/ArthurYidi/Bluetooth-Keyboard-Emulator

Making it work might give more ideas for using this Bluetooth interaction functionality with the touch-assist APIs.

@inluxc

inluxc commented Aug 3, 2021

Hi there, I am having the same problem getting smooth swipe navigation on iOS.
WDA only has HTTP requests, but it would help if it had WebSocket, at least for the swipe part. It would run a bit smoother.
Example:
https://drive.google.com/file/d/178GeFYRZi5qYD6ubgdalPCglQvlSKzrS/view

P.S.: iOS supports mouse connections; can't we simulate a mouse and send the right commands to it?

@krishtoautomate
Author

Hi there, I am having the same problem getting smooth swipe navigation on iOS.
WDA only has HTTP requests, but it would help if it had WebSocket, at least for the swipe part. It would run a bit smoother.
Example:
https://drive.google.com/file/d/178GeFYRZi5qYD6ubgdalPCglQvlSKzrS/view

P.S.: iOS supports mouse connections; can't we simulate a mouse and send the right commands to it?

Yes, there is a delay with WDA actions.

@mykola-mokhnach

I don't quite understand how having WebSocket would help there. It's the same data being sent over TCP, which means the latency remains the same. Yes, HTTP protocol has some overhead, but it is, probably, comparable or close to web socket perf with keep-alive enabled if not too many requests (we are talking about hundreds of them or more) are done within a short time period.

You could read https://github.com/appium/appium/blob/master/docs/en/writing-running-appium/ios/actions.md if you'd like to know more about the APIs provided by XCTest or in case you'd like to contribute to the project

@nanoscopic

I agree with @mykola-mokhnach here. I did consider adding back WebSocket support into WDA myself, but I did not because the difference would be unnoticeable imo. I did also swap HTTP over to NNG completely in my fork of WDA. Doing that does make it very slightly more responsive I think ( I haven't actually measured it ), but the difference is no more than 100ms by my estimation.

That difference is from a fresh HTTP connection on every request to a persisted NNG connection.

If you do as directed, and pass the Keep-Alive header necessary due to the HTTP server in WDA being HTTP/1, you should get near the same level of improvement as I did using NNG. The overhead of processing HTTP protocol is likely less than 10ms. ( if that )
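
A minimal sketch of the keep-alive suggestion above, in Python with only the standard library. It keeps one TCP connection open and sends every request over it, passing the `Connection: keep-alive` header explicitly. The `PersistentClient` name is made up for illustration.

```python
import http.client
import json


class PersistentClient:
    """Reuses one TCP connection for many requests instead of reconnecting."""

    def __init__(self, host: str, port: int, timeout: float = 10.0):
        # No socket is opened here; http.client connects on the first request.
        self._conn = http.client.HTTPConnection(host, port, timeout=timeout)

    def get_json(self, path: str):
        # Ask explicitly for a persistent connection, for servers that only
        # keep the connection open when the header is present.
        self._conn.request("GET", path, headers={"Connection": "keep-alive"})
        response = self._conn.getresponse()
        # The body must be fully read before the connection can be reused.
        body = response.read()
        return response.status, json.loads(body)

    def close(self):
        self._conn.close()
```

Usage might look like `client = PersistentClient("127.0.0.1", 8100)` followed by repeated `client.get_json("/status")` calls. The host, port and path there are assumptions about a locally forwarded WDA, so adjust them to your setup.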

The example Google drive link is not public, so no one except those requesting access can see it...

It should be understood that, in the common example of "clicking an element" using WDA, WDA is doing the following:

  1. Make Xctest call to find the element desired
  2. Call tap via Xctest on that element

This is, at least, what happens on iOS 13+. Currently on lower versions it does something like this:

  1. Make Xctest call to find the element desired
  2. Potentially query the entire page structure ( 1-3 seconds )
  3. Calculate the position of element based on page structure
  4. Call synthesizeEvent via Xctest to click on it via absolute screen coordinates.

The 1-4 mechanism is quite slow, and is what I am discussing above...
The 1-2 mechanism ( just tap ) is only slightly slower than doing it through Xctest. The Xctest speed is slow, not due to the tap, but due to the finding of the element you want.

Clicking an absolute screen coordinate through synthesizeEvent in Xctest is fast. It is a bit slower ( 100ms added ) in WDA because the code is checking a bunch of things related to session. My modified fork of WDA has those checks removed for absolute screen clicks to do away with this extra and unnecessary 100ms. I have considered contributing this change, but I did not because clicking absolute screen coordinates is not something WDA / Appium are generally meant to do. I will be creating my own new project as discussed above...

@nanoscopic

@inluxc When simulating a swipe action through WDA, you should use touchPerform actions. Those directly translate to synthetic events via synthesizeEvent, which are themselves fast. The only delay in WDA is due to requiring a session to make the request, and is somewhere between 50ms and 100ms.

I do feel that WDA should be updated to not require a session for touchPerform to avoid this, but it is not my call to make.

@krishtoautomate
Author

krishtoautomate commented Aug 3, 2021

@nanoscopic Is it possible to perform any action without a session? Did you make it happen in your private WDA?

If there is no session, how will you handle multiple devices in parallel?

@nanoscopic

Yes you can perform actions without a session. All WDA does to make a session is launch an app and keep track of what that app is. It uses that XCUIApplication handle of the launched app to fetch XCUIElement elements and then perform actions on them.

The synthesizeEvent calls that are used are made on XCUIDevice.sharedDevice, so they don't actually need a session at all. This means you can perform clicks and swipes without a session.

It does require modification to WDA to allow it though. I did all that previously in my fork of WDA. It makes the actions much faster to avoid the many layers within WDA. ( such as all the coordinate translation chaos; it is really not that hard to rotate some coordinates )
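
To illustrate the coordinate rotation mentioned above, a small Python sketch follows. It maps a point given in the rotated interface's coordinate space back to portrait screen coordinates. The function name and the convention (rotation measured clockwise from portrait, origin top-left, y pointing down) are assumptions made for this example; how WDA or XCTest names and reports orientations is not covered here.

```python
def to_portrait(x: float, y: float, width: float, height: float, rotation: int):
    """Map (x, y) from a rotated interface space to portrait coordinates.

    width and height are the portrait screen dimensions. rotation is how far
    the interface is rotated clockwise from portrait, in degrees (assumed
    convention). For 90 and 270 the interface space is height wide and
    width tall.
    """
    if rotation == 0:
        return (x, y)
    if rotation == 90:
        # Interface origin sits at the portrait top-right corner.
        return (width - y, x)
    if rotation == 180:
        # Interface origin sits at the portrait bottom-right corner.
        return (width - x, height - y)
    if rotation == 270:
        # Interface origin sits at the portrait bottom-left corner.
        return (y, height - x)
    raise ValueError(f"unsupported rotation: {rotation}")


# Example with a 375 x 667 portrait screen: the interface's top-left corner
# at each rotation, expressed in portrait coordinates.
for angle in (0, 90, 180, 270):
    print(angle, to_portrait(0, 0, 375, 667, angle))
```

A lookup of four cases like this is all the rotation handling a coordinate tap needs, provided the convention matches what the device reports.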

I have, though, created an entirely new thing that does what WDA does only better and faster. I have not yet released the code to the public though. It is currently closed sourced and integrated with the ControlFloor product I am producing and selling.

I am considering releasing the rewritten CFA as AGPL, and perhaps writing an Appium driver for it, but I am not yet sure that is the best action for sustaining future viability of ControlFloor. I'm not yet sure I trust the companies who ultimately "control" Appium to do what is right for the community.

I remain happy to explain things and share information whatever the case.

As for this ticket, my general understanding is that the main players in the industry are all using virtual bluetooth devices by way of bluez for actions that cannot be readily performed through Apple API. ( for both mouse and keyboard ) There are various cheap bluetooth dongles that could be used as well and I am working on making an easy path to use those...

Essentially the best you can do for mouse is to call synthesizeEvent as directly as possible reducing any extra overhead. It is pretty fast if you do that. The only problem is that you cannot perform "partial actions", such as letting the user draw on the screen with the mouse providing constant movement updates while they continue to drag holding the mouse button down. You can only record a sequence of actions and play them all back once the user releases the mouse button.

This is a serious deficiency in the XCUITest API, and after looking extensively ( even at the private API ) I see no way around it via XCTest.
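
The record-then-replay limitation can be sketched in Python like this: samples are buffered while the button is held, and only on release is a single pointer sequence built in the W3C WebDriver actions format. `GestureRecorder` and its method names are hypothetical, and whether a particular WDA build accepts this exact payload is an assumption to verify.

```python
import json


class GestureRecorder:
    """Buffers pointer samples during a drag and emits one action sequence."""

    def __init__(self, pointer_id: str = "finger1"):
        self.pointer_id = pointer_id
        self._samples = []  # list of (timestamp_ms, x, y)

    def press(self, t_ms: int, x: int, y: int) -> None:
        # Start a new gesture; nothing is sent to the device yet.
        self._samples = [(t_ms, x, y)]

    def move(self, t_ms: int, x: int, y: int) -> None:
        # Moves are only recorded while a press is active.
        if self._samples:
            self._samples.append((t_ms, x, y))

    def release(self) -> dict:
        # Only now can the whole gesture be turned into one request body.
        if not self._samples:
            raise ValueError("release() called without a matching press()")
        t0, x0, y0 = self._samples[0]
        steps = [
            {"type": "pointerMove", "duration": 0, "x": x0, "y": y0},
            {"type": "pointerDown", "button": 0},
        ]
        prev_t = t0
        for t, x, y in self._samples[1:]:
            # Each move lasts as long as the gap between recorded samples.
            steps.append({"type": "pointerMove", "duration": t - prev_t, "x": x, "y": y})
            prev_t = t
        steps.append({"type": "pointerUp", "button": 0})
        self._samples = []
        return {
            "actions": [
                {
                    "type": "pointer",
                    "id": self.pointer_id,
                    "parameters": {"pointerType": "touch"},
                    "actions": steps,
                }
            ]
        }


recorder = GestureRecorder()
recorder.press(0, 100, 500)
recorder.move(16, 100, 450)
recorder.move(32, 100, 380)
print(json.dumps(recorder.release(), indent=2))
```

The user sees nothing happen on the device until `release()` returns and the payload is posted, which is exactly the drawback for live drawing.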

TLDR: What WDA is doing currently is over-complicated and adds a ton of overhead. I can and did make it much better in my WDA fork. In the end I abandoned WDA and wrote a new similar thing to avoid the chaos/mess within WDA.

Possible paths forward for this ticket:

  1. Make a cleaner / more simplified call within WDA that matches more similarly what I'm doing in my fork. You can't just copy/paste my fork code since the license of my code doesn't match. ( it is in a new file... ) You can still look at it and write new code into WDA that works in a similar way... I won't be doing it though myself since I've moved on from WDA.

  2. Just close this ticket and accept that WDA is suboptimal. Not a great solution but likely?

  3. Convince me to stop delaying releasing CFA, and then abandon WDA entirely and cooperate on building an Appium driver for CFA. I like this idea, but unsure how much the community wishes to embrace this. It would make it possible to stop having to maintain the high complexity of WDA.

  4. Some mixture of above?

  5. Hope for some magical solutions beyond what I understand. I've spent years now looking at this stuff so I don't think there are any solutions beyond what I've described in various places and have worked hard on... but hey. I'm human and could very well be wrong / missing something / unaware of some better method that could be used.

@mykola-mokhnach

@nanoscopic Thank you for the detailed investigation. We are always open to new contributions as long as they are made under the Apache 2 license and don't break backward compatibility. I would be happy to help you with code reviews and/or bugfixes, but as you mentioned above, I'm only one person and I don't get paid anything for this work apart from two USD per month in GitHub donations.
Yes, the burden of backward compatibility prevents us from simplifying many things, but this is something we cannot drop easily. Too many people still depend on this functionality.

@nanoscopic

@mykola-mokhnach It is a rather sad state of affairs that the main companies using/profiting off of Appium ( specifically device farm companies ) are not properly monetarily supporting WDA.

WDA is currently the only workable path for users automating iOS through Appium.

My interest in the WDA project at this point is only to share knowledge as I am able and to lead the community to a better solution.

I have zero care for supporting anyone using older Xcode. There is absolutely zero reason to do so in my mind. Even to those who don't have access to the latest Xcode, a prebuilt WDA could be provided to them and they could resign it using open source tooling ( even on Linux ).

Supporting older iOS versions and devices is of course important, but that can be done without issue using latest Xcode.

The cruft in WDA has to be abandoned in my opinion if there is to be any realistic path forward as Apple continues to make changes.

My plan currently to improve the situation:

  1. Write a full replacement for WDA. I've done this, but not yet released it as open source.
  2. Release that replacement under an acceptable license. I'm going to use AGPL, as I want to ensure that companies contribute back their changes instead of doing the uncooperative nonsense they've been doing so far, which is to fix issues and not share them back to the community.
  3. Write an Appium driver to it. I believe currently Appium is the leader for scripting solutions for iOS testing, so I think this is necessary if I realistically want to improve the situation the community is facing.

The problem is that there are multiple things needed to make a full "Appium driver". Xctest is insufficient. In my opinion the following are all needed:

  1. Xctest
  2. usbmuxd calls ( eg: such as what go-ios allows )
  3. Other video solutions ( such as a broadcast app )
  4. virtual bluetooth devices

I have written software that combines all of these ( 1-3 so far; working on 4 ). The question holding me back right now is how to continue to do it in a way where I can make a living off of it.

You've rightly pointed out that you don't get paid any meaningful amount for what you contribute. That is a problem. I don't want to end up in that same situation, so I can only release portions of my work that don't harm the business model that is currently keeping me employed working on this stuff.

I currently cooperate with LambdaTest.
I'd like to see Headspin and Sauce Labs both meaningfully contribute information and money to make the necessary pieces fully open to the community once and for all.

I call on Headspin and Sauce Labs to not hoard any methodology they have for automating iOS devices ( which they have both been doing for years ). Should these companies quit hiding the methods they are using and cooperate openly, then I would be willing to put more of my code into a usable license so all of us can progress together.

So long as Headspin and Sauce Labs continue to refuse to share, I don't see how the community can ever be healthy.

@krishtoautomate
Author

@nanoscopic thanks for bringing this to attention.

I feel the same about WDA. There is room for improvement, as calling Apple's XCUITest functions directly is faster than going through the WDA APIs. There might be unnecessary actions or processing in the code that need to be removed to increase speed.
Also, why not convert it to pure Swift instead of Objective-C?

@jlipps
Member

jlipps commented Feb 8, 2022

@nanoscopic thanks for thinking about the state/future of ios automation and for wanting to work with appium in that. one question, based on what you said:

contribute back their changes instead of doing the uncooperative nonsense they've been doing so far, which is to fix issues and not share them back to the community.

I'm trying to understand; are you saying that there are companies out there that have forks of WDA with fixes they haven't upstreamed?

The reason I ask is that, at least on the surface of things, HeadSpin and Sauce Labs (the companies I'm most familiar with) have collectively contributed the equivalent of million(s) of dollars of development time and research to Appium, including iOS automation capabilities, not to mention funds paid to the open source foundation which supports Appium, marketing costs for Appium, etc... (On that note, I haven't noticed any contributions or involvement from LambdaTest, though I'm happy to be pointed to some, as people don't always share which companies they're contributing on behalf of...)

Or are you simply saying that some companies (like HeadSpin or Sauce), have done R&D related to iOS separately from WDA, which you believe could fruitfully be made a part of WDA, but these companies have chosen not to open source that technology? If that's the case, I can think of a variety of reasons:

  1. They are intentionally "hoarding" or "hiding" the technology.
  2. The people doing the iOS R&D don't know anything about Appium or what would be a useful contribution to Appium (and vice versa w.r.t the Appium developers).
  3. The technologies they've developed form what they believe is a core business advantage or differentiator for them.

And there's probably lots more options. It seems premature to characterize their actions as "hoarding". Every company (like yours, as it sounds like) has to decide for itself the line between proprietary software and software they contribute for free to the community. Most companies don't contribute anything at all to the community. Companies like HeadSpin and Sauce Labs contribute massive value to the Appium community. It may be that open-source-minded developers like you or me would prefer that these companies take more of their trade secrets and turn them into community contributions, but at the end of the day, that's a preference, and the best way to turn it into reality is via a persuasive argument.

As far as I can tell, the most persuasive argument goes something like this, assuming your premise about the technologies is correct:

  1. Device clouds rely heavily on Appium for their business
  2. Appium has a problem--WDA sucks and needs to be replaced
  3. Device clouds have the knowledge to replace WDA with something better, but this is currently tied up in proprietary code
  4. Device clouds can alleviate the suffering of their users by coming together, sharing proprietary code, and investing in a better iOS driver based on these formerly-proprietary techniques
  5. The overall benefit to device clouds of making Appium better in this way outweighs the cost of the lost trade secrets (it's much easier for DIY iOS clouds to exist, or competitors to emerge).

I think it's a good argument, but then again I have no way to satisfactorily evaluate #5 for any of these companies. It's a CTO-level decision, for sure. But the good (or bad, I guess) news is that, as soon as any one actor in the community opens up a secret, it becomes pointless for anyone else to maintain their own secret, if it's equivalent in behaviour. So all we need is for someone to come along who wants to contribute to Appium and for whom these magical iOS automation techniques are not core business differentiators. Will that ever happen? Who knows

In terms of the health of the Appium community, what I can say is that our biggest need in general is probably not the relinquishment of trade secrets, but the growth of our developer base. Having contributed to Appium on behalf of both HeadSpin and Sauce Labs, I can say that my contributions at least have never been dictated by what is or isn't a trade secret for those companies, but rather bounded by my available time, skill, and knowledge. I'm not personally competent at iOS reverse engineering, so I'm not going to be able to contribute to Appium in that way, but if I were, I would. It would be great if Appium had tons more developers, some of whom had these skills. If that were the case, we probably would have replaced WDA long ago. But operating with essentially a skeleton crew of non-experts with too many responsibilities, we are stuck with what we have. From this perspective, it's only HeadSpin and Sauce Labs that are keeping the Appium community remotely healthy, since these are the only companies that employ more-or-less full-time/part-time Appium devs. It would be great if all the other companies out there that were built on Appium would likewise contribute developers. In that new world, we'd have lots more time and capacity to do better iOS R&D, be proactive about staying up to date with iOS headers, etc...

@nanoscopic

A selection of the facts from my point of view:

  1. Others and myself developed iOS support for STF as open source and we repeatedly did everything in our power to contribute that support to the STF project. Headspin, as controllers of the project, continually stonewalled and blocked this effort. This is a matter of public record and can be seen in my interactions in the STF project history.

  2. Headspin essentially abandoned the STF project and never did proper maintenance of it. This is clear in the lack of support for anything past Node 8.

  3. I spent a somewhat significant effort maintaining STF in that I moved the whole project to Node 12. As a result I am very intimate with the effort required to do so. It should have been done by Headspin. Headspin in no way supported my effort to do so. I never received a single dollar from Headspin for any of my work writing, as you call it, “million of dollars worth” of open source code.

  4. Headspin is one of several companies who reverse engineered the usbmuxd protocol and calls and then did not contribute any of the information to open source.

  5. Those who did contribute the necessary information to open source for usbmuxd ( specifically how to start xctests ) are exactly 3 parties:
    A. Myself
    B. Daniel Paulus
    C. Alibaba

  6. Headspin ticked off the main supporters of the STF project, which led to them forking away from Headspin control to create DeviceFarmer. This was partly a result of the above points and their clear distaste for contributing anything to open source where they can avoid doing so.

  7. I approached Headspin leaders ( CEO and CTO ) after this fiasco, and had a meeting with them. In that meeting my request was simply to cooperate with them and resolve any tension. Their response was that they think my company and work are meaningless and they don’t care and have zero interest in cooperating. They then also proceeded to threaten me claiming I may be infringing their patents. I rightly told them to go fuck themselves in response, as their response is not the proper attitude of those embracing the open source arena.

  8. I continue to try to make peace with Headspin and get them to cooperate with the community; to no avail. Your response is a clear demonstration of an unwillingness to actually face the truth and admit that Headspin has zero interest in supporting the community in any way that doesn’t directly make them money. It is about more than money by the way.

@nanoscopic

nanoscopic commented Feb 9, 2022

@jlipps Despite the list and my attitude, soured by the events of the past, I more or less agree with what you've said here.

Each company who does any work to make things work that are not obvious generally will not release that information unless there is some benefit to them by doing so.

This is why I am especially skeptical of companies releasing work to the public, as there is usually a selfish motivation to doing so, if only to have the community do testing and find bugs. Companies think "Hey we released ~ something ~ to the public, so we are validated in having done the right thing." They approach open source as a method to buy validation from the community rather than as wanting to actually be truly open and cooperative.

You've put my argument for why we should cooperate on iOS development work pretty accurately. It is that the majority of the once hidden things have now become publicly known information. There is no longer a significant market advantage to keeping those things from being open source.

I am also approaching the entire thing from a different perspective than Appium. I'm looking at it purely from the standpoint of the market for and community around device farms. Testing is obviously the main use case for having a farm, but I am not looking at or thinking about the contributions to Appium itself but rather at the underlying technology that makes it possible.

A few of the key technological pieces:

  1. How to stream video
  2. How to automate touch / tap / swipe
  3. How to automate keyboard / text entry

There are a variety of mechanisms for doing all of these things, and I've spent years trying them all. My talk for the Appium conference was focused on this as well because I want all the methods to be publicly known and cooperated on.

These things are of critical importance to me because I don't think hiding how these things are done is of any benefit any more to the companies involved. It would be better if all of the companies who have an interest share this information freely so that we can work together to make all of them solid and work better for all offerings.

We can certainly continue down the path where only the companies who have invested the effort in making it work well get to benefit, but I don't see why that should be. It doesn't need to be this way. There is a lot more to the business of device farms than just the core tech. The core tech is certainly difficult to make work well, but I've already put much of the methodology into open source. I would like to continue and contribute all of it, and then fairly have the various companies contribute anything they might know about those bits as well.

If none of the device farm companies will contribute and cooperate, that puts me in the sad position of having to keep everything closed source as much as possible. I don't want that dark future. I want a happy future where we cooperate instead of arguing about what does and does not constitute a market advantage... or worse... how we should intentionally leave other companies in the dark to prevent them from competing with us.

@ZhouYixun

ZhouYixun commented May 6, 2023

@nanoscopic Hello David, I really sympathize with your experience here. However, I have made some progress on Bluetooth mice recently. Would you be willing to collaborate on open source?

Unfortunately, currently only Linux systems are supported, and some Mac systems such as Monterey cannot connect to Bluetooth

@nanoscopic

@ZhouYixun

The situation remains complex.

My last involvement with the WebDriverAgent project was when I stripped it down to remove the buggy caching and created https://github.com/nanoscopic/ControlFloorAgent

After I did that I entirely "rewrote" ( as in wrote a new thing that is similar ) WDA to create a CFA that is free from any original WDA IP. I did this because the complexity of WDA is entirely unneeded and I also wanted to own all of the IP related to my new project, ControlFloor. This new CFA project remains closed source. The question I am facing currently is what if any portions of ControlFloor to release publicly.

ControlFloor has not yet been sold to the public, and up till now I have only partnered with a few companies:

  1. LambdaTest ( they were running a version of the closed source CFA, but decided internally to cut me out by having their engineers look at my closed source and rewrite it internally their own way to avoid paying me licensing )
  2. Snapchat ( the main client and licensee of ControlFloor over its life )

The problem currently is that despite clear agreement to license and pay for support, Snapchat is now claiming to own some "interest in" ControlFloor and its various pieces. It is utter nonsense, but I suspect they intend to sue me when I begin to sell CF to the public.

First, the relevant answer to your question, I will not be contributing to the WDA project itself any longer or ever, because I've already developed a much better replacement for it, CFA ( yet unreleased to the public ). I do want to release it to the public, but I am concerned that my main competitors will simply steal ( by reading and replicating the functionality of the code ) certain features in it that are great ( automation of the on-screen keyboard ).

I would like to release CFA to the public freely, but there are multiple concerns about doing so now:

  1. People just lifting the keyboard automation tech out of it. To address this I would have to either decide to freely give away this development which cost a fair amount to my company in R & D, or I would need to remove it from the free project.

  2. Snapchat claiming to own "interest in" CFA. As stated, it is nonsense and they would lose in a court case, but I am still concerned they will sue me and force me to go to the US and spend money on a lawyer to defend my company.

  3. CFA generally being "half" of the valuable IP of the CF project overall, which I would like to sell in order to make a living / pay my bills. My company has spent 3 years now developing CF, and if I release it all as open source I'm concerned no one will purchase support and I will not be able to pay my bills.

Despite all of these concerns, I still think the best thing to do is to release CFA as open source and just see what happens.

I think this is necessary because of the following:

  1. WebDriverAgent, in my view, is ancient and just needs to be replaced. It has too many hacky exceptions and has become very difficult to maintain and move to the latest private Xctest API released by Apple.

  2. Due to the hackiness of WDA, it has various bugs. These bugs have been worked around, but the workarounds make WDA MUCH slower than the non-hacky CFA I have created.

  3. I just generally want to support the community and provide the "newer better WDA like thing" to everyone freely. Regardless of whether this causes me to make less money on the project, I think it is the right thing to do.

  4. @jlipps seemed to express to me when we chatted that Appium may have some interest in utilizing CFA for Appium if I do release CFA under an actual proper "open source" license. ( as defined by OSI )

I am not opposed to any collaboration on whatever is already open source. The issue is that there is no currently usable open source project that I am involved with that would be the proper place to put "bluetooth mouse control". It wouldn't make sense to place it within WDA, as WDA runs on the device itself, and the bluetooth emulation needs to run on some external bit of hardware, or on a host machine itself. You seem to have written it for Linux generally, but my question is for what set of bluetooth hardware? Have you released the source code for what you have done somewhere? I would like to see the code for what you have done so far.

I did write "stf-ios-support", but ever since STF changed their API the integration of iOS support that stf-ios-support provided got broken. As a result the project is dead for the moment unless someone updates it to work with the latest STF changes. I don't personally have any interest in doing that, as I already created a better replacement for STF entirely, which is the CF server code ( which was open source, but became closed source a year ago )

The last readable source version of CF can be seen here: https://github.com/nanoscopic/ControlFloor
The problem is that this version is buggy, and relatively unstable. After closed sourcing the project, many changes have been made to make it reliable. There are probably around 50 different bugs that have been fully corrected in the closed source version of CF compared to the open source version. I am not opposed to people continuing to develop the readable source version, but it seems pointless as I have a corrected version already.

That is, the "bluetooth mouse support" you are suggesting to collaborate on would best go into the CF project, but it is not currently open source ( or readable source ).

I am, though, currently considering releasing CF as readable source ( under my anti-corruption license once more ). Doing this would cause a lot of people to begin using CF, and it would be the optimal place for you to contribute bluetooth support if I decide to actually release it as such.

Hopefully this clarifies my current position to anyone who is curious about what is going on right now.

@ZhouYixun

@nanoscopic

Thank you for your reply, David. I have learned about your excellent project and your current difficulties. In my opinion, you may need to handle the risk of public disclosure, otherwise any other interested friends will be involved in it in the future.

At present, my project has not been released to the public because there are still many difficulties that have not been overcome. My idea is to simulate a Bluetooth device connecting to iOS on Linux/Windows/Mac, and then touch the iOS device by sending the Bluetooth mouse protocol.
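
A rough sketch of the idea, not code from the unreleased project: it assumes the emulated device sends the standard 3-byte HID boot-protocol mouse report (buttons, dx, dy), which carries relative movement only. Reaching an absolute screen position therefore means splitting the move into clamped relative steps from a known pointer position. The function names are hypothetical, the Bluetooth transport is left out, and iOS pointer acceleration is ignored, so real devices would likely need calibration.

```python
import struct


def mouse_report(buttons: int, dx: int, dy: int) -> bytes:
    """Pack one 3-byte boot-protocol mouse report: button bits, dx, dy."""
    if not (-127 <= dx <= 127 and -127 <= dy <= 127):
        raise ValueError("relative deltas must fit in a signed byte")
    # Low three bits are left/right/middle buttons; deltas are signed bytes.
    return struct.pack("<Bbb", buttons & 0x07, dx, dy)


def relative_steps(current, target, max_step=127):
    """Split an absolute move into (dx, dy) steps of at most max_step each."""
    cx, cy = current
    tx, ty = target
    steps = []
    while (cx, cy) != (tx, ty):
        dx = max(-max_step, min(max_step, tx - cx))
        dy = max(-max_step, min(max_step, ty - cy))
        steps.append((dx, dy))
        cx += dx
        cy += dy
    return steps


def tap_reports(current, target):
    """Reports that move the pointer to target, then press and release left."""
    reports = [mouse_report(0, dx, dy) for dx, dy in relative_steps(current, target)]
    reports.append(mouse_report(0x01, 0, 0))  # left button down
    reports.append(mouse_report(0x00, 0, 0))  # all buttons released
    return reports
```

Something like `tap_reports((0, 0), (300, 200))` would then yield the byte strings to hand to whatever Bluetooth HID transport is in use; the starting position has to be tracked by the caller, since the protocol itself never reports where the pointer is.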

@jlipps
Member

jlipps commented May 8, 2023

@nanoscopic I'm sorry to hear about the continued legal/ethical troubles that seem to plague CFA. Sounds like a raw deal! Hopefully it will all be sorted out to your advantage or at least satisfaction. I'll just confirm that, yes, if at some point you release CFA under a permissive open source license, it would probably make a lot of sense for Appium to migrate to it, or at least (/initially) provide a separate driver for it, since it sounds like it is superior to WDA in many respects.

@ZhouYixun

> @nanoscopic I'm sorry to hear about the continued legal/ethical troubles that seem to plague CFA. Sounds like a raw deal! Hopefully it will all be sorted out to your advantage or at least satisfaction. I'll just confirm that, yes, if at some point you release CFA under a permissive open source license, it would probably make a lot of sense for Appium to migrate to it, or at least (/initially) provide a separate driver for it, since it sounds like it is superior to WDA in many respects.

This is also a concern for many open-source workers, and it may not be an easy task :(

@nanoscopic

@jlipps I'm leaning heavily towards releasing CFA under my own "Anti-Corruption" license. I realize that is suboptimal and adoption would be much higher if it was released under something such as MIT, but I am simply not willing to hand the technology over to the megacorps to improve and never contribute back to.

The Anti-Corruption license should still be usable by most companies, as it is only restrictive in the sense of blocking direct competitors and megacorps. The reason it is still problematic is that many companies have a policy of whitelisting acceptable licenses, and many will likely balk at seeing a custom license.

As for ControlFloor itself my plan is to release the server portion of it under the MIT license, so that development of an Android provider for it can be a community project and replace STF. I don't think building an Android provider for it using scrcpy will be difficult; it just has not been a focus for me or my company to do yet.

For the moment I will keep the iOS provider of ControlFloor closed source, and wait and see what the results are with CFA. My aim here is to contribute as much as I am willing to be usable by the community and cooperate while still withholding some of the components for the moment to encourage people to actually pay for CF.

@ZhouYixun For the moment negotiation with Snap Inc continues. I am not terribly optimistic about them being reasonable or fair, but I am giving them the chance / benefit of the doubt to be reasonable before I engage in legal action against them. I've already reported them to the California Civil Rights Department, but I am unsure if they will take up the case or not. Intake for them takes some months, so I won't know if they believe it is worth pursuing for them or not for a while now.

There isn't any ethical question of ownership of everything Dry Ark LLC has created. I was told repeatedly in meetings over the entire course of working with Snap Inc that I / Dry Ark own all rights to the software. The issue is that they refused to note this in writing and are now lying, saying they never said such. If they direct their employees to continue to lie under oath in a court case, I then have only limited evidence. There is evidence of the agreement, but not to the degree I'd like. The problem also is that I will need to spend a not-insignificant amount to sue them myself if the CCRD doesn't consider it worth pursuing from their point of view.

There does remain an ethical question of whether WDA or CFA themselves are acceptable. I attempted to engage Deutsche Telekom as a client, but they refused on learning that reverse engineering was involved in the development process. I have reached out to Apple on this issue through many avenues but have yet to be able to get any official response or even discussion going with them. I will continue to pursue this though, as I wish to resolve this concern so that I, my company, and any users of CFA don't have to be concerned that Apple could be upset with anyone utilizing WDA or CFA. It seems it should be safe, as WDA and CFA both required the same amount of reverse engineering, and Apple never took action against WDA that I know of. It is just a concern I would like Apple to speak out on so that I don't continue to lose clients over it.

@jlipps
Member

jlipps commented May 9, 2023

@nanoscopic do you have the text of this 'anti-corruption' license up anywhere I could look at?

@titusfortner

Software Freedom Conservancy published a brilliant and contemplative evaluation of ethical software licensing last year - https://sfconservancy.org/blog/2022/mar/17/copyleft-ethical-source-putin-ukraine/ I don't think Selenium/Appium would be able to have a dependency on it as it would be a more restrictive license.

@nanoscopic I can think of a few ways that Selenium and/or Software Freedom Conservancy might be able to help, or maybe OpenJS Foundation. @shs96c might be able to allay Apple concerns or at least know who to talk to? Message me (Appium Slack, Selenium Slack, Twitter, LinkedIn, I'm not hard to find) if you want to discuss other options than just open sourcing CFA yourself.

@KazuCocoa KazuCocoa changed the title Enhancement request for touch events Enhancement request for touch events in XCUITest driver Aug 22, 2023