Mac os integration #3539

Closed


Jbrocket
Contributor

Proposed changes

Please describe your changes (e.g., what problems they attempt to solve, what results are expected, etc.) Additional motivation and context are welcome.
Please also mention relevant issues and pull requests as appropriate.

Post-change actions

Put an 'x' in the boxes that describe post-change actions that you have done.
The more 'x' ticked, the faster your changes are accepted by maintainers.

  • make test: Run local tests prior to pushing.
  • make format: Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python).
  • make lint: Run lint on source code prior to pushing.
  • Manual Update: Did you update the manual to reflect your changes, if appropriate? This action should be done after your changes are approved but not merged.
  • Type Labels: Select GitHub labels for the type of this change: bug, enhancement, etc.
  • Product Labels: Select GitHub labels for the product affected: TaskVine, Makeflow, etc.
  • PR RTM: Mark your PR as ready to merge.

Additional comments

This section is dedicated to changes that are ambitious or complex and require substantial discussions. Feel free to start the ball rolling.

@dthain
Member

dthain commented Oct 12, 2023

Suggest that you base the Mac build on the actions in the build-conda workflow, so that you get all the necessary python dependencies. There may be other problems as well!

@Jbrocket
Contributor Author

Failing tests and the reasons I'm seeing locally (tests that fail only remotely are noted as well):

--- Testing makeflow/test/TR_makeflow_001_dirs_01.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_02.sh ... success 1s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_03.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_04.sh ... success 1s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_05.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_06.sh ... success 1s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_07.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_08.sh ... success 1s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_09.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_10.sh ... success 1s ## Fails Remotely
--- Testing makeflow/test/TR_makeflow_001_dirs_11.sh ... success 2s ## Fails Remotely
--- Testing makeflow/test/TR_starch_example.sh ... failure 1s ## Success Remotely, local failure due to missing binary

--- Testing work_queue/test/TR_work_queue_python.sh ... failure 17s ## NoneType error at "cctools/work_queue/test/wq_test.py", line 96

--- Testing taskvine/test/TR_vine_allocations.sh ... failure 31s ## vine_manager[53797] notice: rejecting worker (127.0.0.1:50479) as it uses protocol 4. The manager is using protocol 3.

--- Testing taskvine/test/TR_vine_python.sh ... failure 22s ## Task "completes" (exits cleanly) but remains running in the background. vine_manager[53999] notice: rejecting worker (127.0.0.1:50489) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:1024
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-54008
Worker completed.
mymac@mymacair3558 test %
Task did not complete in expected time.

--- Testing taskvine/test/TR_vine_python_futures.sh ... failure 6s ## Exception: Could not create manager on port 9123

--- Testing taskvine/test/TR_vine_python_no_serialization.sh ... failure 11s ## vine_manager[54629] notice: rejecting worker (127.0.0.1:50615) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:9124
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-54635
Worker completed.

--- Testing taskvine/test/TR_vine_python_serverless.sh ... failure 23s ## vine_manager[54745] notice: rejecting worker (127.0.0.1:50617) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:1024
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-54750
Worker completed.

--- Testing taskvine/test/TR_vine_python_tag.sh ... failure 22s ## Master is ready on port 1031
Running worker.
vine_worker: creating workspace /var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T//worker-501-56365
vine_worker: cleaning up cache directory /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56365/cache
vine_worker: using 1 cores, 250 MB memory, 250 MB disk, 0 gpus
connected to manager localhost:1031 via local address 127.0.0.1:50910
2023/11/14 13:17:19.59 vine_manager[56359] notice: rejecting worker (127.0.0.1:50910) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:1031
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56365
Worker completed

--- Testing taskvine/test/TR_vine_python_task.sh ... failure 11s ## listening on port 9126
submitting tasks...
Waiting for manager to be ready.
Master is ready on port 9126
Running worker.
vine_worker: creating workspace /var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T//worker-501-56476
vine_worker: cleaning up cache directory /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56476/cache
vine_worker: using 1 cores, 250 MB memory, 250 MB disk, 0 gpus
connected to manager localhost:9126 via local address 127.0.0.1:50912
2023/11/14 13:17:56.32 vine_manager[56458] notice: rejecting worker (127.0.0.1:50912) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:9126
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56476
Worker completed.
(cctools-build) jon

--- Testing taskvine/test/TR_vine_python_temp_files.sh ... failure 12s ## listening on port 9127
Waiting for manager to be ready.
Master is ready on port 9127
Running worker.
vine_worker: creating workspace /var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T//worker-501-56588
vine_worker: cleaning up cache directory /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56588/cache
vine_worker: using 1 cores, 250 MB memory, 250 MB disk, 0 gpus
connected to manager localhost:9127 via local address 127.0.0.1:50924
2023/11/14 13:19:03.88 vine_manager[56583] notice: rejecting worker (127.0.0.1:50924) as it uses protocol 4. The manager is using protocol 3.
disconnected from manager localhost:9127
vine_worker: deleting workspace /private/var/folders/m2/qh4pkmwn18l34x82556l_1t40000gn/T/worker-501-56588
Worker completed.
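
The `Could not create manager on port 9123` failure above may simply mean that the port is still held by another process, for example a manager left running from an earlier test. That is an assumption, not something the log confirms. A minimal Python sketch for checking the ports that appear in these logs follows; `port_is_free` is a hypothetical helper, not part of cctools.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or "permission denied".
            return False
    return True


if __name__ == "__main__":
    # Port numbers copied from the test output in this thread.
    for port in (1024, 1031, 9123, 9124, 9126, 9127):
        state = "free" if port_is_free(port) else "in use or not bindable"
        print(f"port {port}: {state}")
```

If a port shows as in use before the tests start, a leftover process is one likely suspect, and `lsof -i :9123` can identify it on macOS.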

@Jbrocket
Contributor Author

Checks haven't passed yet; I'm testing manually in the pipeline.

@Jbrocket
Contributor Author

Notes:

  • The Makeflow dirs cases are now working.
  • The workflow is now easy to set up, since it pulls from the latest branch, and you can test individual files with testing_singular under packaging/mac-os/testing_singular.
  • Use echo statements to track how the files are behaving.

Cases still failing (mainly in taskvine), with the same output as in my earlier comment:

--- Testing work_queue/test/TR_work_queue_python.sh ... failure 17s
--- Testing taskvine/test/TR_vine_allocations.sh ... failure 31s
--- Testing taskvine/test/TR_vine_python.sh ... failure 22s
--- Testing taskvine/test/TR_vine_python_futures.sh ... failure 6s
--- Testing taskvine/test/TR_vine_python_no_serialization.sh ... failure 11s
--- Testing taskvine/test/TR_vine_python_serverless.sh ... failure 23s
--- Testing taskvine/test/TR_vine_python_tag.sh ... failure 22s
--- Testing taskvine/test/TR_vine_python_task.sh ... failure 11s
--- Testing taskvine/test/TR_vine_python_temp_files.sh ... failure 12s

@dthain
Member

dthain commented Dec 18, 2023

Most of these failures are due to some setup problem where a vine_worker in your PATH is somehow overriding the one to be tested. That's what causes this error:

2023/11/14 13:19:03.88 vine_manager[56583] notice: rejecting worker (127.0.0.1:50924) as it uses protocol 4. The manager is using protocol 3.
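
One way to check for this kind of shadowing is to list every `vine_worker` on the search path in lookup order, since the first hit is the one a shell would run. The Python sketch below does that; `find_on_path` is a hypothetical helper, not part of cctools, and it assumes the tests locate the worker through `PATH`.

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            # Skip empty entries rather than treating them as the current directory.
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    hits = find_on_path("vine_worker")
    if not hits:
        print("vine_worker not found on PATH")
    for rank, hit in enumerate(hits):
        label = "first on PATH" if rank == 0 else "shadowed"
        print(f"{hit} ({label})")
```

If the first hit is not the freshly built binary, for example because a conda-installed cctools comes earlier in `PATH`, that would be consistent with the protocol mismatch in the notice.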

@dthain
Member

dthain commented Dec 19, 2023

Superseded by #3594

@dthain dthain closed this Dec 20, 2023