OpenMinTeD SSH UC Hackathon #6

reckart · 2018-04-08T17:40:03Z

I have deployed a component and tried to run it on the platform. The result of the operation is listed as "FAILED", but I have no idea why. How can one get access to the log output?

Instance: test.openminted.eu

galanisd · 2018-04-08T20:57:05Z

> I have deployed a component and tried to run it on the platform.

For an application, you can run it directly after registration. If it is a component, this is not possible.

reckart · 2018-04-08T21:08:48Z

> For an application, you can run it directly after registration. If it is a component, this is not possible.

I know. I have built a workflow that uses the component I deployed (cf. #7):

omtdImporter
PdfReader
OpenNlpSegmenter
VariableMentionDisambiguator

reckart · 2018-04-08T21:54:11Z

FYI @azielinskiACC

galanisd · 2018-04-08T22:22:59Z

OK
I had a look into Galaxy.
VariableMentionDisambiguator is a UIMA component
with the following coordinates

eu.openminted.uc-tdm-socialsciences
ss-variable-detection
1.0.1-SNAPSHOT

Is it available on
Maven Central?
zoidberg public snapshots?
OMTD repo? -> the executor that we have does not look there.

Also, the workflow is created in the OMTD Workflow Editor instance of Galaxy. The OMTD Registry then
copies it to the OMTD Workflow Execution instance of Galaxy. Do you know the name of the workflow
so I can check if it is there?

reckart · 2018-04-08T22:27:56Z

> OMTD repo? -> the executor that we have does not look there.

It is in the OMTD SNAPSHOTs repo. The registry seems to be able to resolve artifacts from there. Would it be possible to ensure that the executors and the registry use the same set of repos to look up components, ideally also in the same order?

The workflow URL is: https://test.openminted.eu/landingPage/application/c58d1986-690e-40b9-b408-f649443c7d33

galanisd · 2018-04-08T22:50:47Z

> It is in the OMTD SNAPSHOTs repo. The registry seems to be able to resolve artifacts from there. Would it be possible to ensure that the executors and the registry use the same set of repos to look up components, ideally also in the same order?

Until now it was not required. I have added it to my to-do list.

> The workflow URL is: https://test.openminted.eu/landingPage/application/c58d1986-690e-40b9-b408-f649443c7d33

I downloaded the metadata record from the Registry (attached). The workflow name is
0931730980607790@openminted.eu 13865a76-613b-475a-88bf-4af5357b9263

I also downloaded it from the Galaxy executor (also attached). It is empty, with no steps. This is probably why
it fails. It seems to be a Registry issue.

rec.zip

reckart · 2018-04-08T22:57:11Z

I'll try building a new one.

galanisd · 2018-04-08T23:01:18Z

Ok. Please send me the landing page as you did with the previous one. I will download the metadata record,
find the Galaxy workflow and check if it is OK. If it is not, we have to inform Antonis.

reckart · 2018-04-08T23:13:30Z

Ok, I have created a new one. This time, it is not empty when I re-open it in the workflow editor:

https://test.openminted.eu/landingPage/application/89d5e9ea-32fb-45f7-bf00-1fe466e33c4f

However, it still fails:

@azielinskiACC @galanisd note that I have pasted a full multi-line XML file into the parameter variableSpecification - not sure if that could cause a problem. Aside from the XML getting a bit squashed when pasting it into the input field, it seemed OK in the Galaxy editor.

<?xml version="1.0" encoding="UTF-8"?>
<variables>
   <variable v_id="140" correct="YesNo">
       <v_label>INGLEHART-INDEX </v_label>
       <v_topic>Political attitudes and participation</v_topic>
       <v_question> What are your political priorities? </v_question>
       <v_subquestion> </v_subquestion>
       <v_answer a_id="1">Postmaterialist</v_answer>
       <v_answer a_id="2">Postmaterialist mixed-type</v_answer>
       <v_answer a_id="3">Materialist mixed-type</v_answer>
       <v_answer a_id="4">Materialist</v_answer>
       <v_answer a_id="5">Don't know</v_answer>
       <v_answer a_id="99">No answer</v_answer>
   </variable>
</variables>

The other thing is that the component should try to download a model from the OMTD Maven repo. That means it must have network access to that repo.

	<groupId>eu.openminted.uc-tdm-socialsciences</groupId>
	<artifactId>ss-variable-detection-model-disambiguation-en-ss</artifactId>
	<version>20180406.1</version>

Hm... that said, it might actually try to download the model from the wrong repo (i.e. the DKPro Core repo instead of the OMTD repo...). That is something I need to look into locally.

reckart · 2018-04-08T23:22:59Z

Opened an issue regarding model-auto-downloads here: openminted/omtd-component-executor#1

galanisd · 2018-04-08T23:36:49Z

Yes, now it is not empty.
The workflow is this: 0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace

I see the following error in the logs of the workflow-service, which is the module that calls Galaxy.

--- [ Thread-625] e.o.w.service.WorkflowServiceImpl : Unable to locate workflow: 0931730980607790%40openminted.eu+3c6c03b5-9a04-41bb-996a-a2cd536c7ace

Maybe it has to do with the name of the workflow. It contains spaces and an "@", which are escaped at some point.
@courado @greenwoodma @antleb

reckart · 2018-04-09T01:48:47Z

Ok. I have:

fixed a bug in the OMTD-SHARE Maven plugin causing descriptors to be missing from the JARs
changed the scope of the model from "test" to "compile" to avoid the component having to auto-download it
uploaded the variable disambiguator component again; as far as I can tell the registry really picked up the latest SNAPSHOT version (https://test.openminted.eu/landingPage/component/104d0ca9-8d4f-4719-a337-9ecaa0a331e3)
built a new workflow (https://test.openminted.eu/landingPage/application/60b9b72d-3e6d-4e5e-906d-766978cdf7c4)
this time ensuring that I set the language "en" in the PdfReader so that the variable disambiguator component knows which model to use (language "en", variant "ss")

Then I tried running the workflow again on the variable test corpus that @azielinskiACC has published on the platform.

Still, I get a failure again.

Any idea what could be the reason now?

galanisd · 2018-04-09T07:22:29Z

I assume that the workflow-service again fails to call the workflow that was created in the Galaxy executor. As I said above, the reason is probably the name of the workflow.

greenwoodma · 2018-04-09T17:25:28Z

I've just pushed a fix for this that should URL decode the workflow name before looking for it in Galaxy. This should get built and pushed to beta automatically but won't end up on test until someone manually pulls in the latest workflow service code.
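
As an illustration of the decoding step described above, here is a minimal Python sketch. It is not the actual workflow-service code (which is not shown in this thread): `find_workflow_id`, the in-memory `workflows` list and the ID `hypothetical-internal-id` are made-up stand-ins for the Galaxy lookup. It only shows that form-style URL decoding turns the name from the log line back into a name containing "@" and a space.

```python
from urllib.parse import unquote_plus


def find_workflow_id(encoded_name, workflows):
    """Return the internal ID of the workflow whose name matches, or None."""
    # Form-style decoding: "%40" becomes "@" and "+" becomes a space.
    name = unquote_plus(encoded_name)
    for workflow in workflows:
        if workflow["name"] == name:
            return workflow["id"]
    return None


# Hypothetical stand-in for the list of workflows known to Galaxy.
workflows = [
    {
        "id": "hypothetical-internal-id",
        "name": "0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace",
    },
]

# The encoded name exactly as it appears in the workflow-service log.
encoded = "0931730980607790%40openminted.eu+3c6c03b5-9a04-41bb-996a-a2cd536c7ace"
print(find_workflow_id(encoded, workflows))
```

Without the decoding step, the comparison would be made against the escaped string and no workflow would match.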

courado · 2018-04-11T14:39:27Z

I have also added the error message supplied from the workflow service under the My Operations page

reckart · 2018-04-11T14:46:40Z

@courado great! :)

I just tried running the workflow again, but it fails, being unable to locate the named workflow.

Could somebody please push @greenwoodma's fix to test.openminted.eu?

greenwoodma · 2018-04-11T14:51:53Z

@reckart is it not possible to rename the workflow to avoid the bug until the fix is pushed to test?

reckart · 2018-04-11T14:52:51Z

@greenwoodma how do I do that? The workflow editor only has a "save" button, not a "rename" or "save as" button as far as I remember.

galanisd · 2018-04-11T14:57:21Z

I think that the only way to do that is:

a. rename the workflow in Galaxy
b. download the metadata record of the app and delete it from the registry
c. upload an updated metadata record with the new workflow name.

greenwoodma · 2018-04-11T14:58:47Z

@reckart hmmm I thought the name of the workflow came from the name you gave the app in the registry UI, but maybe not, or maybe you can't change it there either. Certainly the workflow editor just gets passed the name from the platform; it doesn't generate it.

reckart · 2018-04-11T15:03:18Z

Well, the name I have given to the workflow in the registry UI is "Simple Variable Disambiguation Example (English)". 0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace looks like an auto-generated ID over which I probably do not have control. My guess would be that it is a representation of the user-id concatenated with some other ID...

greenwoodma · 2018-04-11T15:06:14Z

What's weird is that if all workflow IDs are generated the same way then how have we ever run a workflow as we'd have hit this issue every time? I'm seriously confused by this one.

reckart · 2018-04-11T15:14:09Z

Apparently one can edit the workflow name in Galaxy by clicking on the pre-generated name, entering a new value and pressing ENTER. I did that (see screenshot).

However, when I press "save" now, nothing happens. Odd...

Ok, when I go back to "My applications" and re-open the workflow in the editor, I can see that the name I put is still there, so I guess the "save" must have worked.

I wonder what happens if I create a second workflow with the same name...

Anyway, running the now re-named workflow still gives me the same message:

Failed 
Unable to locate named workflow

@courado the "My operations" view has a date, but not a time stamp. It would be great if we could also see the submission and possibly completion times of the execution there.

galanisd · 2018-04-11T15:27:41Z

@greenwoodma

Workflow names in Galaxy are not all generated the same way.

Some test Galaxy workflows were named by me (manually). I call them programmatically in our tests (via workflow-service).
The Galaxy workflows that are created automatically from the OMTD Registry and correspond to ready-to-use OMTD applications (e.g. Chebi app) seem to have valid names.
The applications that are created in the Galaxy editor and then ingested into the OMTD Registry seem to have this problem.
@courado Please have a look at it.

Also, the workflow ID is a different thing than the workflow name. For each workflow name there is an internal unique workflow ID; this is the one that the workflow-service retrieves from Galaxy in order to initiate a workflow execution.

reckart · 2018-04-19T00:22:23Z

Btw. I have also registered the Keyword Assignment component now and am trying to run it on a single-document corpus. This comment is mainly for documenting when I started it, since this info is not shown in the operations screen. The pipeline is even more minimal than the disambiguation pipeline (no segmenter needed).

@azielinskiACC

greenwoodma · 2018-04-19T08:56:43Z

So I've had a look at this issue of workflows running for ever and I think I've found the problem. I've just pushed a couple of fixes to the workflow service which should appear on beta shortly (not quite sure when they'll get pushed to test).

If you want the details read on.......

Essentially, when a workflow runs, we watch to see when the final step reaches the ok state (both the step and the underlying job). Unfortunately, if an error occurred while running the workflow, it was captured and stored within the workflow service, but there was no exception associated with it (no exception was thrown, as the error comes from checking the state, not from an exception). So while the internal object used for tracking progress within the workflow service recorded the failed state, there was a problem when it came to communicating this to the registry.

The JMS message doesn't contain a flag signifying the state of the workflow; what it contains is an error field, which should be filled with a message when an error occurs. The code in the workflow service filled this in using the message from the exception which had put the workflow into the failed state. Unfortunately, in the case of a workflow failing because Galaxy reported a state of error, there was no exception and so no message was returned. As such, while the workflow service knew that the workflow had failed, the registry assumed it was still running and just sat there waiting for the next message from the workflow service, which would never arrive.

The fix involves never putting the internal object into the failed state without an associated exception, which means there is now always an error message (hopefully a useful one) which will be passed back to the registry.
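
The idea behind the fix can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the workflow service's real code: the names `State`, `WorkflowStatus`, `fail` and `to_message` are hypothetical, and the message is modelled as a plain dictionary rather than a JMS message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class WorkflowStatus:
    """Hypothetical stand-in for the internal progress-tracking object."""
    state: State = State.RUNNING
    error: Optional[Exception] = None

    def fail(self, error: Exception) -> None:
        # The only way into FAILED is with an exception attached,
        # so a failure can never be reported without a message.
        if error is None:
            raise ValueError("a failure must carry an exception")
        self.state = State.FAILED
        self.error = error


def to_message(status: WorkflowStatus) -> dict:
    """Build the message for the registry: no state flag, only an error field."""
    return {"error": str(status.error) if status.error else None}


status = WorkflowStatus()
# Galaxy reports an error state without throwing anything,
# so an exception is created explicitly to carry the message.
status.fail(RuntimeError("Galaxy reported step state 'error'"))
print(to_message(status))
```

Because the registry only looks at the error field, making it impossible to fail without an exception guarantees that the field is filled whenever the workflow has failed.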

@reckart I'm guessing your workflows are stuck in this situation. If you could send me the unique ID of the workflow (this is the long alphanumeric sequence next to the words "Workflow Canvas" at the top of the editor screen) then I can double check just to be certain. It won't help with working out why they failed; for that I'd need to look at the logs for the workflow service, I think. @galanisd can you remind me of the IP of the machine running the test instance of the workflow service?

galanisd · 2018-04-19T11:41:19Z

@reckart

I created the same workflow as you; the Variable Dis. component is available in the Workflow editor.
I retested. Steps 1, 2 and 3 were OK, with output as expected (checked in Galaxy).

The Variable Dis. component fails while trying to download

de.tudarmstadt.ukp.dkpro.core#de.tudarmstadt.ukp.dkpro.core.variable-detection-model-disambiguation-en-default

Part of the log is attached.
log.zip

Locally on my laptop I do not have the same issue. I am trying to understand why...

greenwoodma · 2018-04-19T11:46:00Z

It would appear that the artifact isn't in any of the repos we look in.

galanisd · 2018-04-19T12:05:56Z

The last 3 steps of the workflow are DKPro UIMA components.

PDFReader -> no model required
OpenNLPSegmenter -> requires a model that was downloaded without issues.
e.g.
2018-04-19 11:52:32.555 INFO 50 --- [ main] d.t.u.d.c.a.r.ResourceObjectProviderBase : downloading http://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/public-model-releases-local/de/tudarmstadt/ukp/dkpro/core/de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-en-maxent/20120616.1/de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-en-maxent-20120616.1.jar ...
2018-04-19 11:52:32.896 INFO 50 --- [ main] d.t.u.d.c.a.r.ResourceObjectProviderBase : [SUCCESSFUL ] de.tudarmstadt.ukp.dkpro.core#de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-en-maxent;20120616.1!de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-en-maxent.jar (340ms)
Variable Dis. -> requires a model but fails to download it.

Hmmm...

reckart · 2018-04-19T12:07:48Z

The model that the VarDis is using should be in the same repo as VarDis itself. However, according to the logs, it tries to download the "default" variant, not the "ss" variant. I'm trying to check the workflow config again.

galanisd · 2018-04-19T12:10:09Z

I am checking the configuration for the repos on my laptop. I deleted the model, but when I run the script it is downloaded...

galanisd · 2018-04-19T12:19:47Z

I was using modelLocation not modelVariant.
Corrected. I am retesting right now.
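
The difference between the "ss" and "default" variants seen in the log can be pictured with a small Python sketch. It is only an assumption based on the two artifact names quoted in this thread; the helper `model_artifact_id` and its fallback rule are hypothetical and are not DKPro Core's actual resolution code.

```python
def model_artifact_id(prefix, language, variant=None):
    """Compose a model artifact name from language and variant."""
    # Assumption: when no variant is configured, "default" is used.
    return f"{prefix}-{language}-{variant or 'default'}"


# Prefix taken from the model coordinates quoted earlier in the thread.
PREFIX = "ss-variable-detection-model-disambiguation"

print(model_artifact_id(PREFIX, "en", "ss"))
print(model_artifact_id(PREFIX, "en"))
```

Under this assumption, leaving the variant unset (for example by filling in a different parameter) yields a name ending in "-en-default", which matches the ending of the artifact the component failed to download.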

reckart · 2018-04-19T12:22:21Z

The models are here: https://repo.openminted.eu/content/repositories/releases/eu/openminted/uc-tdm-socialsciences/
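
For orientation, here is a small Python sketch of how the model coordinates given earlier in this thread would map to a path below that URL. It assumes the standard Maven repository layout for release versions (the same layout is visible in the OpenNLP model download URL in the log above); the helper name `maven_path` is made up for this illustration, and whether the file actually exists at that location is not verified here.

```python
def maven_path(group_id, artifact_id, version, extension="jar"):
    """Relative path of a release artifact in a standard Maven repository."""
    # Dots in the groupId become directory separators.
    group_path = group_id.replace(".", "/")
    filename = f"{artifact_id}-{version}.{extension}"
    return f"{group_path}/{artifact_id}/{version}/{filename}"


print(maven_path(
    "eu.openminted.uc-tdm-socialsciences",
    "ss-variable-detection-model-disambiguation-en-ss",
    "20180406.1",
))
```

SNAPSHOT versions use timestamped file names and are not covered by this sketch.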

reckart · 2018-04-19T12:27:16Z

> @reckart I'm guessing your workflows are stuck in this situation. If you could send me the unique ID of the workflow (this is the long alphanumeric sequence next to the words "Workflow Canvas" at the top of the editor screen) then I can double check just to be certain.

There are at least two stuck with jobs:

0931730980607790-9b9b1d64-fe3f-4de7-88ca-bf1f788e60f5
0931730980607790-bcdd2736-498c-48e9-b61d-352a043e7175

Btw. I can still edit the workflow name in the workflow editor.

galanisd · 2018-04-19T12:38:48Z

Got results.... :-)

Attached...

bc4e4776-cc9c-47d1-bf28-0d9b5ab78c46.zip

I hope that is not an illusion...

Still I have to check what happens with the repo configuration even though it seems to work right now.
I do not understand why your workflow never completed.

reckart · 2018-04-19T13:09:58Z

@galanisd great news!!!

For curiosity: does it open in the Annotation Viewer?

galanisd · 2018-04-19T13:19:58Z

Nope ..... I think this is because the results are written to an "output" folder and not to an "annotations" folder.
This happens because the metadata of the component is currently not passed to the workflow-service.

I might be wrong.
@greenwoodma @antleb @courado ?

greenwoodma · 2018-04-19T13:25:13Z

Yes, there is a redmine issue https://redmine.openminted.eu/issues/767 which I've just bumped.

azielinskiACC · 2018-04-19T13:31:51Z

So, finally. That's great.
Was it possible to use a configuration file?
For NER I use the following https://test.openminted.eu/landingPage/application/2d3fc2aa-6f9b-4a5b-bd75-763a39b8b18b
Correct?

reckart · 2018-04-19T13:48:05Z

@azielinskiACC I cannot access the link above. Probably it is a private workflow in your account?

galanisd · 2018-04-19T14:29:33Z

I can ...

however this metadata record seems to be for an image that I created 10 months ago...
(Identifiers OMTD: DemoWF3SSHNER)

Back then there were no docker specs and we created 5 apps (one of them was NER) in order

to do some demos
to experiment with Mesos/Galaxy and see what is required.

The respective image is not OMTD compliant; i.e. it does not follow the docker spec and it will not be executed in the current environment.

See also ...
#1

Who is working on this image/app?

reckart · 2018-04-19T14:31:15Z

I'd have to look into the NER thing.

galanisd · 2018-04-19T14:31:35Z

If required please open a new issue (NER Hackathon)

azielinskiACC · 2018-04-19T14:46:58Z

For testing, it would be great to have the proper landing ID for all SS-A applications, since search on the OpenMinted Platform does not give any results.
Unfortunately, there are some 'empty' corpora that I created and that cannot be deleted, which might cause confusion (a known issue?). So please also let me know which data input files (= landing ID) I should use.

reckart · 2018-04-19T14:50:08Z

@azielinskiACC @galanisd since the "test.openminted.eu" platform is only for testing and may be reset again... does it make sense at all to use fixed IDs for corpora? Maybe it is better to have people upload their own data or build a corpus using the search functionality.

The names of the SSH components on the other hand are rather stable. I'll run a release and then could publish them to the main platform (non-test).

pennyl67 · 2018-04-19T15:00:14Z

@reckart Please note that @antleb is currently updating the main platform (services), so I wouldn't recommend adding anything there until we get notified. The idea is to use the services for all the testing etc., so it must be updated with all the fixes that the test platform has now.

pennyl67 · 2018-04-19T15:01:10Z

Sorry, by "testing" I meant the evaluation of the tenders/hackathon

greenwoodma · 2018-04-19T15:03:51Z

@pennyl67 is @antleb updating services to the same as test is currently or to the latest version of the code? The plan was to update test daily since the WP7 call last week, but the workflow-service hasn't been updated in the last week so it's still not got all the bug fixes we've made this week (which is quite a few). The problem is that while I think those fixes all work as expected, I'd assumed they were being tested on test as that was being updated. Now I find it hasn't been, so it may be that we get an up to date services which is buggier than test.

reckart · 2018-04-19T15:05:20Z

@pennyl67 @azielinskiACC @galanisd @antleb WP9 also has to wrap up the "tutorial" material. My comment was related to what we can expect to later be able to find on the main platform and what not. Things we can find on the main platform can be in the tutorial, others maybe not. So IMHO it would make more sense to include the building of the corpus/uploading of documents into the tutorial, also the building of a workflow, but not the uploading of the components - we should be able to expect that the components we test now will exist on the main platform. Makes sense?

pennyl67 · 2018-04-19T15:09:47Z

@greenwoodma I'm not sure - I didn't know this detail; trying to find out and I'll let you know
@reckart I've asked @antleb to not delete from the main platform anything before I check; the problem is that we need to clean up various test resources and empty corpora. And I will send out an email to all to check which resources they want us to keep.
And yes, I understand what you're saying for the tutorials (everything happening at the same time!) - but how can you build a workflow without the components in the same platform? I can understand for the built/updated corpora.

reckart · 2018-04-19T15:17:46Z

We publish the SSH UC components to the test platform now for the preparation of the tutorial material. The same components will later be published to the main platform - once the main platform is ready.

reckart · 2018-04-20T12:03:47Z

My understanding is that for the VarDis and Keyword components, we are good now.

@azielinskiACC - do you agree? If yes, I would suggest closing this issue.

azielinskiACC · 2018-04-20T15:19:35Z

Yes - both components are running and the issue can be closed.

reckart mentioned this issue Apr 8, 2018

Unable to edit workflow #7

Closed

reckart assigned antleb, greenwoodma and galanisd Apr 8, 2018

galanisd assigned galanisd and unassigned galanisd Apr 8, 2018

reckart closed this as completed Apr 20, 2018
