
Cannot start Heritrix with Windows 10 #398

Closed
machawk1 opened this issue Mar 13, 2019 · 9 comments

Comments

@machawk1
Owner

Related to #155 but for Windows.

I compiled WAIL on Windows 10 from fc82057, then installed the latest Java from Oracle manually. When selecting the FIX heritrix button, the console reports that Heritrix will be opened in a new window.

The new window displays this:
[Screenshot: windows_norun]

Java version 1.8.0_201
Java SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot Client VM (build 25.201-b09, mixed mode)

@machawk1
Owner Author

Also relevant: fc82057 uses Heritrix 3.2.0 and, per #345, a new version is available but is not a drop-in replacement. We may want to replicate the commands that WAIL issues to Heritrix against an isolated copy of Heritrix running on Windows 10, to verify compatibility between Heritrix and a newer version of Java than the one expected on macOS.

@machawk1
Owner Author

See https://webarchive.jira.com/browse/HER-2085. Per internetarchive/heritrix3#129, this might be fixed by updating Heritrix as in #345.

@machawk1
Owner Author

To Get It Working™, let's bundle the Java 1.7 JDK like we have on macOS. If I recall, this was one barrier in Heritrix -- the Java security APIs that are used in Heritrix were removed in Java 1.8.

machawk1 pinned this issue May 21, 2019
machawk1 added a commit that referenced this issue May 21, 2019
@machawk1
Owner Author

With the Java 7 SDK bundled for Windows, some of the Java checks are unnecessary. Unfortunately, if Java 8 is also installed on the system, it seems to take precedence over the bundled Java. Programmatically adding the environment variables, as we do in ensureEnvironmentVariablesAreSet(), still allows Heritrix to use the system Java (version 8 as of a fresh Windows 10).
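To confirm which Java is actually being picked up, the version banner can be parsed. The following is a minimal sketch, not WAIL's existing check; java_major_version is a hypothetical helper name, and it assumes the banner contains a "version" token followed by a version number, as in the output quoted earlier in this issue.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version (7, 8, 11, ...) from a version banner.

    Hypothetical helper for illustration; returns None if no version is found.
    """
    match = re.search(r'version "?(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy numbering scheme: 1.7 means Java 7, 1.8 means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first
```

If the value returned for the Java that Heritrix starts with is 8 rather than 7, the system Java is still winning over the bundled one.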

machawk1 added a commit that referenced this issue May 21, 2019
@machawk1
Owner Author

machawk1 commented May 21, 2019

This may have to do with how Heritrix is invoked on startup. Check Heritrix.launchHeritrix().

Running the invocation C:\wail\bundledApps\heritrix-3.2.0\\bin\heritrix.cmd -a lorem:ipsum from the command line, outside of the context of WAIL, (understandably) still causes the system Java to be used. I believe that environment variables like JAVA_HOME can be set before this invocation, which might remedy WAIL using the system Java.

EDIT

heritrix.cmd also provides this option within the startup script:

::  Optional environment variables
:: 
::  JAVA_HOME        Point at a JDK install to use.

EDIT2

...or perhaps not. There does not appear to be a flag to pass this value to Heritrix; instead, it ought to be set in the environment (EXPORTed, or set on Windows) prior to calling heritrix.cmd.
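Setting the variable only for the child process could look like the sketch below. This is an assumption about how the launch might be structured, not the current body of Heritrix.launchHeritrix(); build_heritrix_env and launch_heritrix are hypothetical names, and the bundled JDK path is whatever WAIL ships.

```python
import os
import subprocess


def build_heritrix_env(bundled_java_home, base_env=None):
    """Return a copy of the environment that points at the bundled JDK.

    Hypothetical helper; base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = bundled_java_home
    # Put the bundled JDK's bin directory first so it is found before the system Java.
    bundled_bin = os.path.join(bundled_java_home, "bin")
    env["PATH"] = bundled_bin + os.pathsep + env.get("PATH", "")
    return env


def launch_heritrix(heritrix_cmd, credentials, bundled_java_home):
    """Start heritrix.cmd with the modified environment (intended for Windows)."""
    env = build_heritrix_env(bundled_java_home)
    # The parent process environment is left untouched; only the child sees the change.
    return subprocess.Popen([heritrix_cmd, "-a", credentials], env=env)
```

Passing env to the child keeps the change scoped to Heritrix, so a system-wide Java 8 install is unaffected.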

@machawk1
Owner Author

machawk1 commented May 21, 2019

Having the WAIL-Win code in WAILConfig.py also set jreHome and javaHome prevents Java 8 from being used.

...but then the command to heritrix.cmd states "The system cannot find the path specified.\nPress any key to continue . . ." without giving more information on the command.
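Since the error does not name the path it failed on, one way to narrow it down is to check the candidate paths before launching. A minimal sketch follows; find_missing_paths is a hypothetical helper, and the assumption that the JDK exposes bin\java.exe under its home directory is mine, not taken from WAILConfig.py.

```python
import os


def find_missing_paths(java_home, heritrix_cmd):
    """Return the paths that do not exist, so they can be reported by name."""
    candidates = [
        java_home,
        # Assumed layout: the java executable lives in <java_home>\bin.
        os.path.join(java_home, "bin", "java.exe"),
        heritrix_cmd,
    ]
    return [path for path in candidates if not os.path.exists(path)]
```

An empty list means all three paths resolve; anything else is a likely culprit for the message above.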

machawk1 added a commit that referenced this issue May 21, 2019
@machawk1
Owner Author

machawk1 commented May 21, 2019

Heritrix now runs in Windows in 90cadc1 \o/

...but hitting the API to create and build scripts still seems problematic, due to the inaccessibility reported on the CLI:

<urlopen error [WinError 10049] The requested address is not valid in its context>

This may be related to #435 with regard to Windows not being configured with localhost access to 0.0.0.0 by default.

As suspected, when Heritrix is running on Windows, the web interface is accessible using https://localhost:8443/ and not https://0.0.0.0:8443/ by default.
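One possible workaround is to rewrite the host before calling the API. The sketch below is illustrative only; to_connectable_url is a hypothetical helper, not something WAIL currently has, and defaulting the replacement to localhost is an assumption based on the observation above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_connectable_url(url, replacement_host="localhost"):
    """Replace the wildcard host 0.0.0.0 with a host a client can connect to."""
    parts = urlsplit(url)
    if parts.hostname != "0.0.0.0":
        return url
    # Keep any user:password@ prefix and the port; swap only the host.
    userinfo, separator, _ = parts.netloc.rpartition("@")
    netloc = userinfo + separator + replacement_host
    if parts.port is not None:
        netloc += ":" + str(parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, to_connectable_url("https://0.0.0.0:8443/") yields https://localhost:8443/, while URLs that already use another host are returned unchanged.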

@machawk1
Owner Author

Wayback appears to use the system Java by default as well per Tomcat startup.

@machawk1
Owner Author

Modified the Windows catalina startup script in bd8c30d
