Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove console flow #140

Closed
wants to merge 3 commits into from

Conversation

max-sixty
Copy link
Contributor

The Console Flow is a strict subset of the functionality of the Local Webserver - if the browser fails to open, it's exactly the same. It makes writing parameterized tests harder, and is a small contributor to overall gunk

@codecov-io
Copy link

codecov-io commented Mar 3, 2018

Codecov Report

Merging #140 into master will decrease coverage by 44.64%.
The diff coverage is 85.71%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #140       +/-   ##
===========================================
- Coverage   75.62%   30.98%   -44.65%     
===========================================
  Files           8        8               
  Lines        1633     1630        -3     
===========================================
- Hits         1235      505      -730     
- Misses        398     1125      +727
Impacted Files Coverage Δ
pandas_gbq/gbq.py 20.7% <85.71%> (-55.67%) ⬇️
pandas_gbq/tests/test_gbq.py 26.78% <0%> (-56.64%) ⬇️
pandas_gbq/_load.py 62.5% <0%> (-35%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1155958...71faa19. Read the comment docs.

@max-sixty
Copy link
Contributor Author

Would close #53

@parthea
Copy link
Contributor

parthea commented Mar 4, 2018

Did you have any issue getting this to work on a headless machine? I cloned this branch on my headless machine and received an exception that the web server was already in use when trying to authenticate. I like that the console flow allows you to easily authenticate in terminal if you're using pandas-gbq on a headless machine.

>>> gbq.read_gbq('select 1','12345')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas_gbq/gbq.py", line 818, in read_gbq
    dialect=dialect)
  File "pandas_gbq/gbq.py", line 180, in __init__
    self.credentials = self.get_credentials()
  File "pandas_gbq/gbq.py", line 194, in get_credentials
    credentials = self.get_user_account_credentials()
  File "pandas_gbq/gbq.py", line 367, in get_user_account_credentials
    credentials = app_flow.run_local_server()
  File "/usr/local/lib/python2.7/dist-packages/google_auth_oauthlib/flow.py", line 407, in run_local_server
    host, port, wsgi_app, handler_class=_WSGIRequestHandler)
  File "/usr/lib/python2.7/wsgiref/simple_server.py", line 151, in make_server
    server = server_class((host, port), handler_class)
  File "/usr/lib/python2.7/SocketServer.py", line 417, in __init__
    self.server_bind()
  File "/usr/lib/python2.7/wsgiref/simple_server.py", line 48, in server_bind
    HTTPServer.server_bind(self)
  File "/usr/lib/python2.7/BaseHTTPServer.py", line 108, in server_bind
    SocketServer.TCPServer.server_bind(self)
  File "/usr/lib/python2.7/SocketServer.py", line 431, in server_bind
    self.socket.bind(self.server_address)
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use

@max-sixty
Copy link
Contributor Author

max-sixty commented Mar 4, 2018

That's v weird. It works fine for me in headless:

In [1]: import pandas.io.gbq as gbq

In [2]: gbq.read_gbq('select 1 as a', project_id='sixty-capital', auth_local_webserver=True)
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=495642085510-k0tm[]1jhre2nbqka17vqpjfddtd.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery&state=b0aQA29ZDoyHys[]PIViG&access_type=offline

Is yours a one-off or something repeatable? If the latter, we could deprecate and change the default, but leave the False option and see if it's widespread?

@tswast
Copy link
Collaborator

tswast commented Mar 21, 2018

@craigcitro I believe I put the console flow in based on a conversation with you. Thoughts on this change?

@craigcitro
Copy link

In general, assuming that the user is at the same machine as the one running their python code is a faulty assumption IMO. That is, I don't believe that the local webserver flow handles all cases that the console flow does.

In particular, here's a simple recipe to reproduce: start a VM anywhere with only SSH access, connect from your laptop via SSH, and run pandas-gbq-using code there. Assuming you have not or cannot set up SSH forwarding of the right ports, there's no way for your laptop's web browser to talk to the local webserver running on your VM.

Or, a repro closer to my heart: open https://colab.research.google.com and try from there. There's no way for your local browser to talk to a random webserver started on that machine. (This is one of the reasons that we have some custom auth code in play.)

@max-sixty
Copy link
Contributor Author

@craigcitro when I try in a docker container (#140 (comment)), it prints out a web address whether or not I pass True or False to auth_local_webserver, because it can't find a browser.

If it doesn't have that in some situations, then I 100% agree with you. My point is based only there being a fallback, rather than whether that fallback is required.

@craigcitro
Copy link

@maxim-lian It will always print out a web address; one of the query parameters in that address is the redirect_uri.

  • If that uri is a real URL, eg http://localhost:12345/..., then when you click "authorize", your browser will send back a response containing your new oauth2 token to that URL. If that web request fails, there's no way to complete the authorization dance.

  • If that uri is the "oauth2 OOB uri" (urn:ietf:wg:oauth:2.0:oob), then when you click "authorize", you'll get a code you can copy-paste back to the calling application. This application can then make a request to the oauth2 server (accounts.google.com in this case) to exchange that code for the oauth2 token and related bits.

So if you opt to go with the local server flow, and there's no way for your browser to communicate with that "local" server, then you can no longer do user-based authentication.

@max-sixty
Copy link
Contributor Author

@craigcitro Ah! I see. Even though the initial link is to accounts.google.com, it needs to redirect to localhost. Thanks for bearing with me.

Would you prefer the default stayed as-is?

@craigcitro
Copy link

@maxim-lian No problem at all! 😁 I've been doing oauth2-related things since, like 2011, and I still get confused regularly. (But hey, job security!)

I think not using a local webserver is the right default. My $0.02:

  • local webserver: smoother when it works, but utterly confusing when it doesn't
  • oob flow: clunkier, but always works

I've copy-pasted enough auth codes to last me a lifetime, but I don't fancy users getting stuck. WDYT?

@max-sixty
Copy link
Contributor Author

Super, agree.

Closing

@max-sixty max-sixty closed this Mar 21, 2018
@max-sixty max-sixty deleted the remove-console-flow branch March 21, 2018 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants