Merge pull request #49 from DataONEorg/dev-0.1
Dev 0.1
rushirajnenuji authored Jan 4, 2019
2 parents 8079a7c + 6b80846 commit caeae83
Showing 57 changed files with 11,497 additions and 2 deletions.
118 changes: 118 additions & 0 deletions .gitignore
@@ -1,8 +1,12 @@
#might have passwords in it
localconfig.ini

# Compiled class file
*.class

# Log file
*.log
/logs/*

# BlueJ files
*.ctxt
@@ -20,3 +24,117 @@

# virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml
hs_err_pid*

# Mac OS Directory structure
.DS_Store
src/.DS_Store

src/metricsServiceAPI/.idea/workspace.xml

#Pycharm workspace files
.idea

# Byte-compiled / optimized / DLL files
__pycache__/
*/__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
doc/build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

src/d1_metrics/d1_metrics/reports/
1 change: 1 addition & 0 deletions README.md
@@ -43,3 +43,4 @@ limitations under the License.

This material is based upon work supported by the Alfred P. Sloan Foundation
as part of the Make Data Count Project. https://makedatacount.org

23 changes: 23 additions & 0 deletions doc/Makefile
@@ -0,0 +1,23 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# requires sphinx-autobuild to be available, install with pip
livehtml:
	sphinx-autobuild -b html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
29 changes: 29 additions & 0 deletions doc/README.md
@@ -0,0 +1,29 @@
This is a Sphinx project. HTML can be built using:

```
make clean html
```

The HTML files will appear under `build/html/`.

To run make, ensure that Sphinx is installed:

```
pip install -U sphinx
```

A "live view" is convenient for editing: it automatically rebuilds
the docs and refreshes the browser page after edits are saved.

```
pip install -U sphinx-autobuild
make clean livehtml
```

Then open a browser at http://localhost:8000/


The document will update in your browser as edits are saved. If edits
do not appear as expected, rerun the ``make clean livehtml`` command
and refresh the browser.
Binary file added doc/images/dataone-implementation.png
Binary file added doc/images/mdc-log-processing-architecture.png
Binary file added doc/images/metrics-service-class-diagram.png
35 changes: 35 additions & 0 deletions doc/plantuml-styles.txt
@@ -0,0 +1,35 @@
' change the default styles
skinparam shadowing false
skinparam roundcorner 10

' style sequences
skinparam sequence {
ArrowColor #1F4260
LifeLineBorderColor #1F4260
LifeLineBackgroundColor #428BCA
ParticipantBorderColor #AAAAAA
ParticipantBackgroundColor #F5F5F5
ActorBackgroundColor #DDDDDD
ActorBorderColor #333333
}

' style notes
skinparam noteFontColor #C49858
skinparam note {
BackgroundColor #FCF8E4
BorderColor #FCEED6
}

' style classes
skinparam class {
BackgroundColor #F5F5F5
BorderColor #333333
ArrowColor #333333
}

' style packages
skinparam packageFontColor #9DA0A4
skinparam package {
BorderColor #CCCCCC
}

1 change: 1 addition & 0 deletions doc/requirements.txt
@@ -0,0 +1 @@
plantweb
107 changes: 107 additions & 0 deletions doc/source/apache_log.rst
@@ -0,0 +1,107 @@
Log Formatting
==============


This page describes the Apache2 log format used for the search UI. The DataONE `Search UI`_ is a single page
application served from an Apache web server. All requests issued by the Search UI are proxied back through that
Apache server, so its logs are a convenient place to record the requests issued by the service.

.. list-table:: Apache search log properties
   :widths: 10 10 30
   :header-rows: 1

   * - Property
     - Entry
     - Notes
   * - ``ver``
     - ``1.0``
     - Version flag, indicating the revision of the log information. Must be incremented when the format changes.
   * - ``time``
     - ``%{%Y-%m-%d}tT%{%T}t.%{msec_frac}tZ``
     - Time that the request started
   * - ``remoteIP``
     - ``%a``
     - IP address of the requestor
   * - ``method``
     - ``%m``
     - HTTP method used for the request
   * - ``request``
     - ``%U``
     - The request portion of the URL, i.e. after the host and before the query
   * - ``query``
     - ``%q``
     - The query portion of the URL, i.e. after the "?"
   * - ``userAgent``
     - ``%{User-agent}i``
     - Name of the client user agent
   * - ``remoteUser``
     - ``%u``
     - Remote user identity, only present if HTTP authentication is used
   * - ``referer``
     - ``%{Referer}i``
     - URL of the page that requested the resource
   * - ``status``
     - ``%>s``
     - HTTP status of the response
   * - ``responseTime``
     - ``%D``
     - The time in microseconds taken by the server to respond
   * - ``accessToken``
     - ``%{Authorization}i``
     - The accessToken can be decoded using pyjwt, for example::

         import jwt

         accessToken = "Bearer eyJhbGciOiJSUzI1NiJ ..."
         # Drop the "Bearer" scheme and keep the token itself
         junk, token = accessToken.split(" ", 1)
         # Decode without verifying the signature (PyJWT 2.x syntax;
         # PyJWT 1.x used jwt.decode(token, verify=False))
         print(jwt.decode(token, options={"verify_signature": False}))
         # Example output:
         # {'sub': 'http://orcid.org/0000-...',
         #  'fullName': 'Dave Vieglais',
         #  'issuedAt': '2018-10-06T12:28:44.156+00:00',
         #  'consumerKey': '...',
         #  'exp': 1538893724,
         #  'userId': 'http://orcid.org/0000-...',
         #  'ttl': 64800,
         #  'iat': 1538...}
   * - ``ga_cookie``
     - ``%{_ga}C``
     - The value of the Google Analytics _ga cookie. `More details <https://stackoverflow.com/questions/16102436/what-are-the-values-in-ga-cookie>`_


Apache configuration for logging search events::

    LogFormat "{ \"ver\":\"1.0\", \"time\":\"%{%Y-%m-%d}tT%{%T}t.%{msec_frac}tZ\", \"remoteIP\":\"%a\", \
      \"request\":\"%U\", \"query\":\"%q\", \"method\":\"%m\", \"status\":\"%>s\", \
      \"responseTime\":\"%D\", \"userAgent\":\"%{User-agent}i\", \
      \"accessToken\":\"%{Authorization}i\", \"referer\":\"%{Referer}i\", \
      \"remoteUser\":\"%u\", \"ga_cookie\":\"%{_ga}C\"}" searchstats
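
One way to keep the directive in step with the property table is to generate it from a single list of (property, entry) pairs. The sketch below is an illustration under that assumption, not how the project produces its configuration: ``FIELDS`` and ``build_log_format`` are hypothetical names, and the entries are copied from the property table above.

```python
# (property, Apache format entry) pairs, copied from the property table.
FIELDS = [
    ("ver", "1.0"),
    ("time", "%{%Y-%m-%d}tT%{%T}t.%{msec_frac}tZ"),
    ("remoteIP", "%a"),
    ("request", "%U"),
    ("query", "%q"),
    ("method", "%m"),
    ("status", "%>s"),
    ("responseTime", "%D"),
    ("userAgent", "%{User-agent}i"),
    ("accessToken", "%{Authorization}i"),
    ("referer", "%{Referer}i"),
    ("remoteUser", "%u"),
    ("ga_cookie", "%{_ga}C"),
]


def build_log_format(fields, nickname="searchstats"):
    """Return a single-line LogFormat directive that emits a JSON object."""
    # Each pair becomes \"name\":\"entry\"; the quotes are backslash-escaped
    # because the whole format sits inside a double-quoted Apache string.
    pairs = ", ".join(
        '\\"{}\\":\\"{}\\"'.format(name, entry) for name, entry in fields
    )
    return 'LogFormat "{{ {} }}" {}'.format(pairs, nickname)


if __name__ == "__main__":
    print(build_log_format(FIELDS))
```

Generating the directive on one line also sidesteps any question about how the configuration is wrapped across lines.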


Example of output, reformatted for readability. Each log message appears as a single line in the log file.

.. code-block:: json

   {
     "ver": "1.0",
     "time": "2018-10-08T07:56:41.600Z",
     "remoteIP": "73.128.224.157",
     "request": "/cn/v2/meta/solson.18.1",
     "query": "",
     "method": "GET",
     "status": "200",
     "responseTime": "7504774",
     "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36",
     "accessToken": "-",
     "referer": "https://search.dataone.org/view/doi:10.5063/F1HT2M7Q",
     "remoteUser": "-"
   }

See also:

* `Apache log files`_ for general information on configuring Apache logs
* `mod_log_config`_ for log format options
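
Because each log entry is a single-line JSON object, it can be read with the standard ``json`` module. The following is a minimal sketch of one way to do so; ``parse_search_log_line`` and the derived ``responseSeconds`` key are hypothetical names, and treating the timestamp as UTC is an assumption based on the ``Z`` suffix in the format.

```python
import json
from datetime import datetime, timezone


def parse_search_log_line(line):
    """Parse one searchstats log line into a dict with typed values."""
    raw = json.loads(line)
    # Apache writes "-" when a value is absent; represent that as None.
    record = {key: (None if value == "-" else value) for key, value in raw.items()}
    # All values are logged as strings, so convert the numeric ones.
    record["status"] = int(record["status"])
    # responseTime is logged in microseconds (%D); also expose seconds.
    record["responseSeconds"] = int(record["responseTime"]) / 1_000_000
    # Assumption: the trailing "Z" means the timestamp is UTC.
    record["time"] = datetime.strptime(
        record["time"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return record


if __name__ == "__main__":
    sample = (
        '{"ver":"1.0", "time":"2018-10-08T07:56:41.600Z", '
        '"remoteIP":"73.128.224.157", "request":"/cn/v2/meta/solson.18.1", '
        '"query":"", "method":"GET", "status":"200", '
        '"responseTime":"7504774", "accessToken":"-", "remoteUser":"-"}'
    )
    print(parse_search_log_line(sample))
```

Apache escapes some special characters in logged header values with ``\xhh`` sequences, which are not valid JSON, so a production parser would likely need to handle lines that fail to decode.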


.. _Search UI: https://search.dataone.org/
.. _Apache log files: https://httpd.apache.org/docs/2.4/logs.html
.. _mod_log_config: https://httpd.apache.org/docs/2.4/mod/mod_log_config.html