Ckan 2.10 #255

Merged: 46 commits, Jul 13, 2023
Commits
87a584e
change CKAN version, modifiy dependency version, remove IRoute in plugin
Jin-Sun-tts Apr 24, 2023
ae95193
add the string function
Jin-Sun-tts Apr 24, 2023
4de6473
update: more validator functions
nickumia-reisys Apr 24, 2023
dc04662
fix: sysadminwithtoken import issue
nickumia-reisys Apr 24, 2023
31abfe5
update: ckanext.spatial.search_backend
nickumia-reisys May 1, 2023
ff259f7
fix: test_category_tags.py
nickumia-reisys May 1, 2023
027e503
fix: test_waf_trim_tags
nickumia-reisys May 1, 2023
fe78de0
fix: the default is solr-bbox, not solr now
nickumia-reisys May 3, 2023
240c9ca
lint: blah
nickumia-reisys May 3, 2023
c72c83c
new: add ckan 2.10 to test matrix
nickumia-reisys May 3, 2023
e34161b
update: IPackageController interfaces
nickumia-reisys May 4, 2023
31964c1
FIX: WHY CKAN... WHY....
nickumia-reisys May 11, 2023
53cd076
cleanup: Remove deprecated code
nickumia-reisys May 11, 2023
87fb934
temp: reference ckan 2.10 version of datajson
nickumia-reisys May 12, 2023
abffce8
fix: add localstack as a docker compose dependency
nickumia-reisys May 12, 2023
1253056
cleanup: debug statements
nickumia-reisys May 12, 2023
271171a
test: does a - vs. _ make a difference?
nickumia-reisys May 12, 2023
4f20053
test: see if license_list works with the newer commit
nickumia-reisys May 16, 2023
ea2ec9d
fix: 'egg=ckan'
nickumia-reisys May 18, 2023
8c1c888
added plugins for datajson test
Jin-Sun-tts May 22, 2023
8dd7277
test: no egg?
nickumia-reisys May 22, 2023
46a3ff7
fix: ckan can't be install editably
nickumia-reisys May 22, 2023
31b92ad
update: only reference the needed update
nickumia-reisys May 22, 2023
828c157
new: use custom fork with necessary fixes only
nickumia-reisys May 22, 2023
4f2813d
remove export_csv test
Jin-Sun-tts May 22, 2023
fe1aac1
remove miscsTopicCSV
Jin-Sun-tts May 22, 2023
b894350
reomve unused model for miscsTopicCSV
Jin-Sun-tts May 22, 2023
2bc2f55
fix lint
Jin-Sun-tts May 22, 2023
1d6a807
fix: 'resource' deprecated by 'asset'
nickumia-reisys May 23, 2023
936e85a
Merge branch 'ckan-2.10' of github.com:GSA/ckanext-geodatagov into ck…
nickumia-reisys May 23, 2023
c79cc7f
fix?: url_for?
nickumia-reisys May 24, 2023
99e2f88
fix: needs siteurl
nickumia-reisys May 25, 2023
5673600
return empty string when spatial can not be parsed
Jin-Sun-tts Jun 22, 2023
8ecd3e5
fix: don't update new-spatial if it's an empty str
nickumia-reisys Jun 27, 2023
de28a3e
new: change the way we capture word-based locations
nickumia-reisys Jul 10, 2023
0ea3e0a
update: datajson requirement
nickumia-reisys Jul 10, 2023
ab818ba
blah: ==
nickumia-reisys Jul 10, 2023
8275794
new: another version of ordering?
nickumia-reisys Jul 10, 2023
bbbc9fe
new: try the get_geo_from_string
nickumia-reisys Jul 11, 2023
fcf513d
update: only test CKAN 2.10
nickumia-reisys Jul 11, 2023
e7be2f4
Merge pull request #262 from GSA/ckan-210-spatial-test
nickumia-reisys Jul 11, 2023
8c1b2a9
Merge branch 'main' into ckan-2.10
nickumia-reisys Jul 11, 2023
cbbcf59
new: never let spatial be none
nickumia-reisys Jul 11, 2023
4ebd9bc
update: setup.py
nickumia-reisys Jul 11, 2023
fe3650f
docs: update readme
nickumia-reisys Jul 11, 2023
51cf4d5
bump CKAN version to 2.10.1
btylerburton Jul 13, 2023
2 changes: 1 addition & 1 deletion .env
@@ -44,7 +44,7 @@ CKAN_SMTP_PASSWORD=pass
CKAN_SMTP_MAIL_FROM=ckan@localhost

# Extensions
-CKAN__PLUGINS=envvars image_view text_view recline_view datagov_harvest ckan_harvester geodatagov geodatagov_miscs z3950_harvester arcgis_harvester geodatagov_geoportal_harvester waf_harvester_collection geodatagov_csw_harvester geodatagov_doc_harvester geodatagov_waf_harvester spatial_metadata spatial_query s3test
+CKAN__PLUGINS=envvars image_view text_view recline_view datagov_harvest ckan_harvester geodatagov geodatagov_miscs z3950_harvester arcgis_harvester geodatagov_geoportal_harvester waf_harvester_collection geodatagov_csw_harvester geodatagov_doc_harvester geodatagov_waf_harvester spatial_metadata spatial_query s3test datajson datajson_harvest

# Harvest settings
CKAN__HARVEST__MQ__TYPE=redis
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -21,7 +21,7 @@ jobs:
needs: lint
strategy:
matrix:
-ckan-version: [2.9.5, 2.9, 2.9.7]
+ckan-version: ['2.10', '2.10.1']
fail-fast: false

name: CKAN ${{ matrix.ckan-version }}
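
The new matrix entries are quoted, unlike the old `2.9.5`-style entries. The PR does not say why; a likely reason, offered here as an assumption, is that an unquoted `2.10` is a YAML float and would be read as `2.1`. The sketch below imitates that resolution with plain Python float parsing instead of a real YAML parser, and `render_version` is a hypothetical helper, not part of the workflow.

```python
def render_version(raw: str, quoted: bool) -> str:
    """Mimic how a version scalar might look after YAML-style type resolution.

    Assumption: an unquoted numeric scalar is resolved to a float and later
    rendered back to text, while a quoted scalar stays a string.
    """
    if quoted:
        return raw
    try:
        # "2.10" parses as the float 2.1, so the trailing zero is lost.
        return str(float(raw))
    except ValueError:
        # Not a number (for example "2.10.1"), so it stays a string.
        return raw


if __name__ == "__main__":
    for raw, quoted in [("2.10", False), ("2.10", True), ("2.10.1", False)]:
        print(raw, quoted, render_version(raw, quoted))
```

Under this assumption, three-part versions such as `2.10.1` are never numeric and survive unquoted, which would explain why the old matrix did not need quotes for `2.9.5` and `2.9.7`.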
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-ARG CKAN_VERSION=2.9.5
+ARG CKAN_VERSION=2.10.1
FROM openknowledge/ckan-dev:${CKAN_VERSION}
ARG CKAN_VERSION

2 changes: 1 addition & 1 deletion Makefile
@@ -1,4 +1,4 @@
-CKAN_VERSION ?= 2.9.7
+CKAN_VERSION ?= 2.10.1
COMPOSE_FILE ?= docker-compose.yml

build: ## Build the docker containers
9 changes: 5 additions & 4 deletions README.md
@@ -34,11 +34,12 @@ This extension is compatible with these versions of CKAN.
CKAN version | Compatibility
------------ | -------------
<=2.8 | no
-2.9 | [complete](https://github.com/GSA/datagov-ckan-multi/issues/570)
+2.9 | 0.1.37 (last supported)
+2.10 | >=0.2.0

## Tests

-All the tests live in the [/ckanext/geodatagov/tests](/ckanext/geodatagov/tests) folder. [Github actions](https://github.com/GSA/ckanext-geodatagov/blob/main/.github/workflows/test.yml) is configured to run the tests against CKAN 2.9 when you open a pull request.
+All the tests live in the [/ckanext/geodatagov/tests](/ckanext/geodatagov/tests) folder. [Github actions](https://github.com/GSA/ckanext-geodatagov/blob/main/.github/workflows/test.yml) is configured to run the tests against CKAN 2.10 when you open a pull request.

## Using the Docker Dev Environment

@@ -61,7 +62,7 @@ To docker exec into the CKAN image, run:
### Testing

They follow the guidelines for [testing CKAN
-extensions](https://docs.ckan.org/en/2.9/extensions/testing-extensions.html#testing-extensions).
+extensions](https://docs.ckan.org/en/2.10/extensions/testing-extensions.html#testing-extensions).

To run the extension tests, start the containers with `make up`, then:

@@ -100,7 +101,7 @@ In order to support multiple versions of CKAN, or even upgrade to new versions
of CKAN, we support development and testing through the `CKAN_VERSION`
environment variable.

-$ make CKAN_VERSION=2.9 test
+$ make CKAN_VERSION=2.10 test

### Command line interface

22 changes: 1 addition & 21 deletions ckanext/geodatagov/blueprint.py
@@ -4,7 +4,7 @@
from flask import Blueprint
from flask.wrappers import Response as response

-from ckanext.geodatagov.model import MiscsFeed, MiscsTopicCSV
+from ckanext.geodatagov.model import MiscsFeed


datapusher = Blueprint('geodatagov', __name__)
@@ -18,25 +18,5 @@ def feed():
return entry.feed


-def csv(date=None):
-if date:
-entry = model.Session.query(MiscsTopicCSV) \
-.filter_by(date=date) \
-.first()
-else:
-entry = model.Session.query(MiscsTopicCSV) \
-.order_by(MiscsTopicCSV.date.desc()) \
-.first()
-if not entry or not entry.csv:
-abort(404, 'There is no csv entry yet.')
-response.content_type = 'text/csv'
-response.content_disposition = 'attachment; filename="topics-%s.csv"' % entry.date
-return entry.csv
-
-
datapusher.add_url_rule('/usasearch-custom-feed.xml',
view_func=feed)
-datapusher.add_url_rule('/topics-csv/{date}',
-view_func=csv)
-datapusher.add_url_rule('/topics-csv',
-view_func=csv)
102 changes: 2 additions & 100 deletions ckanext/geodatagov/commands.py
@@ -20,7 +20,7 @@
from ckan.plugins.toolkit import config

from ckanext.harvest.model import HarvestSource, HarvestJob
-from ckanext.geodatagov.model import MiscsFeed, MiscsTopicCSV
+from ckanext.geodatagov.model import MiscsFeed


# https://github.com/GSA/ckanext-geodatagov/issues/117
@@ -566,104 +566,6 @@ def export_group_and_tags(packages, domain='https://catalog.data.gov'):
result.append(package)
return result

-def export_csv(self, domain='https://catalog.data.gov'):
-print('export started...')
-
-# cron job
-# paster --plugin=ckanext-geodatagov geodatagov export-csv --config=/etc/ckan/production.ini
-
-# Exported CSV header list:
-# - Dataset Title
-# - Dataset URL
-# - Organization Name
-# - Organization Link
-# - Harvest Source Name
-# - Harvest Source Link
-# - Topic Name
-# - Topic Categories
-
-import io
-import csv
-
-limit = 100
-page = 1
-
-import pprint
-
-result = []
-
-while True:
-data_dict = {
-'q': 'groups: *',
-# 'fq': fq,
-# 'facet.field': facets.keys(),
-'rows': limit,
-# 'sort': sort_by,
-'start': (page - 1) * limit
-# 'extras': search_extras
-}
-
-query = logic.get_action('package_search')({'model': model, 'ignore_auth': True}, data_dict)
-
-page += 1
-# import pprint
-# pprint.pprint(packages)
-
-if not query['results']:
-break
-
-packages = query['results']
-result = result + GeoGovCommand.export_group_and_tags(packages=packages, domain=domain)
-
-if not result:
-print('nothing to do')
-return
-
-import datetime
-
-print('writing into db...')
-
-date_suffix = datetime.datetime.strftime(datetime.datetime.now(), '%Y%m%d')
-csv_output = io.StringIO()
-
-fieldnames = ['Dataset Title', 'Dataset URL', 'Organization Name', 'Organization Link',
-'Harvest Source Name', 'Harvest Source Link', 'Topic Name', 'Topic Categories']
-
-writer = csv.writer(csv_output)
-writer.writerow(fieldnames)
-
-for pkg in result:
-try:
-writer.writerow(
-[
-pkg['title'],
-pkg['url'],
-pkg['organization'],
-pkg['organizationUrl'],
-pkg['harvestSourceTitle'],
-pkg['harvestSourceUrl'],
-pkg['topic'],
-pkg['topicCategories']
-]
-)
-except UnicodeEncodeError:
-pprint.pprint(pkg)
-
-content = csv_output.getvalue()
-
-entry = model.Session.query(MiscsTopicCSV) \
-.filter_by(date=date_suffix) \
-.first()
-if not entry:
-# create the empty entry for the first time
-entry = MiscsTopicCSV()
-entry.date = date_suffix
-entry.csv = content
-entry.save()
-
-print('csv file topics-%s.csv is ready.' % date_suffix)
-return result, entry
-
# this code is defunct and will need to be refactored into cli.py
"""
def jsonl_export(self):
@@ -838,7 +740,7 @@ def update_dataset_geo_fields(self):
# iterate over all datasets

search_backend = config.get('ckanext.spatial.search_backend', 'postgis')
-if search_backend != 'solr':
+if search_backend != 'solr-bbox':
raise ValueError('Solr is not your default search backend (ckanext.spatial.search_backend)')

datasets = model.Session.query(model.Package).all()
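
The guard in this hunk can be read in isolation as the sketch below. `require_bbox_backend` is a hypothetical name, and a plain dict stands in for CKAN's `config` object (an assumption; only `.get` is used). With the `'postgis'` fallback shown in the hunk, an unset option also fails the check.

```python
def require_bbox_backend(config: dict) -> str:
    """Return the configured spatial backend, or raise if it is not 'solr-bbox'.

    `config` is a stand-in for CKAN's config object (assumption: dict-like).
    """
    # Same key and same fallback value as in the hunk above.
    backend = config.get("ckanext.spatial.search_backend", "postgis")
    if backend != "solr-bbox":
        raise ValueError(
            "ckanext.spatial.search_backend is %r, expected 'solr-bbox'" % backend
        )
    return backend


if __name__ == "__main__":
    print(require_bbox_backend({"ckanext.spatial.search_backend": "solr-bbox"}))
    try:
        require_bbox_backend({})  # option unset, so the fallback applies
    except ValueError as exc:
        print("rejected:", exc)
```

Naming the actual value in the error message is a small departure from the original message, which only says that Solr is not the default backend.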
8 changes: 5 additions & 3 deletions ckanext/geodatagov/harvesters/arcgis.py
@@ -23,6 +23,8 @@
from ckan.plugins.toolkit import add_template_directory, add_resource, requires_ckan_version
from ckan.plugins import IConfigurer

+from ckanext.geodatagov.helpers import string as custom_string
+
requires_ckan_version("2.9")


@@ -118,7 +120,7 @@ def info(self):
def extra_schema(self):
return {
'private_datasets': [ignore_empty, boolean_validator],
-'extra_search_criteria': [ignore_empty, str],
+'extra_search_criteria': [ignore_empty, custom_string],
}

def gather_stage(self, harvest_job):
@@ -287,7 +289,7 @@ def import_stage(self, harvest_object):
package_schema = logic.schema.default_update_package_schema()

tag_schema = logic.schema.default_tags_schema()
-tag_schema['name'] = [not_empty, str]
+tag_schema['name'] = [not_empty, custom_string]
package_schema['tags'] = tag_schema
context['schema'] = package_schema # TODO: user

@@ -298,7 +300,7 @@ def import_stage(self, harvest_object):
# We need to explicitly provide a package ID, otherwise ckanext-spatial
# won't be be able to link the extent to the package.
package_dict['id'] = str(uuid.uuid4())
-package_schema['id'] = [str]
+package_schema['id'] = [custom_string]

# Save reference to the package on the object
harvest_object.package_id = package_dict['id']
3 changes: 2 additions & 1 deletion ckanext/geodatagov/harvesters/waf_collection.py
@@ -13,6 +13,7 @@
) # , validate_profiles; , validate_profiles
from ckanext.harvest.model import HarvestObject
from ckanext.harvest.model import HarvestObjectExtra as HOExtra
+from ckanext.geodatagov.helpers import string


class WAFCollectionHarvester(GeoDataGovWAFHarvester):
@@ -26,7 +27,7 @@ def info(self):

def extra_schema(self):
extra_schema = super(WAFCollectionHarvester, self).extra_schema()
-extra_schema["collection_metadata_url"] = [not_empty, str]
+extra_schema["collection_metadata_url"] = [not_empty, string]
log.debug(
"Getting extra schema for WAFCollectionHarvester: {}".format(extra_schema)
)
3 changes: 2 additions & 1 deletion ckanext/geodatagov/harvesters/z3950.py
@@ -17,6 +17,7 @@
from ckan.logic.validators import boolean_validator

from ckan.plugins.toolkit import add_template_directory, add_resource, requires_ckan_version
+from ckanext.geodatagov.helpers import string

requires_ckan_version("2.9")

@@ -43,7 +44,7 @@ def info(self):

def extra_schema(self):
return {'private_datasets': [ignore_empty, boolean_validator],
-'database': [not_empty, str],
+'database': [not_empty, string],
'port': [not_empty, convert_int]}

def gather_stage(self, harvest_job):
4 changes: 4 additions & 0 deletions ckanext/geodatagov/helpers.py
@@ -61,3 +61,7 @@ def get_harvest_source_config(harvester_id):
def get_collection_package(collection_package_id):
package = p.toolkit.get_action('package_show')({}, {'id': collection_package_id})
return package


+def string(value):
+return str(value)
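
The PR does not explain why the built-in `str` is swapped for this wrapper in the harvester schemas. One plausible reading, stated here as an assumption, is that CKAN 2.10 treats a validator as an ordinary Python function and inspects its function metadata, which a built-in type such as `str` does not carry. The sketch below only demonstrates that plain-Python difference and does not call CKAN; `is_plain_function` is a hypothetical helper.

```python
def string(value):
    # Same body as the helper added in this PR.
    return str(value)


def is_plain_function(validator) -> bool:
    """Hypothetical check: does the callable expose function metadata?

    User-defined functions have a __code__ attribute; built-in types
    such as str do not.
    """
    return hasattr(validator, "__code__")


if __name__ == "__main__":
    print(is_plain_function(string))  # wrapper defined with `def`
    print(is_plain_function(str))     # built-in type
    print(string.__code__.co_argcount)
```

Whatever the exact cause, the wrapper keeps the conversion identical to `str(value)` while giving the schema a named, one-argument function to reference.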