Write documentation on how to add datasets to Community Data #78
Comments
Might this be a good time to start creating some documentation tasks and a small Help section on the main site? Of course, we could also do with fixing any interface label text that's not clearly explaining how to do this task. |
Yes, some help text on the add dataset screen would be good. In the meantime, any clues so I can actually add something to COD would be welcome. We need to fix this so test users can populate the site. |
@GilesGibson Could you paste the exact dataset url you're using here please and I'll take a look. |
@GilesGibson when you add a dataset to CKAN, are you able to see the data rendered as a table properly? If not, it's not being parsed properly inside CKAN (which is required for DU to read it). |
The instructions for adding files from CKAN are on Slack (step 4 can be swapped to our new url on COD): https://communityopendata.slack.com/files/kev/F02ES54BS/adding_a_ckan_datatable_via_data_unity Looking at the Lambeth Bus Stops dataset, it doesn't look like it's a CSV file, and it can't be previewed in CKAN, so it won't be addable to DU in its current state. |
@pmackay when you downloaded all the datasets from the Lambeth web site and uploaded them, were they CSV files? I'm not able to preview them in CKAN. @dataunity thanks for the text guide. @djwesto can we add most of this to the add dataset page as a guide for users? A screenshot of the CKAN example showing where to preview a CSV file would also help. |
Most but not all are CSV files. The rest are .json files. The extension should show that. I don't know about the quality of the CSV data though. |
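Since the quality of the CSV files is an open question here, a rough local check before uploading might help. The sketch below is only an assumption about what "good enough" means (a header with at least two columns, and every row matching the header width); it is not what CKAN's previewer actually does, and `check_csv_text` is a hypothetical helper name.

```python
import csv
import io


def check_csv_text(text):
    """Return a list of problems found in CSV text; an empty list means none found."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    # A single-column "header" often means the file is not really comma-separated.
    if len(header) < 2:
        problems.append("header has fewer than two columns")
    # Every data row should have the same number of fields as the header.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number} has {len(row)} fields, expected {len(header)}"
            )
    return problems
```

A file that passes this check may still fail to preview in CKAN for other reasons, so treat an empty result as a first filter only.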
Google being Google is not showing easily what type of file it is apart from saying it is a spreadsheet. Is there an easy way of finding out? Details don't show this, when I click on it Google just shows it as a spreadsheet, cannot figure out how to get it to show the filetype. |
@GilesGibson to get the file as a CSV file from Google Spreadsheets you can open the file on Google Drive, then go "File -> Download as -> Comma-separated values". This should give you a CSV file you can upload to CKAN, then follow the loading instructions on Slack to get it onto COD (https://communityopendata.slack.com/files/kev/F02ES54BS/adding_a_ckan_datatable_via_data_unity). |
Ah, Google wasn't offering me CSV format. However, under "other file formats" it seems to download as CSV. That leaves us in the situation where all the files that @pmackay carefully extracted from the Lambeth web site and stored on Google Drive will need to be downloaded from Google Drive one by one and then uploaded again to the CKAN site. Is this the only way? Can CKAN not simply refer to Google Drive and save all the duplicate work? |
I have the files as CSV on my computer. But Google automatically converts them when uploaded. |
How frustrating. Is there any workaround? Otherwise there is going to be loads of extra work to get these files onto CKAN. Dropbox them over? |
We're using CSV because it's an open format. Open Data should be in an open format for lots of reasons: it prevents vendor lock-in and needs no special tools to view or edit the files. I don't know of any bulk way to upload to CKAN (but there might be). I think each file will need its own metadata added though, so I'm not sure it would help in this case? |
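On the bulk-upload question, one possible approach is to prepare the per-file metadata locally and then send each entry to CKAN's action API. The sketch below only covers the preparation step. It assumes the field names `package_id`, `name` and `format` used by CKAN's `resource_create` action; the dataset id, the folder, the title-from-filename rule and the extra `file` key are all hypothetical. Nothing is uploaded, because that step depends on the server address and an API key.

```python
from pathlib import Path


def build_resource_payloads(folder, package_id):
    """Build one metadata dict per CSV file in `folder`. Nothing is uploaded."""
    payloads = []
    for path in sorted(Path(folder).glob("*.csv")):
        payloads.append(
            {
                # Hypothetical id of the CKAN dataset the file belongs to.
                "package_id": package_id,
                # Assumed convention: derive a readable title from the file name.
                "name": path.stem.replace("-", " ").replace("_", " ").title(),
                # Setting the format explicitly so CKAN can treat it as CSV.
                "format": "CSV",
                # Local bookkeeping only (not a CKAN field): the file to attach.
                "file": str(path),
            }
        )
    return payloads
```

Each file would still need its description and other metadata reviewed by hand, so this saves typing rather than judgement.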
I realise that csv is the one to use. Just frustrating that all the csv files exist after downloading from Lambeth web site and now we cannot get a folder with them all in that is shared, google trying to be clever and getting in the way. All I want is a URL to give to CKAN without having to repeat all the previous work. |
OK, still stuck and missing the obvious. All Lambeth files have been extracted from Lambeth web site via a clever utility that Paul wrote. Uploaded to Google drive for all to see. Unfortunately Google converts them all to the Google spreadsheet format. If you download them to your local drive as a csv then CKAN will not let you put your local drive in as the source. Seems we still need a web site that we can store csv files on so that we can then get them to CKAN so that COD can get them. All very confusing unless I am missing something obvious. |
Is it possible to cut out the middleman (Google Spreadsheet)? Would it be possible for @pmackay to email the CSV files from his computer to @GilesGibson? It doesn't look like there's any extra metadata in the Google spreadsheets, so I don't think we'll lose any info. |
I realise that the metadata will have to be added one by one. Still haven't got to that situation yet and stuck on not being able to add a dataset that Lambeth created to Lambeth CKAN. It is just my lack of use of a CKAN site and how it works I think. |
@GilesGibson I've uploaded a zip file to https://drive.google.com/?authuser=0#folders/0B_wNdTyma3n1UjV4eTRBZWVURjA containing all the files. Can you access it? |
yes, I can access that, 125 files, many thanks. Where can I put them so that CKAN can refer to them via a URL or is there a way of uploading to CKAN? Maybe I missed that option. |
Hmm, CKAN has me beaten. I have now tried uploading a CSV file to CKAN. Whenever I try to preview it on CKAN it just triggers a download. The add dataset option within COD still returns error 500. |
When I uploaded a file to CKAN I think I put CSV for the 'tags' option. This seemed to make CKAN recognise it as a CSV file. I don't know if that's the official way to do it though. I've just tried editing the existing Lambeth GP Surgeries. I changed format to CSV and that seemed to have fixed things: http://5.101.100.119/dataset/lambeth-gp-surgeries/resource/56281b66-fc71-4e34-ba82-31e192f98f3a |
That looks good - can see it shows the table of values. Can you import that |
@dataunity Thanks for looking at this. When I went to change the format it didn't allow anything, just blanked it out. |
Yes - was blank for me too. I just tried typing in 'csv' to see if it did anything. |
I am still stuck. I have managed to upload another 3 datasets to CKAN but cannot get them into COD. I just get the error 500 every time I try. What does this error message mean? Is the data poor? Is the url wrong? Some guidance or help for users would be useful. |
@GilesGibson have you been following the instructions mentioned above on Slack (https://communityopendata.slack.com/files/kev/F02ES54BS/adding_a_ckan_datatable_via_data_unity)? They should provide the guidance. Can you paste the urls you are trying to enter into the system here? |
I think I have been following them. This is one url from the box that sits above the preview. It previews OK in CKAN, it just won't go across. |
@GilesGibson it's the third bullet point in the instructions which is the key one here:
So you take the url of the page where you can see the spreadsheet in CKAN, rather than the url of the CSV file itself. This lets Data Unity find the metadata about the dataset as well as just the CSV data. So for your example you would use this url: |
Ah, I had read that differently. Maybe we could add the instructions to the "Add dataset" page and state that it is the address in the browser bar, not the one above the CSV file. Currently it is trying to create a vis but is hanging there. Is it having difficulty translating from eastings/northings to lat/long? |
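The distinction discussed here (the browser address of the CKAN resource page versus the link to the CSV file itself) could be checked before a URL is submitted. This is a minimal sketch, assuming resource pages follow the `/dataset/<name>/resource/<id>` shape of the Lambeth GP Surgeries link earlier in the thread; `is_resource_page_url` is a hypothetical helper, not part of CKAN or Data Unity.

```python
import re
from urllib.parse import urlparse

# Path shape of a CKAN resource page, as seen in the Lambeth GP Surgeries link:
# /dataset/<dataset-name>/resource/<36-character resource id>
RESOURCE_PAGE_PATH = re.compile(r"^/dataset/[^/]+/resource/[0-9a-fA-F-]{36}/?$")


def is_resource_page_url(url):
    """Return True if `url` looks like a CKAN resource page, not a file link."""
    return bool(RESOURCE_PAGE_PATH.match(urlparse(url).path))
```

A check like this could back a friendlier message on the "Add dataset" page than a bare error 500 when a file link is pasted by mistake.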
Tried doing a graph with the betting shop dataset. I wanted it to total each occurrence (y-axis) per ward (x-axis). Still waiting on the build. |
I've clarified the instructions on Slack. The process might take a little while depending on the size of the data. Small datasets take 10-15 secs, larger ones (like the Police data) take a minute or two. The process won't convert eastings/northings to lat/long. This needs the Data Unity data cleaning interface, but there has been no time to create that yet. |
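The totals wanted for that graph (number of occurrences per ward) can be worked out directly from the CSV as a cross-check. This is a minimal sketch, assuming the file has a column called `Ward`; the column name and the sample rows used to try it out are made up, not taken from the betting shop dataset.

```python
import csv
import io
from collections import Counter


def count_per_ward(csv_text, ward_column="Ward"):
    """Count how many rows fall in each ward."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip stray whitespace so "Oval" and "Oval " count as the same ward.
    return Counter(row[ward_column].strip() for row in reader)
```

The resulting counts map each ward (x-axis) to its total (y-axis), which is the shape the chart needs.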
Thanks for updating the instructions. @djwesto can we add in these instructions to the "add dataset" page where the DU widget lives? |
Progress so far. Most of the Lambeth datasets are not lat/long. I sent a request to @pmackay asking if a few can be converted. I think I have managed to upload the Jan 2014 crime stats to the CKAN site. However, it has errors displaying it, so the browser URL will not work when pasted into COD. The CKAN site is now giving 504 errors when I try to explore the data. Leeds Data Mill: I tried to get the locations of gambling licence premises into COD. It isn't offered as a CSV (only XML with schema), so I assume it isn't possible - @dataunity? |
@GilesGibson - that's right, not possible to parse XML. Generally every XML file has a different hierarchical structure which means they need customised parsers to extract data for visualisation. Perhaps you could leave a comment at the bottom of the dataset page on Leeds Data Mill asking if they could release as CSV? |
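To illustrate why XML needs a customised parser, here is a small sketch that flattens one particular layout into CSV. The element names (`premises`, `name`, `postcode`) are hypothetical and not taken from the Leeds Data Mill file; a file with a different hierarchy would need different code, which is the point being made above.

```python
import csv
import io
import xml.etree.ElementTree as ET


def premises_xml_to_csv(xml_text):
    """Flatten <premises> elements into CSV rows (the layout is hypothetical)."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "postcode"])
    # The tag names below are tied to this one layout and must be
    # rewritten for any other XML structure.
    for premises in root.iter("premises"):
        writer.writerow(
            [
                premises.findtext("name", default=""),
                premises.findtext("postcode", default=""),
            ]
        )
    return out.getvalue()
```

Asking the publisher for a CSV release remains the simpler route, since it avoids maintaining one converter per XML source.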
We've been adding datasets over the last few days, so closing this one. |
Add dataset page - still needs the text guide instructions for adding new datasets. @djwesto ideally a nice bit of text with some helpful pickies. I am keen to make this process as easy as possible, within the boundaries that it is a quickie bit of code doing it all. |
So is this issue getting assigned to me to write some documentation? |
combo effort? |
@djwesto there's already some documentation on Slack if it helps: https://communityopendata.slack.com/files/kev/F02ES54BS/adding_a_ckan_datatable_via_data_unity |
OK, I've finally managed to put a guide to adding datasets together here: http://live-communitydata.gotpantheon.com/adding-your-own-data Please can you all check through the process and make sure it's all correct? I think there are some weird problems with CKAN's metadata screens, particularly the fact that two concurrent screens both request a Title and Description for your datasets... Seems overkill so I just ignored it in the guide. Reassigning to @GilesGibson |
If we now have the ability to add docs etc., will these go via the CKAN server as well? If so, the guide needs to reflect that. |
No, the docs support has nothing to do with CKAN. |
Apologies, but I am having severe problems trying to figure out how to add a dataset to COD. I believe that I have to add it to the Lambeth CKAN site first. I have managed to upload the Lambeth bus stops location datafile (taken from the Google shared drive of datasets). I think it is there, but I am unsure how I can make it appear under the Lambeth organisation's datasets rather than just on the list I have under my login.
I copy the URL info from the CKAN field and then go to COD, choose the add dataset option, click on the DU "go" and paste in the URL. I keep on getting error 500 messages and nothing more.
I would like to know what I am doing wrong and also how we could document the process so that users of COD could know the process of adding datasets.