To work effectively during this course, you need to take the following steps:
If you haven't already done so, fill in the Google form to let us know your skill levels. We will then assign you to a group of at most three people based on your level of knowledge in programming and (bio)medicine.
The first step is to create a GitHub account for yourself. You will need this to share your code and final product later on. You can create an account here. If you need more help, please check this document.
With your group, you will now need to create a GitHub repository. A repository is a place where you will store all files related to your project. More information about repositories can be found here.
A tutorial on how to create such a repository can be found here. Make this repository public and add your teammates as members/owners. Also add all four teachers to your repository as members/owners. See the teacher section above for their GitHub usernames.
Other relevant resources for using GitHub that may come in handy:
- An Intro to Git and GitHub for Beginners (Tutorial)
- Git and GitHub for Beginners - Crash Course (video)
The Semantic Web is an extension of the World Wide Web that makes data on the web machine-readable and therefore machine-actionable. The Resource Description Framework (RDF) is the core data model on which the Semantic Web is built. In RDF, every statement is a triple consisting of a subject, a predicate, and an object. The goal of this course is to use publicly available Semantic Web data to answer a research question, which requires a working knowledge of RDF. Here are a few resources that will help you understand RDF:
- Introduction to Semantic web (video)
- Learn RDF
- RDF Tutorial (video)
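To make the idea of an RDF statement concrete, here is a minimal sketch in JavaScript. It stores one statement about Douglas Adams (Wikidata item Q42) as a subject, predicate, and object, and prints it as a line in the N-Triples format. The names `PREFIXES`, `triples`, `expand`, and `toNTriples` are made up for this illustration; real projects would normally use an RDF library instead.

```javascript
// Prefixes as used by Wikidata: "wd" for items, "wdt" for direct properties.
const PREFIXES = {
  wd: "http://www.wikidata.org/entity/",
  wdt: "http://www.wikidata.org/prop/direct/",
};

// One RDF statement (a "triple"): subject, predicate, object.
// Q42 is Douglas Adams, P31 is "instance of", Q5 is "human".
const triples = [
  { subject: "wd:Q42", predicate: "wdt:P31", object: "wd:Q5" },
];

// Expand a prefixed name such as "wd:Q42" into a full IRI.
// (Hypothetical helper, only handles the two prefixes above.)
function expand(prefixedName) {
  const [prefix, localName] = prefixedName.split(":");
  return PREFIXES[prefix] + localName;
}

// Serialize a triple as one N-Triples line: <subject> <predicate> <object> .
function toNTriples({ subject, predicate, object }) {
  return `<${expand(subject)}> <${expand(predicate)}> <${expand(object)}> .`;
}

console.log(triples.map(toNTriples).join("\n"));
```

Read aloud, the statement says "Douglas Adams is an instance of human". Every fact in an RDF data source is stored as a statement of this shape.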
To query RDF data, you use the query language SPARQL. Here are some resources on SPARQL in general. Later on, you will use SPARQL with one of the data sources recommended for this project.
- SPARQL in 11 minutes (video)
- Introduction to SPARQL
- SPARQLing Biology: a beginners course
- Using SPARQL to query Life Science Databases
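At its core, a SPARQL query is a set of triple patterns in which some positions are variables (written with a leading `?`). The sketch below imitates that idea in plain JavaScript on a tiny in-memory list of triples. It is not a SPARQL engine, only an illustration of the matching principle; the `ex:` identifiers, the `graph` data, and the `matchPattern` function are all hypothetical.

```javascript
// A tiny in-memory "graph" of triples with made-up example identifiers.
const graph = [
  { s: "ex:aspirin", p: "ex:treats", o: "ex:headache" },
  { s: "ex:ibuprofen", p: "ex:treats", o: "ex:headache" },
  { s: "ex:aspirin", p: "ex:type", o: "ex:Drug" },
];

// Terms starting with "?" are treated as variables, as in SPARQL.
const isVariable = (term) => term.startsWith("?");

// Match one triple pattern against the graph.
// Returns one binding object (variable name -> value) per matching triple.
function matchPattern(pattern, triples) {
  const results = [];
  for (const triple of triples) {
    const binding = {};
    let matches = true;
    for (const position of ["s", "p", "o"]) {
      const term = pattern[position];
      if (isVariable(term)) {
        // A variable used twice in one pattern must bind to the same value.
        if (term in binding && binding[term] !== triple[position]) {
          matches = false;
          break;
        }
        binding[term] = triple[position];
      } else if (term !== triple[position]) {
        // A fixed term must be identical to the value in the triple.
        matches = false;
        break;
      }
    }
    if (matches) results.push(binding);
  }
  return results;
}

// Roughly equivalent to: SELECT ?drug WHERE { ?drug ex:treats ex:headache }
const drugs = matchPattern({ s: "?drug", p: "ex:treats", o: "ex:headache" }, graph);
console.log(drugs);
```

A real SPARQL query combines several such patterns and adds features like filters, ordering, and limits, but the basic step is the same: find all values for the variables that make the pattern match the data.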
Open data resources on the internet should be your main source of data and information in this course. Wikidata is one of these, and it also holds a large amount of information about biology and medicine.
We therefore recommend Wikidata as your primary data source, but you are free to choose another open data source.
For those unfamiliar with Wikidata, there are several resources available to learn more:
- Wikidata:Introduction
- Example Wikidata entry: Douglas Adams
- Extensive tutorial on Wikidata basics (video; also covers the Wikidata Query Service, which we will discuss next)
- History of Wikidata
To extract data from Wikidata (or other RDF resources), we need to use SPARQL through a query service. The Wikidata Query Service can be found here. On this website you can write a query in SPARQL, run it, and retrieve the data you need. To become familiar with SPARQL and Wikidata, we recommend the following tutorials and resources:
- Wikidata query helper
- Wikidata SPARQL tutorial
- Intro to querying with Wikidata
- How to use the Wikidata query service (video)
- Wikidata SPARQL Query tutorial (video)
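Besides the web interface, the Wikidata Query Service can also be called from your own code, which is what your final webpage will need to do. The following is a minimal sketch, assuming an environment where `fetch` is available (a modern browser or a recent Node.js). The example query asks for ten items that are an instance of (P31) house cat (Q146); the function names `buildQueryUrl` and `runQuery` are our own and not part of any library.

```javascript
// Public SPARQL endpoint of the Wikidata Query Service.
const ENDPOINT = "https://query.wikidata.org/sparql";

// Example query: ten items that are an instance of (P31) house cat (Q146),
// together with their English labels.
const query = `
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10`;

// Build the request URL; URLSearchParams takes care of URL-encoding the query.
function buildQueryUrl(sparql) {
  const params = new URLSearchParams({ query: sparql, format: "json" });
  return `${ENDPOINT}?${params}`;
}

// Send the query and return the list of result rows ("bindings").
// Each row maps a variable name to an object with a "value" field.
async function runQuery(sparql) {
  const response = await fetch(buildQueryUrl(sparql), {
    headers: { Accept: "application/sparql-results+json" },
  });
  if (!response.ok) {
    throw new Error(`Query failed with HTTP status ${response.status}`);
  }
  const data = await response.json();
  return data.results.bindings;
}
```

A possible way to use it is `runQuery(query).then((rows) => rows.forEach((row) => console.log(row.itemLabel.value)));`. It is a good habit to develop and test a query in the web interface first, and only then paste it into your code.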
Visualizing the output of your Wikidata (or other) query is crucial in this course. The visualization will be done with JavaScript, so you will need a basic knowledge of the language. Here are some tutorials to help you get started.
In this course we will not ask you to create a visualization in JavaScript from scratch. Instead, we recommend that you use D3.js. This is a library that takes data, such as the results you retrieve through a SPARQL query service, and turns it into a visualization. The D3.js website offers extensive documentation to help you create visualizations that suit your needs. It also provides a large collection of ready-to-use visualization examples. Use these to your advantage.
Here are some additional examples of D3.js:
- An introduction to d3.js in 10 basic examples
- Over 1000 D3.js Examples and Demos
- The Big List of D3.js Examples
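To show how query results and D3.js fit together, here is a minimal sketch of a bar chart. It assumes that D3 (version 6 or later) has been loaded on the page, that the page contains an `<svg id="chart">` element, and that the query returned the variables `itemLabel` and `count`. The sample values and the function names `toRows` and `drawBarChart` are invented for this illustration.

```javascript
// Sample input shaped like SPARQL JSON results; the values are made up.
const sampleBindings = [
  { itemLabel: { type: "literal", value: "A" }, count: { type: "literal", value: "12" } },
  { itemLabel: { type: "literal", value: "B" }, count: { type: "literal", value: "30" } },
  { itemLabel: { type: "literal", value: "C" }, count: { type: "literal", value: "7" } },
];

// Flatten SPARQL result rows into plain objects that are easy to plot.
// Values arrive as strings, so the count is converted to a number.
function toRows(bindings) {
  return bindings.map((binding) => ({
    label: binding.itemLabel.value,
    count: Number(binding.count.value),
  }));
}

// Draw a simple vertical bar chart into an existing <svg> element.
// The D3 library object is passed in as the first argument.
function drawBarChart(d3, rows, selector = "#chart", width = 400, height = 200) {
  // Horizontal scale: one band per label.
  const x = d3.scaleBand()
    .domain(rows.map((row) => row.label))
    .range([0, width])
    .padding(0.1);

  // Vertical scale: from zero up to the largest count (SVG y grows downward).
  const y = d3.scaleLinear()
    .domain([0, d3.max(rows, (row) => row.count)])
    .range([height, 0]);

  d3.select(selector)
    .attr("width", width)
    .attr("height", height)
    .selectAll("rect")
    .data(rows)
    .join("rect")
    .attr("x", (row) => x(row.label))
    .attr("y", (row) => y(row.count))
    .attr("width", x.bandwidth())
    .attr("height", (row) => height - y(row.count))
    .attr("fill", "steelblue");
}
```

On a page where D3 is available as the global `d3`, calling `drawBarChart(d3, toRows(sampleBindings))` would draw three bars. The chart has no axes or labels yet; the example collections listed above show how to add those.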
Finally, the visualization needs to be placed inside a webpage, which will be the end product you create during this course. This end product will most likely use HTML. Here are a few resources to help you get started with HTML.