Open-source code for data-driven horizon scanning
Innovation Sweet Spots is an experimental, data-driven horizon scanning project, led by Nesta's Discovery Hub. Read more about our motivation on Medium, and check out our first report on green technologies.
We are building upon Nesta's Data Analytics Practice expertise and previous work on innovation mapping, leveraging data science and machine-learning methods to track the trajectory of innovations and technologies for social good.
By combining insights across several large datasets that are commonly only analysed in isolation, we paint a multi-dimensional picture of these innovations, indicating the resources they are attracting and how they are perceived.
NB: The codebase and these guidelines are still under development, and several parts of the analysis are presently being refactored into modules.
Step 1. Check that you meet the data science cookiecutter requirements. In brief, you should:
- Install the required components, notably conda and git-crypt (both are used in the steps below)
- Have a Nesta AWS account, and install and configure your AWS Command Line Interface
Step 2. Run the following command from the repo root folder:
make install
This will configure the development environment:
- Set up the conda environment named `innovation_sweet_spots`
- Configure pre-commit actions (for example, running a code formatter before each commit)
- Configure metaflow
The expected output:
conda env create -q -n innovation_sweet_spots -f environment.yaml
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
/Library/Developer/CommandLineTools/usr/bin/make -s pip-install
source bin/conda_activate.sh && conda_activate && pre-commit install --install-hooks
pre-commit installed at .git/hooks/pre-commit
source bin/conda_activate.sh && conda_activate && /bin/bash ./bin/install_metaflow_aws.sh
INSTALL COMPLETE
Step 3. Activate the newly created conda environment and you're good to go!
$ conda activate innovation_sweet_spots
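As a quick sanity check, you can try importing the package; this assumes the `pip-install` step shown in the expected output above installed `innovation_sweet_spots` into the environment.

$ python -c "import innovation_sweet_spots"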
To uncover research, investment and public discourse trends, we are presently using the following data:
- Gateway to Research (GtR): Research projects funded by UKRI
- Crunchbase: Global company directory
- The Guardian news: to the best of our knowledge, the only major UK newspaper to make its text freely available for research.
- Hansard: Records of parliamentary debates
All of these datasets except Crunchbase are freely available. Note, however, that this project accesses the largest datasets (namely GtR and Crunchbase) via our internal Nesta database, and the corresponding data getters are therefore intended for internal use.
In the future, we might add other datasets to our approach.
Data access guidelines
To download the GtR and Crunchbase datasets from the Nesta database, you will first need to decrypt the config files (if you don't have the key, reach out to Karlis).
$ git stash
$ git-crypt unlock /path/to/key
The most recent versions of the Gateway to Research (GtR) and Crunchbase datasets can then be fetched by running the command below. Note that you need to be connected to Nesta's VPN when accessing the database.
$ python innovation_sweet_spots/pipeline/fetch_daps1_data/flow.py --no-pylint --environment=conda run
We are using the Guardian API to search for articles mentioning specific key terms. To access the API, you'll need to proceed as follows:
- Request an API key from the Guardian website (see here)
- Store it somewhere safe on your local machine (outside the repo) in a `.txt` file
- Specify the path to this file in the `.env` file, by adding a new line with `export GUARDIAN_API_KEY=path/to/file`
- Use the functions in `innovation_sweet_spots.getters.guardian` (a minimal sketch follows below)
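For illustration only, here is a minimal, hypothetical sketch of how the key file and the `GUARDIAN_API_KEY` variable fit together, assuming the public Guardian Content API search endpoint and the `requests` library; it is not the project's own getter code (that lives in `innovation_sweet_spots.getters.guardian`), and the search term is just a placeholder.

```python
import os
from pathlib import Path

import requests

# The GUARDIAN_API_KEY variable (set in .env) points to the key file, not the key itself
api_key = Path(os.environ["GUARDIAN_API_KEY"]).read_text().strip()

# Query the Guardian Content API search endpoint for a key term (placeholder query)
response = requests.get(
    "https://content.guardianapis.com/search",
    params={"q": "heat pumps", "api-key": api_key, "page-size": 10},
)
response.raise_for_status()

# Print the publication date and headline of each matching article
for article in response.json()["response"]["results"]:
    print(article["webPublicationDate"], article["webTitle"])
```

Keeping the key in a file outside the repo and referencing it via `.env` means the secret never ends up in version control.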
To see examples of using our public discourse analysis tools, check `innovation_sweet_spots/analysis/examples/public_discourse_analysis`.
Coming soon...
Technical and working style guidelines
Project based on Nesta's data science project template (Read the docs here).