Workshop on Web Scraping and Text Information Retrieval

Overview

This workshop covers Python tools for collaborative development, web data extraction, and information retrieval from text documents. Participants will learn to:

- Use GitHub for collaborative projects.
- Develop effective web scraping tools.
- Retrieve and process relevant information from text documents.

You can find the workshop syllabus here

Repository Structure

- sessions: Contains the workshop sessions, in three subfolders:
  - 1. Web Scraping: Jupyter notebooks for the classes.
  - 2. Text Information Retrieval: Jupyter notebooks for the classes.
  - data: Data used in the workshop.
- proposals: Where students upload proposals for applying the tools learned in the workshop (web scraping and text information retrieval). Each student should create a folder named after themselves, following the format branch_{name}.
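
As an illustration of the naming convention above, here is a minimal sketch that creates a proposal folder from Python. The function name `proposal_folder` and the choice to replace spaces with underscores are assumptions made for this example; the only thing fixed by the workshop is the `branch_{name}` format.

```python
from pathlib import Path


def proposal_folder(name: str, repo_root: Path = Path(".")) -> Path:
    """Create proposals/branch_{name} under repo_root and return its path."""
    # Assumption: spaces in the name become underscores. The workshop only
    # specifies the "branch_" prefix, not how the name itself is written.
    safe_name = "_".join(name.split())
    folder = repo_root / "proposals" / f"branch_{safe_name}"
    # exist_ok=True makes the call safe to repeat if the folder already exists.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

Calling `proposal_folder("maria perez")` from the repository root would create `proposals/branch_maria_perez` (the student name here is hypothetical).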

Synchronous Sessions

The link for the synchronous sessions is here

Meeting ID: 952 2258 5367
Access code: 636549

Recordings

The recordings are collected in a YouTube playlist, linked here. You can log in with your PUCP account to view them.

Working on the Cloud for the Text Information Retrieval Section

Working locally on the Text Information Retrieval section requires installing and managing multiple dependencies. If you prefer a cloud-based approach for this section, Colab-adapted scripts are available here. To use them, download the folder, unzip it, and upload it to your Google Drive. If you decide to work locally instead, follow the instructions in the notebooks for session 3 and session 4.
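
Once the folder is in Google Drive, a Colab notebook needs Drive mounted before it can read the files. The sketch below shows one way to do that. The folder name `text_information_retrieval_colab` and the helper `workshop_dir` are hypothetical, and the sketch assumes the folder was uploaded to the top level of My Drive; the Colab-adapted scripts may already handle this step themselves.

```python
from pathlib import Path

# Assumption: hypothetical folder name; replace it with the name of the
# folder you actually uploaded to the top level of My Drive.
FOLDER_NAME = "text_information_retrieval_colab"


def workshop_dir(folder_name: str = FOLDER_NAME) -> Path:
    """Mount Google Drive when running in Colab and return the folder's path."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        # Outside Colab: assume the folder sits in the current directory.
        return Path.cwd() / folder_name
    # Colab prompts for authorization the first time Drive is mounted.
    drive.mount("/content/drive")
    return Path("/content/drive/MyDrive") / folder_name
```

Because the function falls back to the current directory outside Colab, the same notebook cell can also be run locally without changes.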

Instructor Information

Instructor Name: Josue Caldas
GitHub Profile:

Repository: MichaelEncalada/Taller_12_2023
