MichaelEncalada/Taller_12_2023

 
 

Workshop on Web Scraping and Text Information Retrieval

Overview

This workshop focuses on using Python programming tools for collaborative development, web data extraction, and information retrieval from documents. Participants will learn to:

  1. Use GitHub for collaborative projects.
  2. Develop effective Web Scraping tools.
  3. Retrieve and process relevant information from text documents.

You can find the workshop syllabus here.

Repository Structure

  • sessions: Contains workshop sessions. It includes subfolders:

    • 1. Web Scraping: Includes Jupyter notebooks for classes.
    • 2. Text Information Retrieval: Includes Jupyter notebooks for classes.
    • data: Stores data used in the workshop.
  • proposals: Folder where students upload their proposals for applying the tools covered in the workshop (web scraping and text information retrieval). Each student should create a folder named after themselves, using the format branch_{name}.
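
As an illustration of the branch_{name} convention, here is a minimal sketch of a helper that builds the folder path for a student's proposal. The function names `proposal_dir` and `create_proposal_dir` are hypothetical, and lowercasing the name and replacing spaces with underscores is an assumption; the repository only specifies the `branch_` prefix.

```python
from pathlib import Path


def proposal_dir(name: str, root: Path = Path("proposals")) -> Path:
    """Return the proposal folder path for a student, e.g. proposals/branch_ana."""
    # Assumption: lowercase the name and join its words with underscores.
    # The README only fixes the 'branch_' prefix, not how {name} is written.
    slug = "_".join(name.strip().lower().split())
    return root / f"branch_{slug}"


def create_proposal_dir(name: str, root: Path = Path("proposals")) -> Path:
    """Create the proposal folder if it does not exist yet and return its path."""
    path = proposal_dir(name, root)
    path.mkdir(parents=True, exist_ok=True)
    # Git does not track empty directories, so add a placeholder file.
    (path / ".gitkeep").touch()
    return path
```

Calling `create_proposal_dir("Ana Perez")` from the repository root would create `proposals/branch_ana_perez` under these assumptions.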

Synchronous Sessions

The link for the synchronous sessions is here.

  • Meeting ID: 952 2258 5367
  • Access code: 636549

Recordings

The recordings are available in a YouTube playlist here. Log in with your PUCP account to view them.

Working on the Cloud for the Text Information Retrieval Section

Working locally on the Text Information Retrieval section requires installing and managing several dependencies. If you prefer a cloud-based approach for this section, Colab-adapted scripts are available. You can access them here. To use these scripts, download the folder, unzip it, and upload it to your Google Drive. If you decide to work locally instead, follow the instructions provided in the notebooks for session 3 and session 4.
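
Once the folder is on Google Drive, a Colab notebook needs to mount Drive before it can read the files. The following is a minimal sketch, not the workshop's own setup code: the function `workshop_root` and the folder name `workshop_colab` are hypothetical placeholders, and falling back to the current directory outside Colab is an assumption made so the same cell also runs locally.

```python
from pathlib import Path


def workshop_root(folder_name: str = "workshop_colab") -> Path:
    """Return the workshop folder: on Google Drive inside Colab, otherwise local."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        # Assumption: outside Colab, the folder sits in the current directory.
        return Path.cwd() / folder_name
    # Mounting prompts for authorization the first time it runs in a session.
    drive.mount("/content/drive")
    # Files uploaded to the top level of Drive appear under MyDrive.
    return Path("/content/drive/MyDrive") / folder_name


root = workshop_root()
print(f"Workshop files expected at: {root}")
```

Replace `workshop_colab` with the actual name of the folder you uploaded.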

Instructor Information

  • Instructor Name: Josue Caldas
  • GitHub Profile: GitHub
