UIUC Research Park

Date: July 24, 2019
Time: 9:00 am - 1:00 pm
Venue: UIUC EnterpriseWorks Room 130
Please find the slides at: https://socialmediaie.github.io/tutorials/UIUC2019/Tutorial-Slides-UIUC-Research-Park-24_07_2019.pdf
Contact: Shubhanshu Mishra at https://twitter.com/TheShubhanshu

Tutorial description

This will be a 3-hour tutorial session using Python-based, open source tools. The tutorial will be structured as follows:

Introduction (15 mins)

Familiarize participants with various IE tasks for tweets, e.g.:

Sequence tagging: named entity detection and classification, part-of-speech tagging, chunking, and super-sense tagging.
Text classification: sentiment prediction, sarcasm detection, and abusive content detection.
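To make the difference between the two task families concrete, here is a minimal sketch, not taken from the tutorial notebooks. Sequence tagging assigns one label per token, while text classification assigns one label per tweet. The sketch assumes the common BIO tagging convention; the example tokens, the tag names (`PERSON`, `ORG`), and the helper `bio_to_spans` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans = []
    current_tokens: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # An "O" tag (or an inconsistent I- tag) closes the open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical tagger output for one tweet: one tag per token.
tokens = ["Shubhanshu", "Mishra", "is", "presenting", "at", "UIUC", "today"]
tags = ["B-PERSON", "I-PERSON", "O", "O", "O", "B-ORG", "O"]
entities = bio_to_spans(tokens, tags)

# A text classifier, by contrast, returns a single label for the whole tweet.
tweet_label = {"text": " ".join(tokens), "sentiment": "neutral"}  # illustrative only
```

Here `entities` holds the spans recovered from the per-token tags, while `tweet_label` shows the shape of a per-tweet prediction.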

Applications of information extraction (15 mins)

This includes:

Query-based search on text corpora.
Visualizing temporal trends in information.
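As a rough illustration of both applications, the sketch below filters tweets by an extracted entity and then counts matching tweets per day, which is the kind of series a temporal trend plot would show. The record layout (`created_at`, `entities`), the sample data, and the helper names `search` and `daily_counts` are assumptions made for this example, not the format used in the tutorial.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List

# Hypothetical records: tweets that have already been run through an entity tagger.
records = [
    {"created_at": "2019-07-22T09:15:00", "entities": ["UIUC", "Python"]},
    {"created_at": "2019-07-22T18:40:00", "entities": ["UIUC"]},
    {"created_at": "2019-07-23T12:00:00", "entities": ["Python"]},
    {"created_at": "2019-07-24T09:05:00", "entities": ["UIUC", "EnterpriseWorks"]},
]


def search(records: List[dict], query: str) -> List[dict]:
    """Return records whose extracted entities include the query (case-insensitive)."""
    wanted = query.lower()
    return [r for r in records if wanted in (e.lower() for e in r["entities"])]


def daily_counts(records: List[dict], entity: str) -> Dict[str, int]:
    """Count tweets mentioning the entity per calendar day, sorted by date."""
    counts = Counter()
    for record in search(records, entity):
        day = datetime.fromisoformat(record["created_at"]).date().isoformat()
        counts[day] += 1
    return dict(sorted(counts.items()))


uiuc_trend = daily_counts(records, "uiuc")
```

The resulting dictionary maps each day to a count and can be passed to any plotting library to draw the trend line.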

Responsible and compliant data use of tweets (15 mins)

Overview of available annotated tweet datasets.
Clarification of terms of service, regulations such as privacy policies, and norms for working with tweets.

Break (15 mins)

Hands on session (1 hr. 30 mins)

Set up Google Colaboratory and install the required dependencies (takes 15 mins) - https://colab.research.google.com/drive/1YHMyGsnzUjTQ2GcRomGY5SD5eVPA1siR
~~Collecting and sharing samples of tweet data, with a focus on following Twitter's terms of service and additional community norms.~~ Covered in slides.
~~Efficiently annotating classification data using active human-in-the-loop learning.~~
Using TwitterNER for feature-based, high-accuracy named entity recognition for tweets - https://colab.research.google.com/drive/13u3Ox6UX0C4eeySPy61ciVcEVf7a86qU
Using multi-task learning for sequence tagging - https://colab.research.google.com/drive/1YhFsbVeSuXHHhtgKn5GFczj1FOTE44lT
Using multi-task learning for text classification - https://colab.research.google.com/drive/1pkE-GCKecWnzl5VygaZUCmneyNQuf2wr
~~Visualize extracted information and tweets using temporal network visualizations.~~ Covered in slides. See: https://shubhanshu.com/social-comm-temporal-graph/

NOTE: Access to the SocialMediaIE library used for multi-task learning was provided privately to the tutorial participants. We plan to release it as an open source library in the coming months. You can check the status at: https://socialmediaie.github.io/

Conclusion (15 mins)

Resources to follow up and questions from participants.

Hands on advanced machine learning for information extraction from tweets --- tasks, data, and open source tools

University of Illinois at Urbana-Champaign Research Park, July 24, 2019, 9:00 a.m. - 1:00 p.m.