CIKM 2022 Tutorial

Tutorial description

In this hands-on tutorial (details and material are available online), we introduce participants to working with social media data, which are an example of Digital Social Trace Data (DSTD). The DSTD abstraction allows us to model social media data together with the rich information associated with social media text, such as authors, topics, and timestamps. We introduce participants to several Python-based, open-source tools for performing Information Extraction (IE) on social media data. Participants will also be familiarized with a catalogue of more than 30 publicly available social media corpora for various IE tasks, such as named entity recognition (NER), part-of-speech (POS) tagging, chunking, supersense tagging, entity linking, sentiment classification, and hate speech identification. We will also show how these approaches can be extended to work in a multilingual setting. Finally, participants will be introduced to the following applications of extracted information:

Pre-arrival material

Software setup


This will be a full-day tutorial session using Python-based, open-source tools. The tutorial will be structured as follows:

More details below:

NOTE: This is tentative and will be updated before the actual tutorial.

Setup and Introduction (1 hr)

Applications of information extraction (1 hr)

Collecting and distributing social media data (30 mins)

Break (30 mins)

Improving IE on social media data via Machine Learning (2 hr 30 mins)

Conclusion and future directions (10 mins)

Resources to follow up and questions from participants.