Courses

Over the last three years, I have been teaching two courses at Leipzig University in alternation. The first one, Toolbox CSS, deals with collecting and using digital trace data and covers web scraping and basic text analysis. The second one, Text Mining for Social Scientists, focuses on the use of text data for sociologists. Moreover, I have taught an introduction to R, which covers basic data wrangling and visualization, and an introduction to web scraping as part of the Workshops for Ukraine series. All courses follow tidy principles and come with a bookdown script. Some of the bookdown scripts even feature videos that walk you through the materials. Slides are available upon request.

Currently, I am preparing an updated and more extensive version of the Toolbox CSS course. More extensive means that the web scraping part will go beyond the current material and cover the acquisition of data from dynamic websites using Selenium. The text mining section will also cover newer developments in NLP (i.e., large language models) and what they bring to the social sciences. Moreover, I will include sections on the analysis of spatial data and the simulation of human behavior using agent-based models.

Toolbox CSS

Description: Recently, a “computational turn” has taken hold of the social sciences. Digital data and novel methods originating from computer science offer important opportunities for sociology. The course starts by teaching practical skills for collecting digital trace data online (web scraping, API “harvesting”). As the lion’s share of this material is in textual format, students will subsequently learn to analyze large text archives in an automated way using machine learning techniques. All programming is done in R, of which basic knowledge is required. Students have to write an empirical paper that answers a sociologically relevant research question using at least one of the methods learned and give extensive feedback on others’ projects in class.

The bookdown script can be found here.

Text Mining for Social Scientists

Description: In the digital age, plenty of digital traces are readily available for social scientific inquiry. A large share of these traces is textual data. Due to their sheer size, a qualitative research strategy is often not suitable. Social scientists can, however, use automated, quantitative methods to derive information from text data to answer social-scientific questions. This course will introduce students to text mining methods in a theoretical and practical manner. Students will learn about the underpinnings and social scientific applications of quantitative text analysis and how to carry it out in R. Hence, students should have a basic understanding of R. For the examination, students will use the methods in empirical projects that deal with a social-scientific question from the realm of political sociology. Specifically, students will form groups that try to replicate existing findings using new, textual data. The course will be split into thematic blocks, dealing with the logic of using text for social-scientific inquiry, text preprocessing, analysis, and the presentation of preliminary results.

The bookdown script can be found here.

Data Cleaning for Social Scientists

Description: This workshop teaches efficient data handling techniques using R and tidyverse packages. Designed for social science researchers, the course covers data inspection, cleaning, transformation, and visualization. With a focus on practical applications and a broad range of tools, it aims to enhance participants’ skills for more robust and reproducible research. The fast-paced, two-day program provides hands-on experience with essential R packages, supported by open-access materials for continued learning.

The bookdown script can be found here.

Workshops for Ukraine: Web Scraping

Description: Digital trace data are an integral element of CSS (cool social scientific) research. This course will show you how to collect them at an intermediate level. This means that we will neither cover the fundamentals of selecting and downloading content from static web pages nor go as far as firing up RSelenium to scrape dynamic web pages. We will start with a brief review of CSS selectors, then move on to using rvest to simulate a browser session, fill in forms, and click buttons. The second half of the session will cover APIs and how to make requests to them, illustrated with concrete example queries. Finally, exemplary workflows will be introduced to provide scaffolding for students’ future research projects.

The course materials can be found on the Workshops for Ukraine website.