INLG 2025 Tutorial: Large Language Models in Social Science: Methods, Applications, and Ethics

Half day tutorial on Oct 29 (afternoon)

This half-day tutorial introduces Large Language Models (LLMs) for the social sciences, combining hands-on experience (no coding or advanced maths needed) with critical discussion of methodological opportunities, limitations, and ethical concerns.

Organizers

Dr. Sree Ganesh Thottempudi
Centre for Augmented Intelligence and Data Science (CAIDS), UNISA
Prof. Dr. Ernest Mnkandla
Centre for Augmented Intelligence and Data Science (CAIDS), UNISA

Target Audience

Social scientists (faculty, researchers, graduate students) from disciplines such as political science, sociology, anthropology, communication, economics, and related fields. No prior programming experience required, but familiarity with social science research methods is assumed.

Workshop Overview

Large Language Models (LLMs) might be just the tool you need! This workshop introduces LLMs for social science research, offering hands-on experience with no coding or advanced maths needed.

You will explore the basics of Natural Language Processing (NLP) and LLMs through real-world applications such as text classification, topic modeling, and text generation. We will use Python and Google Colab; no prior programming experience is needed, and we will guide you every step of the way.

The focus is on practical skills you can apply directly to your research. Need to analyze large, diverse text collections such as social media posts, interview transcripts, or news articles? Want to detect emerging trends, automate qualitative coding, or generate synthetic survey responses to test hypotheses? LLMs are transforming text analysis in the social sciences, and this is just the start.


Recent advancements in LLMs—such as OpenAI’s GPT, Google’s Gemini, and Meta’s LLaMA—are transforming how social science research can be conducted. These models provide new tools for data collection, analysis, and theory-building, enabling researchers to work with text, language, and human behavior in innovative ways.


This workshop introduces participants to the practical and theoretical uses of LLMs in social science research. It combines hands-on sessions with critical discussion around methodological opportunities, limitations, and ethical concerns.

Workshop Objectives

Participants will:

  1. Grasp the fundamental capabilities and limitations of LLMs.
  2. Explore applications of LLMs, including:
    • Text generation and summarization
    • Sentiment and discourse analysis
    • Simulating human subjects and interviews
    • Coding and classification in qualitative research
  3. Gain hands-on experience with open-source LLM tools (e.g., Hugging Face) and commercial APIs (e.g., OpenAI).
  4. Critically assess the validity, biases, and ethical considerations of employing LLMs in social science research.

Tentative Agenda

Part 1: Foundations of Python and Intro to LLMs

  • Python coding and Google Colab
  • Foundation models on Hugging Face and proprietary models (OpenAI, DeepSeek, Gemini)
  • LLM workflow: data collection, prep, modelling, evaluation, improvement
  • Case study: analysing a text dataset, from loading through tasks such as sentiment analysis, classification, summarisation, and question answering (QA)
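The workflow above (data collection, prep, modelling, evaluation) can be sketched in a few lines of plain Python. A trivial keyword scorer stands in for a real model call (in the workshop this would be, e.g., a Hugging Face sentiment pipeline), so the cell runs anywhere; all function names here are our own illustrative choices:

```python
# Sketch of the LLM workflow: load -> prepare -> model -> evaluate.
# classify_sentiment is a toy stand-in for an actual LLM call.

def prepare(texts):
    """Minimal text prep: lowercase and strip whitespace."""
    return [t.lower().strip() for t in texts]

def classify_sentiment(text):
    """Stand-in 'model': counts sentiment-laden keywords."""
    positive = {"good", "great", "helpful", "excellent"}
    negative = {"bad", "poor", "useless", "terrible"}
    tokens = text.split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def evaluate(predictions, gold):
    """Accuracy against hand-coded labels, as in the evaluation step."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Data collection step: a toy corpus of survey-style responses.
corpus = ["The new policy is great ", "Useless advice and poor support", "It happened on Tuesday"]
gold = ["positive", "negative", "neutral"]

preds = [classify_sentiment(t) for t in prepare(corpus)]
print(preds)                  # ['positive', 'negative', 'neutral']
print(evaluate(preds, gold))  # 1.0
```

Swapping the stand-in for a real model changes only the modelling step; the surrounding workflow stays the same.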

Part 2: LLMs in Social Sciences

  • How LLMs work, use in social science research, and evaluating results
  • Word embeddings and sentence transformers for social science
  • Limitations of pre-trained models, ways to improve results
  • Ethics, data, training, and use considerations
  • Case: data loading, tokenisation, embeddings, vector databases, inference, cost-cutting, results evaluation, insights
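The embedding-and-retrieval steps in this case study boil down to comparing vectors. A minimal sketch with hand-made toy vectors (in the workshop, real embeddings would come from a sentence-transformers model and be stored in a vector database; the document names here are invented for illustration):

```python
# Rank documents by cosine similarity to a query embedding,
# as a vector database would do during retrieval.
import math

def cosine(u, v):
    """Cosine similarity: the standard relevance score for embedding search."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy "embeddings" for three documents (real ones have hundreds of dimensions).
docs = {
    "protest coverage": [0.9, 0.1, 0.0],
    "election results": [0.7, 0.6, 0.1],
    "recipe blog":      [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # embedding of, say, "news about demonstrations"

ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # 'protest coverage'
```

The same similarity score underlies inference-time retrieval and many cost-cutting tricks (e.g., caching nearest neighbours instead of re-querying a model).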

Part 3: Improving LLM Results

  • Prompt engineering: designing effective prompts
  • Fine-tuning and parameter-efficient tuning
  • Retrieval-Augmented Generation (RAG)
  • Reinforcement Learning from Human Feedback (RLHF)
  • Case: Customizing LLMs for domain-specific content, comparing approaches, and ethical considerations
  • Wrap-up: project feedback, resources, and future collaborations
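To make the Retrieval-Augmented Generation (RAG) idea above concrete, a minimal sketch: retrieve the passages most relevant to a question, then assemble them into the prompt sent to the model. Word-overlap scoring stands in for an embedding search, and `build_prompt` is our own illustrative helper, not a library function:

```python
# RAG in miniature: retrieve relevant passages, paste them into the prompt,
# and let the model answer from your corpus rather than from memory alone.

passages = [
    "Interview 12: respondents cited housing costs as their main concern.",
    "Interview 7: several participants mentioned commute times.",
    "Field note: the meeting ended early due to rain.",
]

def retrieve(query, passages, k=2):
    """Score passages by word overlap with the query; return the top k."""
    q = set(query.lower().split())
    scored = sorted(passages,
                    key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, context):
    """Assemble the augmented prompt that would be sent to an LLM API."""
    joined = "\n".join(f"- {p}" for p in context)
    return (f"Answer using only the context below.\n"
            f"Context:\n{joined}\n"
            f"Question: {query}")

question = "What concerns did respondents raise?"
prompt = build_prompt(question, retrieve(question, passages))
print(prompt)
```

Comparing this prompt-level grounding against fine-tuning the model on the same corpus is exactly the kind of approach comparison the Part 3 case study covers.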

Learning Outcomes

By the end of the workshop, participants will be able to:

  • Identify appropriate LLM tools for their research questions
  • Construct meaningful prompts for data analysis or generation
  • Critically assess LLM outputs in light of social science standards
  • Navigate emerging ethical and methodological frameworks

Technical Requirements

  • Laptop with internet access
  • Access to an LLM platform (e.g., OpenAI, Claude, Hugging Face)
  • Optional: Jupyter notebooks or Google Colab (for hands-on session)

Additional Information

  • Supplementary readings and tutorials will be provided
  • Optional follow-up consultation for research design involving LLMs

Recommended Readings

Hugging Face official Getting Started guide

https://huggingface.co/learn/

Tunstall, L., von Werra, L., & Wolf, T. (2022). Natural Language Processing with Transformers. O'Reilly Media.

https://learning.oreilly.com/library/view/natural-language-processing/9781098136789/