Stars
Collaboratively merge and curate lists of social media accounts.
Simple but effective key term extraction for 40+ languages
A text mining course for social scientists and digital humanists
Source code accompanying the KONVENS 2019 paper "Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings"
Data and code for the paper "Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates".
Collectors of reputation metrics of public speakers on social media platforms
Repository for the DeTox project on detection of toxicity and aggression in postings: https://projects.fzai.h-da.de/detox/
This repository contains several text instances from different sources, which are annotated as either hate speech, offensive language or non-hate.
A plugin for twarc2 for converting tweet JSON into DataFrames and exporting to CSV.
Concise and helpful guidance to pursue legal and ethical research with digital trace data, particularly online communication and online media data.
Generates a GitHub Page from the Social Media Observatory Wiki with Bash, Python, Regexes and Jekyll.
Experiment code for the paper "Boundary Detection and Categorization of Argument Aspects via Supervised Learning" presented at the 9th Workshop on Argument Mining 2022
A Telegram/telethon research convenience wrapper for the terminal.
Tool for extracting and saving news article metadata (and optionally content) at regular intervals.
Twitter data around the Ukraine Invasion in February 2022
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
JohannesBuchner / imagehash
Forked from bunchesofdonald/photohash
A Python Perceptual Image Hashing Module
Implementation of topic models based on neural network approaches.
Active Learning for Text Classification in Python
grenwi / extract
Forked from ICIJ/extract
A cross-platform command line tool for parallelised content extraction and analysis.
This is the website for the Language Technology and Data Analysis Laboratory (LADAL), which is part of the School of Languages and Cultures at the University of Queensland, Australia.
This repository is the central communication and project management interface for the Social Media Observatory hosted by the Leibniz Institute for Media Research | Hans-Bredow-Institute
A cross-platform command line tool for parallelised content extraction and analysis.
Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery
Models for POS tagging and sentence and token detection with OpenNLP tools for the Italian language