- CNR-ISTI, Pisa, Italy
- Via Giuseppe Moruzzi, 1, 56127 Pisa PI
- https://orcid.org/0000-0001-5014-5089
Stars
An extensive and commented list of resources on Learned Sparse Retrieval.
[WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting”
A high-performance TOON (Token Oriented Object Notation) parser and serializer for Python.
[CVPR 2026] Official repository of the paper "One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework"
Portable file server with accelerated resumable uploads, dedup, WebDAV, SFTP, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file
[ICCV 2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation"
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
[CVPR 2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding."
Code and Resources for the Transformer Encoder Reasoning and Alignment Network (TERAN), accepted for publication in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)
The research tools developed for HoloLens2
PyTorch code for the paper "Conditioned Cooperative Training for Semi-supervised Weapon Detection".
Wedding Invitation Landing Page
Code to reproduce 'Combining GANs and AutoEncoders for efficient anomaly detection'
A Computer Vision Approach for Pass Detection on Soccer Broadcast Video
PyTorch implementation of Hebbian learning algorithms to train deep convolutional neural networks.
PyTorch port of models for Visual Sentiment Analysis pre-trained on the T4SA dataset.
Code to reproduce experiments in 'LSTM-based real-time action detection and prediction in human motion streams'
A deep-learning-based web tool for translational and real-time pupillometry
Code release for ConceptFusion [RSS 2023]
ImageBind: One Embedding Space to Bind Them All
A collection of notebooks and scripts for the log analysis of the VBS22 post-evaluation, where the top-3 scoring teams solved 50 visual known-item search tasks with four individual users per team.
Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"
Code to compute the nSimplex projection, which maps metric objects into a finite-dimensional Euclidean space