SandSlide is a tool that normalizes slides when converting PDF to PowerPoint by stripping each slide of its layout and assigning placeholders to maximize utility.
The data used in the experiments to evaluate SandSlide is in this repository. The file full_annotation.json contains all the annotated data (500 slides). The file experiments.json contains all the slides that are equal to, or a superset of, a slide layout. The whole dataset can be found in the Main folder.
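
As a quick way to inspect the annotation files, the following is a minimal sketch that loads a JSON file and reports how many top-level entries it holds. The internal schema of full_annotation.json and experiments.json is not described here, so the sketch makes no assumption about field names; the helper names `load_annotations` and `count_entries` are hypothetical and are not part of SandSlide.

```python
import json
from pathlib import Path


def load_annotations(path):
    """Parse a JSON file and return the decoded Python object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def count_entries(data):
    """Return the number of top-level entries for a list or dict, else None.

    Assumption: the annotation file is a JSON array or object at the top
    level; the actual structure is not documented in this README.
    """
    if isinstance(data, (list, dict)):
        return len(data)
    return None
```

Calling `count_entries(load_annotations("full_annotation.json"))` from the repository root would then give the number of top-level entries, which may or may not correspond one-to-one to slides depending on how the file is organized.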
The code of SandSlide will be released after acceptance.