Clustering experiments with the Astro benchmarking data set with semantic document embeddings – off-the-shelf vs. custom embeddings created from citations, text, and both
Abstract: What accounts for the observed better quality of publication-level topical science clustering solutions which use only citation relations as input data, compared to those using sophisticated semantic similarity data derived from both citations and textual terms? A survey of empirical work relevant to the concept of unconscientious referencing practices indicates that purely citation-based methods should be affected by significant ‘citation noise’, unlike text-based methods. This study continues work with the A…