ACM SIGMOD 2020 Programming Contest
Represented the University of Athens in the ACM SIGMOD 2020 Programming Contest, an entity resolution challenge in which participants matched duplicate product specifications across 24 e-commerce websites. The system had to identify which of ~30,000 camera product listings referred to the same real-world product. Submissions were evaluated by F-measure on a secret held-out dataset.
Features
Entity Resolution Pipeline
Combined a graph-based approach for identifying structurally similar entities with an encoder-decoder model that produced similarity scores between product specifications. The graph captured relationships between candidate matches, while the neural model provided the similarity weights to drive final clustering decisions.
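The interaction between the similarity scores and the graph can be illustrated with a minimal sketch. It is not the contest code: the function name `cluster_matches`, the 0.5 threshold, the listing IDs, and the use of union-find to take connected components are all assumptions made for illustration, and the scores are hard-coded stand-ins for what the neural model would produce.

```python
from collections import defaultdict


def cluster_matches(scored_pairs, threshold=0.5):
    """Group listings into clusters by linking pairs scored at or above threshold.

    scored_pairs: iterable of (id_a, id_b, similarity) tuples (hypothetical format).
    Returns a sorted list of sorted clusters (connected components of the graph).
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        root_a, root_b = find(a), find(b)  # both nodes are registered here
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # keep the edge: merge the two components

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Illustrative scores only; real weights would come from the similarity model.
pairs = [
    ("a1", "b7", 0.93),
    ("b7", "c2", 0.88),
    ("a1", "d4", 0.12),
]
print(cluster_matches(pairs))  # [['a1', 'b7', 'c2'], ['d4']]
```

Taking connected components treats matching as transitive, so a single spurious high-scoring edge can merge two unrelated clusters. A stricter threshold or a per-cluster consistency check are common ways to limit that effect.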
Contest Format
Competed as a 5-person team with a 12-hour training budget and 12-hour resolution window. Solutions were ranked by F-measure (harmonic mean of precision and recall) on held-out evaluation data disjoint from training labels.
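For reference, F-measure can be computed as in the sketch below. It assumes the score is taken over unordered matching pairs, which the description above does not specify; the function name `f_measure` and the sample pairs are hypothetical.

```python
def f_measure(predicted_pairs, true_pairs):
    """Harmonic mean of precision and recall over unordered matching pairs."""
    # frozenset makes ("a", "b") and ("b", "a") count as the same pair.
    predicted = {frozenset(pair) for pair in predicted_pairs}
    truth = {frozenset(pair) for pair in true_pairs}

    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero

    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Two of three predicted pairs are correct, and two of three true pairs are found.
predicted = [("a", "b"), ("b", "c"), ("a", "d")]
truth = [("b", "a"), ("b", "c"), ("c", "a")]
score = f_measure(predicted, truth)  # precision = recall = 2/3, so F = 2/3
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot score well by maximizing only precision or only recall.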