On the Recent Progress on the Inapproximability of High Dimensional Clustering and the Johnson-Coverage Hypothesis
k-median, k-means, and k-minsum are three of the most popular objectives for clustering algorithms. Despite intensive effort, a complete understanding of the approximability of these objectives remains a major open problem. In this paper, we significantly improve upon the hardness of approximation factors known in the literature for these objectives.
We show that it is NP-hard to approximate the following objectives:
• Continuous k-median to a factor of 2 − o(1); this improves upon the previous inapproximability factor of 1.36 shown by Guha and Khuller (J. Algorithms ’99).
• Continuous k-means to a factor of 4 − o(1); this improves upon the previous inapproximability factor of 2.47 shown by Guha and Khuller (J. Algorithms ’99).
• k-minsum to a factor of 1.415; this improves upon the APX-hardness shown by Guruswami and Indyk (SODA ’03).
Furthermore, we show that the above hardness of approximation results for k-median and k-means are tight for a large range of settings.
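For readers less familiar with the three objectives named above, the following is a minimal sketch of how each cost is evaluated for a given solution. It assumes points in Euclidean space and uses function names that are purely illustrative (they do not come from the paper). In the continuous variants the centers may be arbitrary points of the space, whereas in the discrete variants they must come from a given candidate set; the cost computation is the same in both cases. The k-minsum cost below sums over unordered pairs, which is one common convention.

```python
import math
from itertools import combinations


def k_median_cost(points, centers):
    """Sum over points of the distance to the nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def k_means_cost(points, centers):
    """Sum over points of the squared distance to the nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def k_minsum_cost(clusters):
    """Sum over clusters of all pairwise distances inside the cluster.

    Assumption: each unordered pair is counted once.
    """
    return sum(
        math.dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )


# Hypothetical toy instance: three points in the plane, k = 2.
points = [(0.0, 0.0), (0.0, 4.0), (10.0, 0.0)]
centers = [(0.0, 2.0), (10.0, 0.0)]
clusters = [[(0.0, 0.0), (0.0, 4.0)], [(10.0, 0.0)]]

median_cost = k_median_cost(points, centers)
means_cost = k_means_cost(points, centers)
minsum_cost = k_minsum_cost(clusters)
```

An α-approximation algorithm returns a solution whose cost is at most α times the optimum; the hardness results listed above rule out such algorithms below the stated factors unless P = NP.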
In this paper, we introduce a new hypothesis called the Johnson Coverage Hypothesis (JCH), and show that, together with generalizations of known embedding techniques, JCH implies hardness of approximation results for k-median and k-means in L_p-metrics with factors close to the ones obtained for general metrics. In particular, assuming JCH, we show that it is hard to approximate the k-means objective:
• Discrete case: to a factor of 3.94 in the L_1-metric and to a factor of 1.73 in the L_2-metric; this improves upon the previous factors of 1.56 and 1.17, respectively, of Cohen-Addad and Karthik (FOCS ’19).
• Continuous case: to a factor of 1.36 in the L_2-metric; this improves upon the inapproximability factor of 1.07 given by Cohen-Addad and Karthik (FOCS ’19).
We also obtain similar improvements under JCH for the k-median objective. Finally, we establish a strong connection between JCH and the long-standing open problem of determining the Hypergraph Turán number. We then use this connection to prove improved SDP gaps (over the existing factors in the literature) for the k-means and k-median objectives.
Joint work with Karthik C.S. and Euiwoong Lee.
Vincent Cohen-Addad is a CNRS researcher at Sorbonne Université. He was a Marie Sklodowska-Curie fellow at the University of Copenhagen, hosted by Mikkel Thorup. He did his Ph.D. at the Département d'Informatique de l'École normale supérieure under the supervision of Claire Mathieu.