Graphs constitute the dominant data structure in the WWW, and appear essentially in all forms of information. Examples are the web graph, social networks, protein interaction networks, terms dependency graphs etc. Important knowledge is hidden in the macroscopic topology and features of these graphs.
The dominant knowledge artifacts extracted from graphs are either individual node based scores (such as authority/hubness/centrality) or unsupervised grouping of nodes in to clusters or global statistics computations (such as degree distributions etc). What is missing is metrics, structures and measures that represent the deeper knowledge hidden in the macroscopic structure of potentially directed/weighted graphs.
We propose new metrics and evaluation schemes for the macroscopic structure of graphs capitalizing on the degeneracy concept, i.e. k-cores, towards identifying the most cohesive components of graphs – finding thus the most collaborative constituents. These metrics compute the most robust subgraphs representing dense and mutual connectivity in the case of directed and weighted graphs as well. Connectivity can be then interpreted in several ways: i.e. as collaboration in citation or social networking graphs, collective affinity in protein interaction graphs etc. We also use the best k-cores of graphs as seeds for optimizing the speed of spectral graph clustering. We conducted several experiments on real (DBLP, Wikipedia) and synthetic data sets. The results are interesting especially in the case in the DBLP citation graph.
We further extend k-core to deal with directed graphs, introducing the D-core concept, as means of evaluating a digraph’s collaborative nature. Based on the D-core we devise a wealth of novel metrics used to evaluate the graphs collaboration features. We applied the above approaches on large real world graphs - Wikipedia and DBLP - and report interesting results.
A relevant prototype demo is available at: http://www.lix.polytechnique.fr/ giatsidis/cores/
Dr. Vazirgiannis is a Professor at LIX / Ecole Polytechnique. He holds a degree in Physics, a MSc. in Robotics, both from U. Athens, and a MSc. in Knowledge Based Systems from Heriot Watt University (in Edinburgh, UK). He acquired a Ph.D. degree in 1994 (Dept. of Informatics, U. Athens, Greece). Since then, he has conducted research in GMD-IPSI/Fraunhofer, Max Planck MPI (Germany), and in INRIA/FUTURS (Paris). He has been a teaching in AUEB (Greece), Ecole Polytechnique, Telecom- Paristech (France) and in Deusto University (Spain).
His current research interests are on Web Graph analysis & evolution monitoring graph clustering, algorithms for web advertising campaigns and community evaluation metrics. He has participated in many national and EU funded research and industrial projects. He has supervised eight completed PhD theses. He has contributed chapters in books and encyclopedias, published two books and more than a hundred twenty papers in international refereed journals and conferences. He is also co-author of two patents filed in the Greek patent office.
He is actively involved in national and international research & development projects. He has received the ERCIM and the Marie Curie EU fellowships. Currently he holds a DIGITEO Chair research grant in Ecole Polytechnique. He participates in the editorial board of the Intelligent Data Analysis Journal and serves as guest editor for “Machine Learning” and “Data Mining & Knowledge Discovery” journals. He co-chairs the PC committee of ECML/PKDD 2011 conference, has served the Data Mining Track chair of the IEEE - ICDE 2011 conference and has participated as a conference committee member for more than fifty international conferences, in the areas: Data Bases, Data Mining/Machine learning and the Web.
More information is available at: http://www.lix.polytechnique.fr/ mvazirg/index.php