"Distributed Technologies for Social Networks" Summer School , 1-4 June 2014



Date: June 1-4, 2014

Location: Stockholm, Sweden

Organised by: Royal Institute of Technology (KTH)


An iSocial research meeting was held in Barcelona, Spain on 19 September 2013. During this research meeting, the plans for research and collaboration between the fellows and iSocial partners were outlined. Furthermore, the project issues, the upcoming events and the online courses that will be organized by the project were discussed.


The summer school will take place in Auditorium E1 on KTH central campus (see directions) on June 1st and June 3rd and in Auditorium Q2 on June 4th (Wednesday).

On June 2nd, student presentations will take place in D2 and D3 auditoriums (parallel sessions).

Lunches will be served in Restaurant Q.

 

1. Program:

 

Sunday, June 1

(in Auditorium E1)

Monday, June 2

(in Auditoriums D2 and D3)

Tuesday, June 3

(in Auditorium E1)

Wednesday, June 4

(in Auditorium Q2)

9:00-10:00

Complex Structures and Collective Dynamics in Networked Systems: Foundations for Self-Adaptation and Self-Organization (Tutorial)

Speaker: Ingo Scholtes (ETHZ, Switzerland)

OSNs: The Wisdom of a Few and the Value of Users

Speaker: Ricardo Baeza-Yates (Yahoo! Research Labs, Barcelona)

 

In Auditorium D2

Moving out of flatland: analysis and mining of multiple social networks (Tutorial)

Speaker: Matteo Magnani (Uppsala University, Sweden)

Dynamic Context-Aware Data Protection Through Virtual Micro Security-Perimeters

Speaker: Saman Zonouz (University of Miami, USA)

Coffee Break
10:30-12:30

Complex Structures and Collective Dynamics in Networked Systems: Foundations for Self-Adaptation and Self-Organization (Tutorial cont.)

Speaker: Ingo Scholtes (ETHZ, Switzerland)

Short PhD student presentations

(parallel sessions: iSocial fellows' presentations in Auditorium D2 and EMJD-DC students' presentations in Auditorium D3)

Moving out of flatland: analysis and mining of multiple social networks (Tutorial cont.)

Speaker: Matteo Magnani (Uppsala University, Sweden)

Data Storage Solutions for Decentralized Online Social Networks – An overview

Speaker: Anwitaman Datta (NTU Singapore)

 


 

Large-scale Machine Learning with GraphLab (Tutorial)

Speaker: Danny Bickson (GraphLab Inc, USA)

Lunch (in Restaurant Q)
14:00-16:00

Complex Structures and Collective Dynamics in Networked Systems: Foundations for Self-Adaptation and Self-Organization (Tutorial cont.)

Speaker: Ingo Scholtes (ETHZ, Switzerland)

Short PhD student presentations


(parallel sessions: iSocial fellows' presentations in Auditorium D2 and EMJD-DC students' presentations in Auditorium D3)

Speech Recognition at Google

Speaker: Parisa Haghani (Google, New York)

 


 

Building Blocks for Decentralized Online Social Networks

Speaker: Sonja Buchegger (KTH, Sweden)

Large-scale Machine Learning with GraphLab (Tutorial cont.)

Speaker: Danny Bickson (GraphLab Inc, USA)

Coffee Break
16:30-18:30

Complex Structures and Collective Dynamics in Networked Systems: Foundations for Self-Adaptation and Self-Organization (Tutorial cont.)

Speaker: Ingo Scholtes (ETHZ, Switzerland)

Short PhD student presentations


(parallel sessions: iSocial fellows' presentations in Auditorium D2 and EMJD-DC students' presentations in Auditorium D3)

 

SOCIAL EVENT:

visit and dinner at ABBA museum.

Buses from KTH leave at 17:30

 

 

Grafos.ML: Tools for large-scale graph mining and machine learning

Speaker: Dionysios Logothetis (Telefonica, Spain)

 

 


2. Invited Speakers:

Matteo Magnani, Uppsala University, Sweden (Bio, Abstract)

Ingo Scholtes, ETHZ, Switzerland (Bio, Abstract)

Danny Bickson, GraphLab Inc, USA (Bio, Abstract)

Ricardo Baeza-Yates, Yahoo! Research Labs at Barcelona, Spain (Bio, Abstract)

Anwitaman Datta, NTU Singapore (Bio, Abstract)

Parisa Haghani, Google New York, Google Maps team (Bio, Abstract)

Sonja Buchegger, KTH, Sweden (Bio, Abstract)

Saman Zonouz, University of Miami, USA (Bio, Abstract)

Dionysios Logothetis, Telefonica Research Lab, Barcelona, Spain (Bio, Abstract)

3. Abstracts:

Keynote 1: Moving out of flatland: analysis and mining of multiple social networks (pdf) (Speaker: Matteo Magnani)

The last two decades have witnessed the proliferation of several Social Network Sites (SNSs). While it is not clear whether only one or a few big SNSs will survive in the near future, or multiple specialized services will still exist separately, we can claim that a model based on a single layer of social connections will never be able to accurately describe our complex and layered online social experience: while Facebook connections can explain a lot about a user’s social life, his/her professional network may require an analysis of LinkedIn connections and his/her information consumption practices might be better explained by looking at his/her Twitter network. Recent works have redefined the foundations of multi-layer network models, highlighting the opportunity to apply SNA approaches to a wide range of complex social relationships as well as to study the mutual influences between different co-existing networks. This tutorial will review the main theoretical models, data gathering methods and analytical tools to deal with multiple networks and to understand how a multi-layer network perspective may change our knowledge of user behaviors. Multiple online network analysis is a growing field, with long-standing theoretical bases rooted in classical sociological analysis and multiplex social network analysis methods. As such, it presents numerous research opportunities.

Keynote 2: Complex Structures and Collective Dynamics in Networked Systems: Foundations for Self-Adaptation and Self-Organization (Speaker: Ingo Scholtes)

This tutorial will provide an introduction to the methods and abstractions used in the quantitative study of complex structures and collective dynamical processes emerging in networked systems. Targeting an audience of computer scientists and engineers, we particularly introduce the statistical physics perspective on self-organizing and self-adaptive network structures that is nowadays common in the modeling and analysis of complex systems occurring in biology, society, physics and technology. A particular emphasis will be placed on the evolution of robust and efficient network topologies based on simple, stochastic rules operating at the microscopic level. We further introduce the generating functions framework, which allows analyzing both the resilience and efficiency of network topologies based on a statistical description of connectivity patterns. In addition, the tutorial will cover the description and analysis of dynamical processes evolving on complex networks, thus providing methods to reason about the performance of distributed protocols.

A particular focus of the tutorial is the introduction of basic methods and abstractions which will enable attendees to benefit from the literature on self-organization and self-adaptation phenomena studied in the fields of statistical physics, network science and complex systems. The tutorial does not require prior knowledge in graph theory, network science or statistical physics, except for the most elementary knowledge in discrete math, probability theory and calculus.

Keynote 3: Large-scale Machine Learning with GraphLab (pdf) (Speaker: Danny Bickson)

From social networks to protein molecules and the web, graphs encode structure and context, enable advanced machine learning, and are rapidly becoming the future of big data. In this tutorial we will present the next generation of GraphLab, an open-source platform and machine learning framework designed to process graphs with hundreds of billions of vertices and edges on hardware ranging from a single Mac mini to the cloud. GraphLab Create is a Python wrapper which allows easy installation, fast data science iteration and rapid deployment of complex machine learning models in production.

Keynote 4: OSNs: The Wisdom of a Few and the Value of Users (Speaker: Ricardo Baeza-Yates)

One of the main differences between traditional Web analysis and Online Social Networks (OSNs) studies is that in the first case the information is organized around content, while in the second case it is organized around people. While search engines have done a good job finding relevant content across billions of pages, nowadays we do not have an equivalent tool to find relevant people in OSNs. Even though an impressive amount of research has been done in this direction, there are still a lot of gaps to cover. Although our first intuition could be (and was!) to search for popular people, previous research has shown that users' in-degree (e.g. number of friends or followers) is important but not enough to represent the importance and reputation of a person.

Another approach is to study the content of the messages exchanged between users, trying to identify topical experts. However, the computational cost of such an approach, including language diversity, is a big limitation. In our work we take a content-agnostic approach, focusing on frequency, type, and time properties of user actions rather than content, mixing their static characteristics (social graph) and their activities (dynamic graphs). Our goal is to understand the role of different types of users in OSNs: Who generates most of the content of the OSN? Do popular users create new trends and cascades? Do they add value to the network? And, if they don't, who does? Our joint work with Diego Saez-Trumper and colleagues in Brazil and USA provides preliminary answers to these questions.

Keynote 5: Data Storage Solutions for Decentralized Online Social Networks – An overview (pdf) (Speaker: Anwitaman Datta)

This talk will provide a (non-exhaustive) overview of different storage architectures that have been explored in the context of decentralized online social networking, their relationship with other modules needed for a functional system, along with a qualitative discussion on the pros and cons of different architectures, and the settings where each design may (not) be applicable. It will be based on the speaker's own works, as well as that of others.

Keynote 6: Speech Recognition at Google (Speaker: Parisa Haghani)

Many classical techniques in speech recognition were developed to tackle the lack of computing power or data sparsity in training models with a large number of parameters. This picture has drastically changed in recent years. In this talk, I will describe how speech recognition works at Google and how we leverage big data and computing at scale to improve and guarantee the quality of our speech recognition systems to millions of remote users simultaneously.

Keynote 7: Building Blocks for Decentralized Online Social Networks (pdf) (Speaker: Sonja Buchegger)

Decentralizing the typical functionality of current online social networking services poses many research questions by itself. Adding the requirement of privacy results in yet another layer of complication, as the removal of the single, centralized provider means that the security and privacy tasks it did have are gone as well. In this talk, we'll explore these challenges and present some of the building blocks we have come up with to provide decentralized privacy-preserving communication systems in general and social networking services in particular.

Keynote 8: Dynamic Context-Aware Data Protection Through Virtual Micro Security-Perimeters (pdf) (Speaker: Saman Zonouz)

Mobile devices are increasingly used by millions of users across the globe in a variety of roles such as a social networking member or as a professional to interact with other (potentially untrusted) peers. These use cases often have wildly differing data protection requirements ranging from the access of sensitive corporate emails to the consumption of DRMed media to the production and sharing of personal content via online social networks. While it is desirable for such diverse content to be accessible from the same device via a unified user experience, today's mobile operating systems provide few, if any, facilities for fine-grained data protection and isolation. Consequently, heavyweight isolation schemes such as different apps, different virtual machines, or in the extreme, different devices for accessing different types of content (e.g., work vs. personal) are used, which provide a disharmonious experience for users. In this paper, we develop SWIRLS, a first-class data protection architecture for mobile operating systems that uses information flow tracking to isolate data rather than isolating execution environments. SWIRLS tags data based on its security context as embodied in a capsule, and controls data mixing between capsules using capsule owner specified policies. In doing so, it provides users with a unified environment and app developers with an API to construct seamless user interfaces that allow access to data with different security requirements through the same apps without fear of any malicious or inadvertent data leakage.

Keynote 9: Grafos.ML: Tools for large-scale graph mining and machine learning (Speaker: Dionysios Logothetis)

Large-scale graph mining and machine learning is becoming an increasingly important area of big data analytics with applications from Online Social Network analysis to recommendations. This talk will describe grafos.ml, an umbrella project with the goal of building tools and systems for graph mining and ML analytics.
The first part of the talk will describe Okapi, an open source library of graph mining and ML algorithms built on top of the Giraph graph processing system. The goal of Okapi is to provide a rich toolkit of graph mining algorithms that will simplify the development of applications, such as OSN analysis at scale.
The second part of the talk will describe RT-Giraph, a system for mining large dynamic graphs. In many real-world scenarios, graphs are naturally dynamic and several applications, such as sybil detection in OSNs, require real-time updates upon changes in the underlying graph. However, existing graph processing systems are designed for batch, offline processing, making the analysis of dynamic graphs hard and costly. RT-Giraph is explicitly designed for dynamic graphs, allowing fast updates and making the deployment of real-time applications easier.


4. Bios:

Matteo Magnani graduated with honors in Computer Science at the University of Bologna in 2002. He also studied at the Université de Marne la Vallée (undergraduate level) and the Imperial College London (postgraduate research level). In 2006 he obtained a PhD in Computer Science (Bologna) where in 2011 he also graduated with honors in Violin. He has received a Rotary Prize for the best student of the Science Faculty (UniBO), a Best Paper Award at the ASONAM conference in 2011, a Funniest Presentation award at SBP 2010, the French qualification for Maître de Conférence positions, the Italian idoneità for CNR researcher positions, a Best Young Chess Player at a local tournament with two participants and his mother is very proud of him (or at least this is what she officially says). Before joining Uppsala University as a senior lecturer in Database Systems and Data Mining he has held positions at CNR, Italy (third-level researcher - or "ricercatore di ultima", in Italian), at the University of Bologna, Italy (assistant professor), and at Aarhus University, Denmark (research assistant professor level). His main research interests span Database and Information Management systems, specifically uncertain information management and multidimensional database queries, Network Science and Social Computing. He has written around 1.5 Kg of papers on these topics (when printed on heavy A4 size sheets), and he has an h-index. He has successfully attracted funding from Working Capital (Telecom Italia), PRIN and FIRB (MIUR - Italian Ministry for education, University and Research) schemes.

 

Ingo Scholtes is a postdoctoral researcher at the Chair of Systems Design at ETH Zürich, where he studies complexity and collective dynamics in information and communication systems. He studies all kinds of socio-technical systems found in collaborative software engineering processes and online social networks, but also distributed technical systems like computer networks, Peer-to-Peer systems and information systems. He has a background in computer science and mathematics and his approach can best be described as interdisciplinary, combining perspectives from computer science, mathematics, network science and the physics of self-organizing systems. His work experience covers research in distributed systems and computer networks, P2P systems, network theory and collaborative software engineering. From his involvement in the design and implementation of the Peer-to-Peer Event Monitoring System of the ATLAS detector at CERN, he also has practical experience in the design of very large-scale distributed systems. He has so far been co-organizer of nine satellite events at international conferences.

 

Danny Bickson is the co-founder of GraphLab Inc. Previously, he was a research scientist at the machine learning department at Carnegie Mellon University, working on the GraphLab open source project (http://graphlab.org). Danny holds a PhD in distributed algorithms from the Hebrew University.

 

Ricardo Baeza-Yates is VP of Yahoo! Labs for Europe and Latin America, leading the labs at Barcelona, Spain and Santiago, Chile, since 2006. Between 2008 and 2012 he also oversaw the Haifa lab. He is also part-time Professor at the Dept. of Information and Communication Technologies of the Universitat Pompeu Fabra in Barcelona, Spain. During 2005 he was an ICREA research professor at the same university. Until 2004 he was Professor and Director of the Center for Web Research at the Dept. of Computing Science of the University of Chile (on leave of absence to date). He obtained a Ph.D. from the University of Waterloo, Canada, in 1989. Before that, he obtained two master's degrees (M.Sc. CS & M.Eng. EE) and the electrical engineering degree from the University of Chile in Santiago. He is co-author of the best-selling textbook Modern Information Retrieval, published in 1999 by Addison-Wesley with a second enlarged edition in 2011, which won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 500 other publications. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society and in 2012 he was elected to the ACM Council. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing given by the University of Waterloo to distinguished alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences and since 2010 he has been a founding member of the Chilean Academy of Engineering. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow.

 

Parisa Haghani is a member of the Google Speech Processing team, where she works on developing techniques and building systems to leverage large amounts of data with the goal of improving the quality of speech recognition at Google. She received a Ph.D. in computer science from École Polytechnique Fédérale de Lausanne (EPFL) and an M.Sc. from the University of Illinois at Urbana-Champaign (UIUC) before joining Google.

 

Anwitaman Datta did his PhD at EPFL Lausanne, Switzerland before moving to NTU Singapore in 2006, where he is currently an Associate Professor in the School of Computer Engineering. He is interested in reliability and security of large scale distributed & decentralized systems. He leads the SANDS (S-* and Algorithmic aspects of Networked Distributed Systems) research group at NTU Singapore.

 

Sonja Buchegger is an associate professor of Computer Science at KTH, Stockholm, Sweden, at the School of Computer Science and Communication (CSC), in the Theoretical Computer Science department (TCS), and serves as the vice director of the VR ACCESS Linnaeus Centre. Prior to KTH, she was a senior research scientist at Deutsche Telekom Laboratories, Berlin, Germany, a post-doctoral scholar at the University of California at Berkeley, School of Information, and a researcher at the IBM Zurich Research Laboratory. Her Ph.D. is in Communication Systems from EPFL, Lausanne, Switzerland, and she graduated in Computer Science at the University of Klagenfurt, Austria.

 

Saman Zonouz has been an Assistant Professor in the Electrical and Computer Engineering Department at the University of Miami (UM) since August 2011, and is the Director of the 4N6 Cyber Security and Forensics Laboratory. He was awarded the Faculty Fellowship Award by AFOSR in 2013, the Best Student Paper Award at IEEE SmartGridComm 2013, the EARLY CAREER Research award from the University of Miami in 2012 as well as the UM Provost Research award in 2011. The 4N6 research group consists of 1 post-doctoral associate and 8 Ph.D. students, and their research has been funded by approximately $4 million in grants from NSF, ONR, DOE/ARPA-E, and Fortinet Corporation. Saman's current research focuses on systems security and privacy, trustworthy cyber-physical critical infrastructures, binary and malware analysis, as well as adaptive intrusion tolerance architectures. Saman has served as chair, program committee member, and reviewer for international conferences and journals. He obtained his Ph.D. in Computer Science, specifically on intrusion tolerance architectures for cyber-physical infrastructures, from the University of Illinois at Urbana-Champaign in 2011.

 

Dionysios Logothetis is an Associate Researcher with the Telefonica Research lab in Barcelona, Spain. His research interests lie in the areas of large scale data management with a focus on graph mining, cloud computing and distributed systems. He holds a PhD in Computer Science from the University of California, San Diego and a Diploma in Electrical and Computer Engineering from the National Technical University of Athens.

 

5. iSocial Fellows Presentations in D2 auditorium:

Time | Title | Presenter | Partner Institution
10:40-11:00 | Modeling the Evolution of Online Social Networks | Kaj Kolja Kleineberg | UB
11:00-11:20 | Situation-Aware Social Overlay (pdf) | Hariton Efstathiades | UCY
11:20-11:40 | A WebRTC DHT (pdf) | Andres Ledesma | UCY
11:40-12:00 | Hive.js: Browser-Based Distributed Caching for Adaptive Video Streaming | Mikael Högqvist | PEER
12:00-13:30 | Lunch (in Restaurant Q)
13:30-13:50 | ElastO: Efficient Maintenance of Scalable Overlays for Topic-based Publish/Subscribe under Churn (pdf) | Chen Chen | IBM Haifa
13:50-14:10 | RankSlicing: A Decentralized Protocol for Supernode Selection | Giovanni Simoni | PEER
14:10-14:30 | Identity Validation in Online Social Networks (pdf) | Leila Bahri | INSUB
14:30-14:50 | Ensemble Learning for Online Social Networks (pdf) | Amira Soliman | KTH
14:50-15:10 | Risk Assessment in Social Networks Based on Anomalous Behavior Detection (pdf) | Naeimeh Laleh | INSUB
15:10-15:40 | Coffee Break
15:40-16:00 | Identifying Trending SPAM in Twitter (pdf) | Despoina Antonakaki | FORTH
16:00-16:20 | Shared Content Risk in Social Networks and Access Control (pdf) | Panagiotis Ilia | FORTH
16:20-16:40 | Load Balancing in Stream Processing System | Muhammad Anis Uddin Nasir | KTH
16:40-17:00 | Large Scale Cross-Document Coreference Resolution (pdf) | Kambiz Ghoorchian | KTH
17:30 | Bus to ABBA Museum

 

6. EMJD-DC Fellows Presentations in D3 auditorium:

Time | Title | Presenter | Partner Institution | Year |
10:30-10:45 | Resource, Data, and Application Management for Cloud Federations and Community Clouds | Vamis Xhagjika | UPC-KTH | 2 | Leandro Navarro, Vladimir Vlassov
10:45-11:00 | Self Management for Large Scale Distributed Infrastructures | Navaneeth Rameshan | UPC-KTH | 2 | Leandro Navarro, Vladimir Vlassov
11:00-11:15 | Community Cloud: Collaborative Multi-Cloud Ecosystems in Community Networks | Amin Khan | UPC-IST | 2 | Felix Freitag, Luís Rodrigues
11:15-11:30 | Data-Intensive Computation: Programming Frameworks for Big Data | Vasia Kalavri | KTH-UCL | 2 | Vladimir Vlassov, Peter Van Roy
11:30-11:45 | Energy aware economic modeling for community clouds | Leila Sharifi | IST-UPC | 2 | Luis Veiga, Felix Freitag
11:45-12:00 | Self Management for Distributed Storage Systems (Working Title) | Ying Liu | KTH-UCL | 2 | Vladimir Vlassov, Peter Van Roy
12:00-13:30 | Lunch (in Restaurant Q)
13:30-13:45 | | Manuel Bravo | UCL-IST | 1 | Peter Van Roy, Luis Rodrigues
13:45-14:00 | | Zhongmiao Lio | UCL-IST | 1 | Peter Van Roy, Paolo Romano
14:00-14:15 | | Jingna Zeng | KTH-UCL | |
14:15-14:30 | Automated Planning for Self-Adaptive Systems | Richard Gil | IST-UCL | 1 | Luis Rodrigues, Peter Van Roy
14:30-14:45 | Trusted Mobile Computing | Sileshi Demesie | IST | 1 | Miguel P. Correia
14:45-15:00 | Community-driven cloud computing at the edge | Mennan Selimi | UPC | 1 | Felix Freitag
15:00-15:15 | Energy Efficiency and Transactional Memories | Shady Issa | IST | 1 | Paolo Romano
15:15-15:45 | Coffee Break
15:45-16:00 | Community Networks | Emmanouil Dimogerontakis | UPC | 1 | Leandro Navarro
16:00-16:15 | | Nicholas Rutherford | UCL-KTH | 2 | Peter Van Roy, Vladimir Vlassov
16:15-16:30 | Programming Models for Complex Systems | Ruma Paul | UCL-KTH | 1 | Peter Van Roy, Vladimir Vlassov
16:30-16:45 | Architecture Support for Big Data Analytics | Ahsan Javed Awan | KTH-UPC | 1 | Mats Brorsson
16:45-17:00 | | | | |
17:30 | Bus to ABBA Museum

 

 
