Webinar Recap: Faculty Webinar Series with Professor Bruno Abrahao

On the evening of December 17th, the Graduate Admissions Office of NYU Shanghai and NYU Stern MS programs held a webinar to introduce admissions information and application procedures for two master’s programs: the MS in Data Analytics and Business Computing (DABC) and the MS in Quantitative Finance (QF). We were delighted to have Professor Bruno Abrahao as the guest speaker.

Professor Abrahao is an Assistant Professor of Information Systems and Business Analytics at NYU Shanghai and a Global Network Assistant Professor at NYU. He teaches Network Analytics in the MS in DABC program.

What are Networks?

Professor Abrahao began by answering the question “What are networks?”

Machine learning is now widely used to understand the structure of a system and to make predictions. However, it often relies on assumptions borrowed from statistics. If we have data on 5 million people, most methods assume that we have 5 million independent data points. Suppose we collect each person’s age, gender and income level to build a model that predicts things like the unemployment rate, GDP, or sales of a given product. Today, however, our data also reflect connections among entities, and those connections lead to correlations. For example, suppose we want to predict support for the Democrats in a U.S. presidential election within a certain group. If a person and his or her family, relatives and friends are all in the target population, we can expect them to think alike, so their data are correlated. In such data we will find many nearly identical examples. Because people influence each other through networks, the prediction has significantly fewer useful data points than the raw count suggests.
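One rough way to put a number on this loss is the design-effect approximation from survey sampling. It was not presented in the webinar and is offered here only as an illustration. Assuming the population falls into groups of equal size whose members are correlated with strength rho, n observations carry roughly n / (1 + (group_size - 1) * rho) independent observations’ worth of information. The function name, group size and correlation below are made-up values for the sketch.

```python
def effective_sample_size(n, group_size, rho):
    """Design-effect approximation for equally sized, equally correlated groups.

    n          -- number of raw data points
    group_size -- assumed number of people in each correlated group
    rho        -- assumed correlation between members of a group (0 to 1)
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    return n / (1 + (group_size - 1) * rho)


n = 5_000_000
# Fully independent people: every data point counts.
print(effective_sample_size(n, group_size=1, rho=0.0))
# Hypothetical circles of 20 people who think alike with correlation 0.5:
# far fewer effective data points than the raw 5 million.
print(effective_sample_size(n, group_size=20, rho=0.5))
```

With perfect correlation (rho equal to 1), each group counts as a single data point, which matches the intuition of "nearly identical" examples above.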

Why are Networks Important?

First, the connections among data points can no longer be ignored, because we now measure data not only in large volumes but also at different levels. For instance, there is aggregated data from a population, as well as individual-level data that can be used to analyze behavior.

Figure: Correlations have complex structure

Secondly, we are socially dependent on others, and we communicate via connected technology such as email and mobile phones. Our global information is organized as the World Wide Web: two web pages are connected when clicking a link on one takes you to the other. Our organizations are structured as networks of roles. Although such a network has a hierarchy (for instance, the manager reports to the boss and the secretary reports to the manager), everyone in the organization chart still communicates with everyone else. Moreover, our own biology consists of a network of interacting proteins, and our thoughts are produced by the network of cells that forms our brains.

What’s more, individual decisions influence the whole system, and events in one part of the network influence the rest of it. Professor Abrahao illustrated this with the example of investing in the stock market: decisions are made by a network of agents, and personal success or failure is a result of everyone’s collective decisions.

The concept of networks can also be applied to recent events. We are still in the middle of a major pandemic, and we can regard the pandemic as a network of people infecting each other through social interactions. Recently, researchers have been applying the infection (or information) containment techniques studied in this course to build safeguards for the next pandemic.

Figure: Pandemics spread through networks
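To make the pandemic example concrete, here is a minimal sketch of spread over a contact network, written for this recap rather than taken from the course. It makes the deliberately simple assumption that every contact transmits the infection in the next round, and it models containment by isolating chosen people. The contact list and names are hypothetical.

```python
def spread_rounds(contacts, seed, isolated=frozenset()):
    """Return {person: round in which they are infected}.

    Simplifying assumption: every contact of an infected person is
    infected in the following round, unless that contact is isolated.
    """
    infected_at = {seed: 0}
    frontier = [seed]
    round_no = 0
    while frontier:
        round_no += 1
        next_frontier = []
        for person in frontier:
            for other in contacts.get(person, ()):
                if other not in infected_at and other not in isolated:
                    infected_at[other] = round_no
                    next_frontier.append(other)
        frontier = next_frontier
    return infected_at


# Hypothetical contact network: "d" is the only bridge to "e".
contacts = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

print(spread_rounds(contacts, "a"))                  # everyone is reached
print(spread_rounds(contacts, "a", isolated={"d"}))  # "d" and "e" are spared
```

Isolating the single bridge person protects everyone behind them, which is why the structure of the network matters for containment.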

Networks are abstractions that we use to analyze all kinds of data. Professor Abrahao showed us three examples of networks: Email Network, Protein Network and High School Dating Network.

In an Email Network, a connection is established between two people whenever one sends an email to the other. In a Protein Network, proteins with different functions interact inside our bodies, and those interactions form the network. The High School Dating Network was originally used by sociologists to record which students dated each other over a period of a few months. Beyond many fascinating questions about social behavior, the structure of this network can also be used to study phenomena such as the spread of sexually transmitted diseases.
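The email example can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the data arrive as (sender, recipient) pairs; the names and the function are hypothetical, and the network is treated as undirected for simplicity.

```python
from collections import defaultdict


def build_email_network(messages):
    """Map each person to the set of people they exchanged email with."""
    neighbors = defaultdict(set)
    for sender, recipient in messages:
        if sender == recipient:
            continue  # ignore messages people send to themselves
        neighbors[sender].add(recipient)
        neighbors[recipient].add(sender)
    return dict(neighbors)


# Hypothetical message log; repeated messages do not create extra links.
messages = [
    ("ana", "ben"),
    ("ben", "chen"),
    ("ana", "ben"),
    ("dina", "ana"),
]

network = build_email_network(messages)
# Degree: the number of distinct correspondents each person has.
degree = {person: len(others) for person, others in network.items()}
print(degree)
```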

The Use of Networks in the Industry

Turning to the social network industry, Professor Abrahao first explained how networks helped Facebook grow and become popular in its early days. Facebook started as an exclusive student group at Harvard University. Potential users had an incentive to join because they wanted to be part of that elite group, and membership in the restricted group gave them a special status. As more and more people craved this special status, more and more people wanted to join. This cascade effect in Facebook’s network helped it become popular while other social networks declined.

Network analysis is an essential technique that underpins a trillion-dollar industry. Whether you work at Apple, Microsoft, Alibaba or ByteDance, you will encounter challenges that network methods can help solve. Networks can be helpful in the areas of marketing, recommendation, prediction, budgeting, market needs, strategy, containment of negative events, opinion research, and product development.

This course is about extracting knowledge and structure from massive networks. Students will learn to understand and make predictions about how people behave, and how networks can be analyzed to build more effective systems and markets.

Q&A

Q: How does network analytics help us determine the market efficiency in finance?

The course is divided into three parts. The first part covers network structure, in both social networks and information networks. The second part covers dynamics, that is, how phenomena take place over networks. The third part is entirely about markets, which is a complex topic because the course deals with many different markets. For example, the stock market is a network of players investing, buying and selling. If you understand the structure of this market and how one player influences another, you will see how these cascading events influence the price of a stock. Another example is auction theory, which is the subject of this year’s Nobel Prize in economics. Auction theory is what we use when we invest in the stock market. This is a broad and very deep topic that we study using network analytics. We will also learn how to use networks to maximize benefits when competing for advertisement spots on social media or other technology platforms. Suppose you want to place advertisements on Google search pages. When someone types a query, for example NYU, which ads show up depends on the interaction of many factors. We will learn how to maximize revenue from ad placement, and how your ads can attract the largest number of users. There are other examples related to market effectiveness and market efficiency that we will see in this class.
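As a small taste of auction theory applied to ad slots, here is a sketch of a sealed-bid second-price auction, a standard textbook format. The webinar did not say which auction formats the course covers, so the choice of format, the function name and the bids are all assumptions made for illustration.

```python
def second_price_auction(bids):
    """Pick the winner of a single ad slot.

    bids -- dict mapping advertiser name to bid amount.
    The highest bidder wins but pays the second-highest bid
    (or 0.0 when there is no competing bid).
    """
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


# Hypothetical bids for the query "NYU".
bids = {"bookstore": 3.0, "test_prep": 5.0, "housing": 4.0}
winner, price = second_price_auction(bids)
print(winner, price)
```

In this format the price a winner pays does not depend on their own bid, which is one reason the format is commonly used to introduce auction design.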

Q: Network analytics focuses on how businesses can extract information for optimal results. Can organizations also apply network analytics to help consumers make optimal decisions?

Whenever you buy something online, the website shows you related products. Amazon, Taobao and similar platforms do this by using social networks and also broader networks. They observe that whenever somebody buys product A, they also tend to buy product B, so in a way these two products are connected. From such connections you can form a network of products, which we call a co-purchase network. You can also analyze networks of people: if two people always buy similar items, we can say they are similar and may be interested in similar products. So the recommendation systems you interact with on the web are based on social network analysis. The algorithms we are going to study in class are very powerful. You will see that they not only help consumers discover what they want to buy, but also help these companies increase revenue.
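The co-purchase idea can be sketched as follows. This is a minimal illustration, not the method any particular platform uses: it assumes each order is a list of product names, links two products whenever they appear in the same order, and recommends the products most often bought together with a given one. The orders and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_network(baskets):
    """Count how often each pair of products appears in the same order."""
    edge_weights = Counter()
    for basket in baskets:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(basket)), 2):
            edge_weights[(a, b)] += 1
    return edge_weights


def recommend(edge_weights, product, top_n=3):
    """Return the products most frequently co-purchased with `product`."""
    scores = Counter()
    for (a, b), weight in edge_weights.items():
        if a == product:
            scores[b] += weight
        elif b == product:
            scores[a] += weight
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical orders.
baskets = [
    ["laptop", "mouse"],
    ["laptop", "mouse", "bag"],
    ["laptop", "bag"],
    ["mouse", "pad"],
    ["laptop", "mouse"],
]

edges = co_purchase_network(baskets)
print(recommend(edges, "laptop"))
```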

Q: Which programming language do we usually use for network analytics?

In this course we primarily use Python, since it’s a very flexible language. With Python, you have the tools to extract data from almost anywhere, such as a database or web pages. Python also allows you to restructure the data for analysis and, more importantly, it offers powerful data structures and algorithms for network analytics. Having said that, you can use any language to analyze networks. Another popular option is the R programming language, which has a package called igraph that provides the routines needed to analyze networks. For implementing algorithms in production, C++ is often used, as it is faster and closer to the hardware, which makes the computation more efficient. Regardless of the language, what you will learn in this course are the principles, so moving between languages will be quite straightforward. But for the exercises in class, we are going to use Python.