Sreenivas Gollapudi
Microsoft · Research

101
Publications
15,195
Reads
2,894
Citations

Publications (101)
Preprint
Full-text available
With the continuous advancement of large language models (LLMs), it is essential to create new benchmarks to effectively evaluate their expanding capabilities and identify areas for improvement. This work focuses on multi-image reasoning, an emerging capability in state-of-the-art LLMs. We introduce ReMI, a dataset designed to assess LLMs' ability...
Article
Historically, much of machine learning research has focused on the performance of the algorithm alone, but recently more attention has been focused on optimizing joint human-algorithm performance. Here, we analyze a specific type of human-algorithm collaboration where the algorithm has access to a set of n items, and presents a subset of size k to...
Preprint
Full-text available
Historically, much of machine learning research has focused on the performance of the algorithm alone, but recently more attention has been focused on optimizing joint human-algorithm performance. Here, we analyze a specific type of human-algorithm collaboration where the algorithm has access to a set of $n$ items, and presents a subset of size $k$...
Preprint
For traffic routing platforms, the choice of which route to recommend to a user depends on the congestion on these routes -- indeed, an individual's utility depends on the number of people using the recommended route at that instance. Motivated by this, we introduce the problem of Congested Bandits where each arm's reward is allowed to depend on th...
Preprint
We consider the classic online learning and stochastic multi-armed bandit (MAB) problems, when at each step, the online policy can probe and find out which of a small number ($k$) of choices has better reward (or loss) before making its choice. In this model, we derive algorithms whose regret bounds have exponentially better dependence on the time...
Article
Generating alternative routes in road networks is an application of significant interest for online navigation systems. A high-quality set of diverse alternate routes offers two functionalities: a) support for multiple (unknown) preferences that the user may have; and b) robustness to changes in network conditions. We formulate a new quantification of the...
Preprint
Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data. Owing to the relatively limited number of message passing steps they perform -- and hence a smaller receptive field -- there has been significant interest in improving their expressivity by incorporating structural aspects of the underlying graph. In...
Chapter
Motivated by the emergence of popular service-based two-sided markets where sellers can serve multiple buyers at the same time, we formulate and study the two-sided cost sharing problem. In two-sided cost sharing, sellers incur different costs for serving different subsets of buyers and buyers have different values for being served by different sel...
Preprint
Full-text available
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e. recommend to the user)...
Chapter
A two-sided market consists of two sets of agents, each of whom have preferences over the other (Airbnb, Upwork, Lyft, Uber, etc.). We propose and analyze a repeated matching problem, where some set of matches occur on each time step, and our goal is to ensure fairness with respect to the cumulative allocations over an infinite time horizon. Our ma...
Preprint
A two-sided market consists of two sets of agents, each of whom have preferences over the other (Airbnb, Upwork, Lyft, Uber, etc.). We propose and analyze a repeated matching problem, where some set of matches occur on each time step, and our goal is to ensure fairness with respect to the cumulative allocations over an infinite time horizon. Our ma...
Article
We consider the problem of selling perishable items to a stream of buyers in order to maximize social welfare. A seller starts with a set of identical items, and each arriving buyer wants any one item, and has a valuation drawn i.i.d. from a known distribution. Each item, however, disappears after an a priori unknown amount of time that we term the...
Article
We consider the problem of selling perishable items to a stream of buyers in order to maximize social welfare. A seller starts with a set of identical items, and each arriving buyer wants any one item, and has a valuation drawn i.i.d. from a known distribution. Each item, however, disappears after an a priori unknown amount of time that we term the...
Preprint
We consider the problem of selling perishable items to a stream of buyers in order to maximize social welfare. A seller starts with a set of identical items, and each arriving buyer wants any one item, and has a valuation drawn i.i.d. from a known distribution. Each item, however, disappears after an a priori unknown amount of time that we term the...
Article
A core tension in the operations of online marketplaces is between segmentation (wherein platforms can increase revenue by segmenting the market into ever smaller sub-markets) and thickness (wherein the size of the sub-market affects the utility experienced by an agent). An important example of this is in dynamic online marketplaces, where buyers a...
Conference Paper
Recent years have witnessed the rise of many successful e-commerce marketplace platforms like AirBnB, Uber/Lyft, and Upwork, where a central platform mediates economic transactions between buyers and sellers. Some common features that distinguish such marketplaces from more traditional marketplaces are search and discovery of the service providers...
Article
In recent years, a range of online applications have facilitated resource sharing among users, resulting in a significant increase in resource utilization. In all such applications, sharing one’s resources or skills with other agents increases social welfare. In general, each agent will look for other agents whose available resources complement her...
Preprint
In this paper we study the learnability of deep random networks from both theoretical and practical points of view. On the theoretical front, we show that the learnability of random deep networks with sign activation drops exponentially with its depth. On the practical front, we find that the learnability drops sharply with depth even with the stat...
Article
A core tension in the operations of online marketplaces is between segmentation (wherein platforms can increase revenue by segmenting the market into ever smaller sub-markets) and thickness (wherein the size of the sub-market affects the utility experienced by an agent). An important example of this is in dynamic online marketplaces, where buyers a...
Preprint
Motivated by the emergence of popular service-based two-sided markets where sellers can serve multiple buyers at the same time, we formulate and study the two-sided cost sharing problem. In two-sided cost sharing, sellers incur different costs for serving different subsets of buyers and buyers have different values for being served by differe...
Conference Paper
Motivated by the popularity of online ride and delivery services, we study natural variants of classical multi-vehicle minimum latency problems where the objective is to route a set of vehicles located at depots to serve requests located on a metric space so as to minimize the total latency. In this paper, we consider point-to-point requests that c...
Article
Motivated by the popularity of online ride and delivery services, we study natural variants of classical multi-vehicle minimum latency problems where the objective is to route a set of vehicles located at depots to serve requests located on a metric space so as to minimize the total latency. In this paper, we consider point-to-point requests that co...
Conference Paper
We study the problem of automatically and efficiently generating itineraries for users who are on vacation. We focus on the common case, wherein the trip duration is more than a single day. Previous efficient algorithms based on greedy heuristics suffer from two problems. First, the itineraries are often unbalanced, with excellent days visiting top...
Conference Paper
The rapid growth of the Internet has led to the widespread use of newer and richer models of online shopping and delivery services. The race to efficient large scale on-demand delivery has transformed such services into complex networks of shoppers (typically working in the stores), stores, and consumers. The efficiency of processing orders in stor...
Article
We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise $\ell_p$-approximation error, for any $p \geq 1$; the case $p = 2$ is the classical SVD problem. We obtain the first provably good approximation algorithms for this version of low-rank approximation that work for every value of $p \geq 1$, i...
Conference Paper
Many qualitative studies of communication practices on social media have recognized that people's motivation for participating in social networks can vary greatly. Some people participate for fame and fortune, while others simply wish to chat with friends. In this paper, we study the implications of such heterogeneous intent for modeling informatio...
Conference Paper
The rapid proliferation of hand-held devices has led to the development of rich, interactive and immersive applications, such as e-readers for electronic books. These applications motivate retrieval systems that can implicitly satisfy any information need of the reader by exploiting the context of the user's interactions. Such retrieval systems dif...
Conference Paper
We propose a system for mining videos from the web for supplementing the content of electronic textbooks in order to enhance their utility. Textbooks are generally organized into sections such that each section explains very few concepts and every concept is primarily explained in one section. Building upon these principles from the education liter...
Patent
Full-text available
Sketches are generated for each node in a graph. For undirected graphs, each sketch for a node may include an indicator of a node from a seed set of nodes and the shortest distance between the node and the indicated node. When a request is received for the shortest distance between two nodes of the graph, the sketches for each of the two nodes are...
Conference Paper
Our opinions and judgments are increasingly shaped by what we read on social media -- whether they be tweets and posts in social networks, blog posts, or review boards. These opinions could be about topics such as consumer products, politics, life style, or celebrities. Understanding how users in a network update opinions based on their neighbor's...
Patent
A tree structure has a node associated with each category of a hierarchy of item categories. Child nodes of the tree are associated with sub-categories of the categories associated with parent nodes. Training data including received queries and indicators of a selected item category for each received query is combined with the tree structure by ass...
Patent
As provided herein, a pairwise distance between nodes in a large graph can be determined efficiently. URL-sketches are generated for respective nodes in an index by extracting labels from respective nodes, which provide a reference to a link between the nodes, aggregating the labels into sets for respective nodes, and storing the sets of labels as...
Article
We present study navigator, an algorithmically-generated aid for enhancing the experience of studying from electronic textbooks. The study navigator for a section of the book consists of helpful concept references for understanding this section. Each concept reference is a pair consisting of a concept phrase explained elsewhere and the link to the...
Conference Paper
With the explosive growth of social networks, many applications are increasingly harnessing the pulse of online crowds for a variety of tasks such as marketing, advertising, and opinion mining. An important example is the wisdom of crowd effect that has been well studied for such tasks when the crowd is non-interacting. However, these studies don't...
Patent
Full-text available
Documents are replicated among servers comprising a search engine based on the value of each document by approximating its value as one of the top search results for one or more exemplary queries. Documents are allocated among servers comprising a search engine by calculating a relevance value for each document and then distributing the documents e...
Conference Paper
We present game-theoretic models of opinion formation in social networks where opinions themselves co-evolve with friendships. In these models, nodes form their opinions by maximizing agreements with friends weighted by the strength of the relationships, which in turn depend on difference in opinion with the respective friends. We define a social c...
Conference Paper
A search engine aims to return a set of relevant documents in response to a query, while minimizing the response time. This has led to the use of a tiered index, where the search engine maintains a small cache of documents that can serve a large fraction of queries. We give a novel algorithm for the selection of documents in a tiered index for comm...
Conference Paper
A user’s session of information need often goes well beyond his search query and first click on the search result page and therefore is characterized by both search and browse activities on the web. In such settings, the effectiveness of an ad (measured as CtoC ratio, as well as #(conversions) per unit payment) could change based on what pages the...
Article
Full-text available
We show that the multiplicative weight update method provides a simple recipe for designing and analyzing optimal Bayesian Incentive Compatible (BIC) auctions, and reduces the time complexity of the problem to polynomial in parameters that depend on single agent instead of on the joint type space. We use this framework to design the first computati...
Conference Paper
Education is known to be the key determinant of economic growth and prosperity [9, 13]. While the issues in devising a high-quality educational system are multi-faceted and complex, textbooks are acknowledged to be the educational input most consistently associated with gains in student learning [12]. Particularly in developing regions, they are th...
Conference Paper
Full-text available
Education is known to be the key determinant of economic growth and prosperity [8,12]. While the issues in devising a high-quality educational system are multi-faceted and complex, textbooks are acknowledged to be the educational input most consistently associated with gains in student learning [11]. They are the primary conduits for delivering con...
Article
Full-text available
Recent work in commerce search has shown that understanding the semantics in user queries enables more effective query analysis and retrieval of relevant products. However, due to lack of sufficient domain knowledge, user queries often include terms that cannot be mapped directly to any product attribute. For example, a user looking for designer ha...
Article
Full-text available
Good textbooks are organized in a systematically progressive fashion so that students acquire new knowledge and learn new concepts based on known items of information. We provide a diagnostic tool for quantitatively assessing the comprehension burden that a textbook imposes on the reader due to non-sequential presentation of concepts. We present a...
Article
With the advent of social networks such as Facebook and LinkedIn, and online offers/deals web sites, network externalities raise the possibility of marketing and advertising to users based on influence they derive from their neighbors in such networks. Indeed, a user's knowledge of which of his neighbors "liked" the product changes his valuation fo...
Conference Paper
Full-text available
Many phenomena and artifacts such as road networks, social networks and the web can be modeled as large graphs and analyzed using graph algorithms. However, given the size of the underlying graphs, efficient implementation of basic operations such as connected component analysis, approximate shortest paths, and link-based ranking (e.g. PageRank) be...
Article
Full-text available
We present our early explorations into developing a data mining based approach for enhancing the quality of textbooks. We describe a diagnostic tool to algorithmically identify deficient sections in textbooks. We also discuss techniques for algorithmically augmenting textbook sections with links to selective content mined from the Web. Our evaluati...
Article
Full-text available
Textbooks are the educational input most consistently associated with gains in student learning. Particularly in developing countries, textbooks are the primary conduits for delivering content knowledge to the students and the teachers base their lesson plans on the material given in textbooks. Abstracting from the education literature, we propose...
Conference Paper
Full-text available
Motivated by trends in popularity of products, we present a formal model for studying trends in our choice of products in terms of three parameters: (1) their innate utility; (2) individual boredom associated with repeated usage of an item; and (3) social influences associated with the preferences from other people. Different from previous work, in...
Conference Paper
Textbooks have a direct bearing on the quality of education imparted to the students. Therefore, it is of paramount importance that the educational content of textbooks should provide rich learning experience to the students. Recent studies on understanding learning behavior suggest that the incorporation of digital visual material can greatly enha...
Article
Web search engines and specialized online verticals are increasingly incorporating results from structured data sources to answer semantically rich user queries. For example, the query "Samsung 50 inch led tv" can be answered using information from a table of television data. However, the users are not domain experts and quite often enter...
Conference Paper
Full-text available
Large web search engines process billions of queries each day over tens of billions of documents with often very stringent requirements for a user's search experience, in particular, low latency and highly relevant search results. Index generation and serving are key to satisfying both these requirements. For example, the load to search engines can...
Conference Paper
Full-text available
Commerce search engines have become popular in recent years, as users increasingly search for (and buy) products on the web. In response to a user query, they surface links to products in their catalog (or index) that match the requirements specified in the query. Often, few or no products in the catalog match the user query exactly, and the sear...
Article
This article focuses on computations on large graphs (e.g., the web-graph) where the edges of the graph are presented as a stream. The objective in the streaming model is to use a small amount of memory (preferably sub-linear in the number of nodes n) and a smaller number of passes. In the streaming model, we show how to perform several graph comput...
Conference Paper
Many textbooks written in emerging countries lack clear and adequate coverage of important concepts. We propose a technological solution for algorithmically identifying those sections of a book that are not well written and could benefit from better exposition. We provide a decision model based on the syntactic complexity of writing and the dispers...
Conference Paper
In commerce search, the set of products returned by a search engine often forms the basis for all user interactions leading up to a potential transaction on the web. Such a set of products is known as the consideration set. In this study, we consider the problem of generating a consideration set of products in commerce search so as to maximize user s...
Conference Paper
Full-text available
Recommendation engines today suggest one product to another, e.g., an accessory to a product. However, intent to buy often precedes a user's appearance in a commerce vertical: someone interested in buying a skateboard may have earlier searched for "varial heelflip", a trick performed on a skateboard. This paper considers how a search engine can pro...
Conference Paper
Full-text available
Education is acknowledged to be the primary vehicle for improving the economic well-being of people [1,6]. Textbooks have a direct bearing on the quality of education imparted to the students as they are the primary conduits for delivering content knowledge [9]. They are also indispensable for fostering teacher learning and constitute a key compone...
Article
Full-text available
Textbooks play an important role in any educational system. Unfortunately, many textbooks produced in developing countries are not written well and they often lack adequate coverage of important concepts. We propose a technological solution to address this problem based on enriching textbooks with authoritative web content. We augment textbooks...
Article
Full-text available
We present a formal model for studying fashion trends, in terms of three parameters of fashionable items: (1) their innate utility; (2) individual boredom associated with repeated usage of an item; and (3) social influences associated with the preferences from other people. While there are several works that emphasize the effect of social influence...
Conference Paper
Full-text available
We study the fundamental problem of computing distances between nodes in large graphs such as the web graph and social networks. Our objective is to be able to answer distance queries between pairs of nodes in real time. Since the standard shortest path algorithms are expensive, our approach moves the time-consuming shortest-path computation of...
Article
Full-text available
Most learning to rank research has assumed that the utility of different documents is independent, which results in learned ranking functions that return redundant results. The few approaches that avoid this have rather unsatisfyingly lacked theoretical foundations, or do not scale. We present a learning-to-rank formulation that optimizes the fract...
Article
Full-text available
Click-through rates (CTR) offer useful user feedback that can be used to infer the relevance of search results for queries. However, it is not very meaningful to look at the raw click-through rate of a search result because the likelihood of a result being clicked depends not only on its relevance but also on the position in which it is displayed. One...
Conference Paper
Full-text available
We study the problem of designing a mechanism to rank items in forums by making use of the user reviews such as thumb and star ratings. We compare mechanisms where forum users rate individual posts and also mechanisms where the user is asked to perform a pairwise comparison and state which one is better. The main metric used to evaluate a mechani...
Conference Paper
Finding sparse cuts is an important tool for analyzing large graphs that arise in practice, such as the web graph, online social communities, and VLSI circuits. When dealing with such graphs having billions of nodes, it is often hard to visualize global partitions. While studies on sparse cuts have traditionally looked at cuts with respect to all t...
Article
Full-text available
In this paper, we present the first approximation algorithms for the problem of designing revenue optimal Bayesian incentive compatible auctions when there are multiple (heterogeneous) items and when bidders can have arbitrary demand and budget constraints. Our mechanisms are surprisingly simple: We show that a sequential all-pay mechanism is a 4 a...
Conference Paper
Understanding user intent is key to designing an effective ranking system in a search engine. In the absence of any explicit knowledge of user intent, search engines want to diversify results to improve user satisfaction. In such a setting, the probability ranking principle-based approach of presenting the most relevant results on top can be sub-op...
Conference Paper
Full-text available
In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorithms view the results of a query as nodes in the web graph, expand the result set to include neighboring nodes, and compute scores on the induced neighborhood graph. In pre...
Conference Paper
We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information, and that both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We pro...
Conference Paper
Full-text available
We introduce a new approach to analyzing click logs by examining both the documents that are clicked and those that are bypassed: documents returned higher in the ordering of the search results but skipped by the user. This approach complements the popular click-through rate analysis, and helps to draw negative inferences in the click logs. We for...
Conference Paper
Full-text available
In this study we propose sketching algorithms for computing similarities between hierarchical data. Specifically, we look at data objects that are represented using leaf-labeled trees denoting a set of elements at the leaves organized in a hierarchy. Such representations are richer alternatives to a set. For example, a document can be repre...
Conference Paper
Motivated by the growth of various networked systems as potential market places, we study market models wherein, owing to the size of the markets, transactions take place between largely unknown agents. In such scenarios, intermediaries or brokers play a significant role in a transaction. We analyze market behavior in large networks wherein all sel...
Conference Paper
This paper describes a technique for reducing the query-time cost of HITS-like ranking algorithms. The basic idea is to compute for each node in the web graph a summary of its immediate neighborhood (which is a query-independent operation and thus can be done off-line), and to approximate the neighborhood graph of a result set at query-time by co...
Conference Paper
Full-text available
We initiate a novel study of clustering problems. Rather than specifying an explicit objective function to optimize, our framework allows the user of a clustering algorithm to specify, via a first-order formula, what constitutes an acceptable clustering to them. While the resulting genre of problems includes, in general, NP-complete problems, we high...
Conference Paper
Full-text available
In this paper we propose a dictionary data structure for string search with errors where the query string may differ from the expected matching string by a few edits. This data structure can also be used to find the database string with the longest common prefix with few errors. Specifically, with a database of n random strings, each of length of...
Conference Paper
Full-text available
Topic or feature extraction is often used as an important step in document classification and text mining. Topics are a succinct representation of content in a document collection and hence are very effective when used as content identifiers in peer-to-peer systems and other large scale distributed content management systems. Effective topic extracti...
Conference Paper
We propose a novel mechanism for routing and bandwidth allocation that exploits the selfish and rational behavior of flows in a network. Our mechanism leads to allocations that simultaneously optimize throughput and fairness criteria. We analyze the performance of our mechanism in terms of the induced Nash equilibrium. We compare the allocations at...
Conference Paper
Full-text available
Mining massive temporal data streams for significant trends, emerging buzz, and unusually high or low activity is an important problem with several commercial applications. In this paper, we propose a framework based on relational records and metric spaces to study such problems. Our framework provides the necessary mathematical underpinnings for t...
Conference Paper
We propose an efficient and scalable scheme for bandwidth reservation and monitoring. Our scheme is based on a reserve-and-refresh strategy [I. Stoica and Hui Zhang, 1999], [S. Machiraju et al., 2002], where each flow is periodically refreshed in its initial reservation. We propose novel algorithms to handle various forms of misbehavior, e.g., att...
Conference Paper
Equitable bandwidth allocation is essential when QoS requirements and purchasing power vary among users. To this end, we present a mechanism for bandwidth allocation based on differential pricing. In our model, the QoS vs. cost trade-off induces a minimum acceptable allocation, a maximum acceptable allocation, and a unique optimal allocation for ea...
Chapter
Advances in multimedia computing technologies offer new approaches to the support of computer-assisted education and training within many application domains. Novel interactive presentation tools can be built to enhance traditional teaching methods with more active learning. Since a variety of user expectations are possible in such an environment,...
Article
The delivery of continuous and synchronous multimedia data from a database (or file) server to multiple destinations over a network presents new challenges in the area of buffer management. Many factors that were not considered in conventional buffer management must be examined. In this paper, we investigate the principles of buffer model and man...
Article
Advances in multimedia computing technologies offer new approaches to the support of computer-assisted education and training within many application domains. Novel interactive presentation tools can be built to enhance traditional teaching methods with more active learning. Since a variety of user expectations are possible in such an environment, r...
Article
The extension of database systems to support multimedia applications requires new mechanisms to ensure the synchronized presentation of multiple media data streams. In order to flexibly and efficiently present multimedia data streams to users, media streams must be segmented into media objects and time constraints among these objects must be specif...
Conference Paper
Advances in multimedia computing technologies offer new approaches to support on-line accesses to information from a variety of sources such as video clips, audio, images, and books. A client-server distributed multimedia system would be a practical approach to support such functionalities. We present the design and implementation of a client-serve...
Article
Advances in multimedia computing technologies offer new approaches to support on-line accesses to information from a variety of sources such as video clips, audio, images, and books. A client-server distributed multimedia system would be a practical approach to support such functionalities. In this thesis, we present the design and implementation of...
Conference Paper
Investigates the principles of buffer management for multimedia data presentations in object-oriented database environments. The primary goal is to minimize the response time of multimedia presentations while ensuring that all continuity and synchronization requirements are satisfied. Minimum buffering requirements to guarantee the continuity and s...
Article
Advances in multimedia computing technologies offer new approaches to support on-line accesses to information from a variety of sources such as video clips, audio, images, and books. A client-server distributed multimedia system would be a practical approach to support such functionalities. In this paper, we present the design and implementation of...
Article
Current database management system techniques are insufficient to support the management of multimedia data owing to their time-sampled nature. The extension of database systems to support multimedia applications thus requires new mechanisms to ensure the synchronized presentation of multimedia data streams. In order to flexibly and efficiently pre...
Article
Full-text available
Click logs provide valuable information that can be used to infer several parameters related to the relevance of search results and ads to queries. However, there are often far fewer clicks on ads than on search results. Thus, while the click logs can easily be used to study parameters related to search results, the sparseness of the number of...
Article
Full-text available
We provide a theoretical model to explain the evolution of communities in social networks where new friends are formed by looking at friends of friends. Specifically, a new friendship link is formed based on the number of common friends between the two nodes. Additionally, we also allow a fraction of the new links to be random. Furthermore, we al...
