We have developed a platform for applying new methodologies at the intersection of causal inference and machine learning to problems at scale. We use heterogeneous treatment effect algorithms to predict how different users (riders, drivers) will respond to a specific treatment (marketing campaign, product change, etc.). These predictions dramatically increase the degree to which we can customize the user experience while reducing the operational load of doing so.
This platform lets us better understand how our users vary; optimally target users based on these individual treatment effect predictions; and evaluate the results of these predictions. The platform is built in a flexible way that allows us to plug and play with different algorithms, which lets us compare performance and develop improvements. While the scale of our data and the complexity of these algorithms require substantial engineering infrastructure, we have built the platform in a modular way that allows for separation between the science and engineering code. This makes it much easier for data scientists to iterate on these models, as they do not need to worry (as much) about the computational complexities.
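The abstract does not specify which heterogeneous treatment effect algorithms the platform uses, but one simple member of that family is the T-learner: fit separate outcome models on treated and control users, then predict the individual effect as the difference. The sketch below is illustrative only, with synthetic data and a made-up linear response; it is not the platform's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): one user feature x, and a true
# treatment effect that grows with x, i.e. tau(x) = 2x.
n = 2000
x = rng.uniform(0, 1, size=(n, 1))
t = rng.integers(0, 2, size=n)           # randomized treatment assignment
tau = 2.0 * x[:, 0]                      # true heterogeneous effect
y = 1.0 + 0.5 * x[:, 0] + tau * t + rng.normal(0, 0.1, size=n)

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict_linear(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# T-learner: separate outcome models for treated and control users;
# the predicted individual effect is the difference of their predictions.
beta_treated = fit_linear(x[t == 1], y[t == 1])
beta_control = fit_linear(x[t == 0], y[t == 0])
tau_hat = predict_linear(beta_treated, x) - predict_linear(beta_control, x)
```

In practice the base learners would be flexible ML models rather than linear fits, and the predicted `tau_hat` is what drives targeting: treat the users with the largest predicted response.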
Download rates of academic journals have joined citation counts as commonly used indicators of the value of journal subscriptions. While citations reflect worldwide influence, the value of a journal subscription to a single library is more reliably measured by the rate at which it is downloaded by local users. If reported download rates accurately measure local usage, there is a strong case for using them to compare the cost-effectiveness of journal subscriptions. We examine data for nearly 8,000 journals downloaded at the ten universities in the University of California system over a period of six years. We find that, controlling for the number of articles, publisher, and year of download, the ratio of downloads to citations differs substantially among academic disciplines. After adding academic disciplines to the control variables, there remain substantial “publisher effects”, with some publishers reporting significantly more downloads than would be predicted by the characteristics of their journals. These cross-publisher differences suggest that the currently available download statistics, which are supplied by publishers, are not sufficiently reliable to allow libraries to make subscription decisions based on price and reported downloads, at least without adjusting for publisher effects in download reports.
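A "publisher effect" of the kind described can be estimated as the coefficient on a publisher dummy in a regression of downloads on journal characteristics. The sketch below is a minimal illustration on synthetic data (all parameter values are invented, not taken from the study): if publisher B systematically reports inflated downloads, the dummy coefficient recovers that shift.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic journals (illustrative): log-downloads depend on log-citations
# plus a per-publisher shift. true_pub_effect = 0.5 means publisher B
# reports exp(0.5) ~ 65% more downloads than its journals would predict.
n = 1000
log_cites = rng.normal(4.0, 1.0, n)
publisher = rng.integers(0, 2, n)        # 0 = publisher A, 1 = publisher B
true_pub_effect = 0.5
log_downloads = (1.0 + 0.8 * log_cites
                 + true_pub_effect * publisher
                 + rng.normal(0, 0.2, n))

# OLS with a publisher dummy: the dummy's coefficient is the publisher
# effect, net of the journal-quality control (log-citations here).
X = np.column_stack([np.ones(n), log_cites, publisher])
beta, *_ = np.linalg.lstsq(X, log_downloads, rcond=None)
pub_effect_hat = beta[2]
```

The actual analysis controls for more covariates (articles, year, discipline), but the logic is the same: whatever the dummy picks up after those controls is unexplained by journal characteristics and attributable to the publisher's reporting.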
Much of the new “gig economy” relies on reputation systems to reduce problems of asymmetric information. There is evidence that one aspect of these reputation systems, online reviews, provides information to these markets. However, less is known about how these reviews interact and compare with other pieces of information in these markets. This paper provides a more complete picture of the reputation system in an online labor market. I compare the informational content of online reviews with other sources of information about worker ability, including the review comments, standardized exam scores, and the worker’s country. I estimate the effect of each component on wages and worker attrition. Reviews have a relatively small effect on both wages and attrition; however, I am able to separate out the dual role of reviews: rewarding good workers and punishing bad ones. Finally, I investigate why firms leave reviews at all, and find that firm reputation and re-hiring considerations incentivize firms to leave informative reviews.
The ability to estimate peer effects in network models has been advanced considerably by the IV model of Bramoullé et al. (2009). While such IV estimates work well for very sparse networks, they exhibit very weak power for networks of even modest density. We review and extend the findings of Bramoullé et al. (2009) and then propose an alternative estimator. We show that our new estimator works approximately as well as IV in very sparse networks and performs much better in networks of moderate density.
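For readers unfamiliar with the Bramoullé et al. (2009) strategy: in the linear-in-means model y = α + βGy + γx + δGx + ε (with G a row-normalized adjacency matrix), the endogenous peer outcome Gy can be instrumented with friends-of-friends characteristics G²x. The sketch below simulates a very sparse network, where the abstract says IV works well, and runs 2SLS by hand; all parameter values are illustrative, and the alternative estimator proposed in the paper is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta, gamma, delta = 1000, 0.3, 1.0, 0.5   # illustrative parameters

# Very sparse directed network: each node has 3 random peers; row-normalize
# so G is row-stochastic and (I - beta*G) is invertible for |beta| < 1.
G = np.zeros((n, n))
for i in range(n):
    peers = rng.choice(np.delete(np.arange(n), i), size=3, replace=False)
    G[i, peers] = 1.0 / 3.0

# Simulate the linear-in-means model: y = (I - beta*G)^{-1}(gamma*x + delta*Gx + eps)
x = rng.normal(size=n)
eps = rng.normal(0, 0.1, size=n)
y = np.linalg.solve(np.eye(n) - beta * G, gamma * x + delta * (G @ x) + eps)

# 2SLS: regressors [1, Gy, x, Gx]; instrument Gy with G^2 x (friends of friends).
X = np.column_stack([np.ones(n), G @ y, x, G @ x])
Z = np.column_stack([np.ones(n), G @ (G @ x), x, G @ x])
Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)        # projection onto instruments
b_2sls = np.linalg.solve(X.T @ Pz @ X, X.T @ Pz @ y)
peer_effect_hat = b_2sls[1]                   # estimate of beta
```

Intuition for the weak-power problem the abstract raises: as the network densifies, G²x becomes nearly collinear with Gx and x, so the instrument loses strength; in the sparse network simulated here it remains strong.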