Training a robotic limb is a complex process of co-adaptation that requires the limb to learn how to cooperate with the human brain controlling most of the body. That process can involve much initial clumsiness: not unlike the first time people strap skis onto their feet and try to move around on a snow-packed surface.
IBM’s new open-source AI Fairness 360 toolkit claims both to check for and to mitigate bias in AI models, allowing an AI algorithm to explain its own decision-making. This collection of metrics may allow researchers and enterprise AI architects to cast the revealing light of transparency into ‘black box’ AI algorithms.
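To make the idea of "checking for bias" concrete, here is a minimal, self-contained sketch of two group-fairness metrics of the kind AI Fairness 360 reports for a binary classifier. This is not the toolkit's API; the function names, the toy predictions, and the groups are all invented for illustration.

```python
# Illustrative sketch (not the AIF360 API): two group-fairness metrics
# commonly used to audit a binary classifier's favorable-outcome rates.

def selection_rate(labels):
    """Fraction of favorable (positive) predictions in a group."""
    return sum(labels) / len(labels)

def statistical_parity_difference(privileged, unprivileged):
    """P(favorable | unprivileged) - P(favorable | privileged); 0 means parity."""
    return selection_rate(unprivileged) - selection_rate(privileged)

def disparate_impact(privileged, unprivileged):
    """Ratio of selection rates; values below ~0.8 are often flagged as potential bias."""
    return selection_rate(unprivileged) / selection_rate(privileged)

# Hypothetical predictions (1 = favorable outcome) for two groups.
priv = [1, 1, 1, 0, 1, 0, 1, 1]    # selection rate 0.75
unpriv = [1, 0, 0, 1, 0, 0, 1, 0]  # selection rate 0.375

print(statistical_parity_difference(priv, unpriv))  # -0.375
print(disparate_impact(priv, unpriv))               # 0.5
```

A toolkit like AI Fairness 360 wraps metrics of this shape (and many others) behind dataset and metric classes, plus mitigation algorithms that adjust the data, the model, or its outputs.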
Technology helps people augment their abilities. And, from the Gutenberg Bible to robotics, tech has always had ethical implications. But while many technologies have narrow implications, data touches everything. It is us. Data is where the rubber of humanity meets the road of technology – and we’re ill-prepared for the impact. From data breaches to campaign influence, fraud, and equality, data ethics are at the forefront of today’s headlines. In this day-long Strata Data event, Altimeter analyst Susan Etlinger and Strata chair Alistair Croll bring together a packed lineup of academics, practitioners, and innovators for a deep dive into the thorny issues of data and algorithms, and into how establishing and reinforcing ethical technology norms can not only mitigate risk but drive innovation.
Humanity is reaching an inflection point in terms of its technological development. At the same time, we are reevaluating our place on Earth and rethinking how to build a fairer society. Can artificial intelligence (AI, machine learning, statistical learning, or whatever you want to call it) serve to tackle societal and environmental challenges? Yes, absolutely. In fact, the same algorithms used to recommend products on an e-commerce website, or to choose the ads shown to you, can be applied to solve real human problems. All data scientists, from aspiring ones to researchers, have the opportunity (and even the responsibility) to take advantage of the current data revolution to improve our world. But let’s be clear: this is a complex endeavour, one that requires cross-disciplinary and multi-sector collaborations involving governments, NGOs, private organizations, and academia. Otherwise, it will be too easy to fall for the AI hype instead of understanding AI as a tool to augment our abilities. Only solutions backed by research and a deep understanding of the respective problem domain will be robust (remember the scientific method) and effective in the long term.
We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people’s lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators—such as first names and pronouns—in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are ‘scrubbed,’ and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
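The paper's core measurement, the gap in true positive rate (TPR) between genders for a given occupation, can be sketched in a few lines. The function names and the toy data below are hypothetical; the study computes this quantity per occupation over a large corpus of real online biographies.

```python
# Minimal sketch: TPR gap between two gender groups for one occupation's
# classifier. All names and data here are invented for illustration.

def true_positive_rate(y_true, y_pred):
    """TPR = correctly predicted positives / actual positives."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p for _, p in positives) / len(positives)

def tpr_gender_gap(y_true, y_pred, genders, group_a="F", group_b="M"):
    """TPR(group_a) - TPR(group_b); a negative value means group_a is
    recognized less often in its true occupation than group_b."""
    def subset(g):
        idx = [i for i, gen in enumerate(genders) if gen == g]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]
    ta, pa = subset(group_a)
    tb, pb = subset(group_b)
    return true_positive_rate(ta, pa) - true_positive_rate(tb, pb)

# Toy example: four actual surgeons per gender; the classifier
# recovers 2 of 4 women but 4 of 4 men.
y_true  = [1, 1, 1, 1, 1, 1, 1, 1]
y_pred  = [1, 1, 0, 0, 1, 1, 1, 1]
genders = ["F", "F", "F", "F", "M", "M", "M", "M"]
print(tpr_gender_gap(y_true, y_pred, genders))  # -0.5
```

The paper's finding is that gaps like this one correlate, across occupations, with the existing gender imbalance in each occupation, which is how a classifier can compound the imbalance it was trained on.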
While harms of allocation have been increasingly studied as part of the subfield of algorithmic fairness, harms of representation have received considerably less attention. In this paper, we formalize two notions of stereotyping and show how they manifest in later allocative harms within the machine learning pipeline. We also propose mitigation strategies and demonstrate their effectiveness on synthetic datasets.
Paper: Is Privacy Controllable?
One of the major views of privacy associates privacy with control over information. This gives rise to the question of how controllable privacy actually is. In this paper, we adapt certain formal methods of control theory and investigate the implications of a control-theoretic analysis of privacy. We look at how control and feedback mechanisms have been studied in the privacy literature. Relying on the control-theoretic framework, we develop a simple conceptual control model of privacy, formulate privacy controllability issues, and suggest directions for possible research.
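For readers unfamiliar with the control-theoretic framing the abstract borrows, here is the basic primitive it adapts: a discrete feedback loop that drives an observed quantity toward a target. The "disclosure level," the target, and the gain below are all invented for illustration; the paper's model is conceptual, not numeric.

```python
# Illustrative discrete feedback loop (proportional control), the basic
# control-theory primitive. The quantities are hypothetical: imagine a
# "disclosure level" a user wants nudged toward a target setting.

def feedback_step(observed, target, gain=0.5):
    """One control step: adjust by a fraction (gain) of the current error."""
    error = target - observed
    return observed + gain * error

disclosure, target = 1.0, 0.2  # e.g., more information exposed than desired
for _ in range(6):
    disclosure = feedback_step(disclosure, target)
print(disclosure)  # approaches the 0.2 target
```

Controllability, in this vocabulary, asks whether such a loop can actually steer the system to any desired state at all — the question the paper poses about privacy.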
As machine learning increasingly affects people and society, it is important that we strive for a comprehensive and unified understanding of how and why unwanted consequences arise. For instance, downstream harms to particular groups are often blamed on ‘biased data,’ but this concept encompasses too many issues to be useful in developing solutions. In this paper, we provide a framework that partitions sources of downstream harm in machine learning into five distinct categories spanning the data generation and machine learning pipeline. We describe how these issues arise, how they are relevant to particular applications, and how they motivate different solutions. In doing so, we aim to facilitate the development of solutions that stem from an understanding of application-specific populations and data generation processes, rather than relying on general claims about what may or may not be ‘fair.’
Humans are increasingly coming into contact with artificial intelligence and machine learning systems. Human-centered artificial intelligence is a perspective on AI and ML holding that algorithms must be designed with awareness that they are part of a larger system consisting of humans. We argue that human-centered artificial intelligence can be broken down into two aspects: (1) AI systems that understand humans from a sociocultural perspective, and (2) AI systems that help humans understand them. We further argue that issues of social responsibility, such as fairness, accountability, interpretability, and transparency, are central to both aspects.