Kristian Hammond is chief scientist and co-founder of Narrative Science, where he leads the company's scientific research. Kris is also a professor of computer science at Northwestern University.

We tend to think of machines, and intelligent machines in particular, as cold, impartial calculators with no opinions or biases of their own. We believe a self-driving car will show no favoritism toward the driver when choosing between saving the driver and saving a pedestrian. We trust that a credit-scoring system will set subjective considerations aside and weigh only genuinely relevant data, such as income and FICO score. And we assume that learning systems will always converge on objective, fact-based conclusions (unlike humans, with our preconceptions), because unbiased algorithms drive them.

For some, this is a bug: machines should not be empathetic outside of their rigid point of view. For others, the absence of human sentiment is a feature: machines should not be swayed by human bias. And between these two camps sits a third view: such systems may hold perspectives of their own, but they should still make objective judgments.

Of course, nothing could be further from the truth. The reality is that not only are genuinely unbiased intelligent systems vanishingly rare, but the sources of bias are many: the data we use to train systems, bias introduced through interaction, emergent bias, similarity bias and the bias of conflicting goals. Most of these sources go unnoticed. As we build and deploy intelligent systems, we need to understand them so we can design with awareness and, ideally, avoid problems that are far from obvious.

Data-driven bias

For any system that learns, the output is determined by the data it receives. This is not a new insight, but it is easily forgotten when we look at systems driven by millions of examples. The assumption is that sheer volume will overwhelm any human bias. But if the training data is itself skewed, the results will be skewed in kind.
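To make that concrete, here is a minimal, purely illustrative sketch, with synthetic data and invented groups and decision rules, of how a classifier trained on a skewed sample can serve the over-represented group well while failing the under-represented one:

```python
# Hypothetical sketch of data-driven bias: a classifier trained on a skewed
# sample performs well on the dominant group and poorly on the rare one.
# The data, groups and labeling rules are all invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, rule):
    X = rng.normal(size=(n, 2))
    y = (X[:, rule] > 0).astype(int)   # each group follows a different rule
    return X, y

# Training set: group A dominates (95%), group B is barely represented (5%).
Xa, ya = make_group(9500, rule=0)
Xb, yb = make_group(500, rule=1)
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Balanced test sets: the sheer volume of examples did not wash the skew out.
for name, rule in [("group A", 0), ("group B", 1)]:
    Xt, yt = make_group(2000, rule)
    print(name, "accuracy:", (model.predict(Xt) == yt).mean())
# Typically high accuracy for group A and close to chance for group B.
```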

This kind of bias has recently surfaced in image-recognition systems built with deep learning. Nikon's confusion over Asian faces and the skin-tone problems in HP's facial-recognition software both appear to be products of learning from unrepresentative sets of examples. Both were fixable and certainly unintentional, but they show what can go wrong when bias in the data goes unnoticed.

Beyond facial recognition, there are other troubling instances with real-world implications. Learning systems used to build the rule sets that predict recidivism rates for parolees, crime patterns or the suitability of potential employees all carry potentially negative repercussions. When they are trained on skewed data, or on data that is balanced but fed into decision-making that is itself biased, they will perpetuate that bias as well.

Bias through interaction

While some systems learn by looking at a set of examples in bulk, other sorts of systems learn through interaction. Bias arises based on the biases of the users driving the interaction. A clear example of this bias is Microsoft’s Tay, a Twitter-based chatbot designed to learn from its interactions with users. Unfortunately, Tay was influenced by a user community that taught Tay to be racist and misogynistic. In essence, the community repeatedly tweeted offensive statements at Tay and the system used those statements as grist for later responses.
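A toy sketch of that dynamic, not how Tay actually worked, is a bot that simply stores what users say and reuses it for later replies; whatever the community feeds it is what it gives back:

```python
# Purely illustrative interaction-driven learner: user messages become the
# bot's training data, so its replies reflect whoever talks to it the most.
import random

class EchoLearningBot:
    def __init__(self):
        self.memory = ["hello there"]      # seed phrase

    def listen(self, message: str) -> None:
        self.memory.append(message)        # every user message is absorbed

    def reply(self) -> str:
        return random.choice(self.memory)  # later responses are drawn from that memory

bot = EchoLearningBot()
for msg in ["cats are great", "cats are great", "something hateful"]:
    bot.listen(msg)                        # a coordinated community can flood the memory
print(bot.reply())                         # replies now mirror whatever users taught it
```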

Tay lasted barely 24 hours online before it became an aggressive racist and Microsoft took it down. While Tay's hateful rants never spread beyond the Twittersphere, they hint at the real-world attitudes that can feed such systems. As we build intelligent systems to work alongside people, learning with them and making decisions with them, this kind of bad training could grow into something far more serious.

What if we had dedicated tutors, rather than the crowd at large, guide the training of intelligent systems? Consider how reluctant we already are to trust machines to decide who gets a loan, or even who qualifies for parole, in part because whoever trains or programs the machine may quietly build their own views into it. The lesson of Tay is that systems which learn absorb bias from their environments and their users; they come to reflect the opinions of the people who train them, for better or for worse.

Emergent bias

Sometimes the decisions made by systems built for personalization end up creating bias "bubbles" around us. We need look no further than the current state of Facebook to see this bias at work. At the top level of the news feed, Facebook users see the posts of their friends and can share information with them.

Unfortunately, any algorithm that analyzes the data in that feed and then decides what else to present will serve up content that matches the set of ideas the user has already seen. The effect is amplified every time the user opens, likes or shares content. The result is a stream of information skewed toward the user's existing beliefs.
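A hypothetical sketch of that loop, with invented item vectors and an invented update rule rather than any real platform's algorithm, shows how quickly it narrows what a user ever sees:

```python
# Toy engagement loop: rank items by similarity to what the user has already
# engaged with, then pull the user profile toward whatever gets clicked.
# Everything here is made up for illustration.
import numpy as np

rng = np.random.default_rng(1)
items = rng.normal(size=(500, 8))                     # catalog of items as topic vectors
items /= np.linalg.norm(items, axis=1, keepdims=True)
profile = items[0].copy()                             # the user's starting interests

shown_ids = set()
for step in range(50):
    top = np.argsort(items @ profile)[-10:]           # feed = 10 items most similar to the profile
    shown_ids.update(top.tolist())
    clicked = items[rng.choice(top)]                  # the user engages with one of them
    profile = 0.9 * profile + 0.1 * clicked           # the profile drifts toward that click
    profile /= np.linalg.norm(profile)

print(f"distinct items ever shown: {len(shown_ids)} of {len(items)}")
# Only a small cluster of mutually similar items is ever surfaced; the rest of the
# catalog, including everything that contradicts the starting profile, never appears.
```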

While it is certainly personalized, and often reassuring, it is no longer what we would tend to think of as news. It is a bubble of information that is an algorithmic version of “confirmation bias.” Users don’t have to shield themselves from information that conflicts with their beliefs because the system is automatically doing it for them.

The impact of these information biases on the world of news is troubling. But as we look to social media models as a way to support decision making in the enterprise, systems that support the emergence of information bubbles have the potential to skew our thinking. A knowledge worker who is only getting information from the people who think like him or her will never see contrasting points of view and will tend to ignore and deny alternatives.

Similarity bias

Sometimes bias is simply the product of systems doing what they were designed to do. Google News, for example, is designed to provide stories that match user queries with a set of related stories. This is explicitly what it was designed to do and it does it well. Of course, the result is a set of similar stories that tend to confirm and corroborate each other. That is, they define a bubble of information that is similar to the personalization bubble associated with Facebook.

There are certainly issues related to the role of news and its dissemination highlighted by this model — the most apparent one being a balanced approach to information. The lack of “editorial control” scopes across a wide range of situations. While similarity is a powerful metric in the world of information, it is by no means the only one. Different points of view provide powerful support for decision making. Information systems that only provide results “similar to” either queries or existing documents create a bubble of their own.
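A small, hypothetical sketch, with toy "articles" and a plain TF-IDF ranking, shows what "similar to the query" retrieval does to a contrasting story:

```python
# Minimal similarity-only retrieval: documents ranked purely by cosine
# similarity to the query. The articles and query are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "new policy praised as boost for local economy",
    "economists applaud policy, cite gains for economy",
    "industry groups hail policy as win for the economy",
    "critics warn policy may widen inequality and hurt workers",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(articles)
query_vector = vectorizer.transform(["policy boost economy"])

scores = cosine_similarity(query_vector, doc_vectors).ravel()
for score, text in sorted(zip(scores, articles), reverse=True):
    print(f"{score:.2f}  {text}")
# The three mutually corroborating stories rank first; the contrasting view ranks
# last, not because it is wrong but because it shares fewer words with the query.
```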

While similarity is a notion we tend to welcome, since it implies agreement, it is opposing and even conflicting points of view that support innovation and creativity, particularly in the enterprise.

Bias of conflicting goals

Sometimes systems designed for very specific business purposes end up with biases that are real but entirely unforeseen.

Imagine, for example, a system designed to serve up job descriptions to potential candidates. The system generates revenue whenever a user clicks on a description, so the natural goal of the algorithm is to serve the descriptions that draw the most clicks.

As it turns out, people tend to click on jobs that fit their self-view, and that view can be reinforced in the direction of a stereotype by simply presenting it. For example, women presented with jobs labeled as “Nursing” rather than “Medical Technician” will tend toward the first. Not because the jobs are best for them but because they are reminded of the stereotype, and then align themselves with it.

The impact of stereotype threat on behavior is such that the presentation of jobs that fit an individual’s knowledge of a stereotype associated with them (e.g. gender, race, ethnicity) leads to greater clicks. As a result, any site that has a learning component based on click-through behavior will tend to drift in the direction of presenting opportunities that reinforce stereotypes.
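As a purely illustrative sketch of that drift, assume two labels for the same role and a small, invented gap in click rates standing in for the stereotype effect; a simple click-optimizing learner then turns that gap into a large skew in what gets shown:

```python
# Hypothetical click-optimizing ranker: a plain epsilon-greedy learner choosing
# between two labels for the same job. The click rates and all numbers are
# invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
labels = ["Nursing", "Medical Technician"]
true_click_rate = {"Nursing": 0.12, "Medical Technician": 0.08}  # invented, stereotype-driven gap

shows = {l: 0 for l in labels}
clicks = {l: 0 for l in labels}

for impression in range(20000):
    if impression < 1000:                      # warm-up: alternate both labels evenly
        label = labels[impression % 2]
    elif rng.random() < 0.05:                  # keep exploring a little
        label = labels[rng.integers(2)]
    else:                                      # otherwise show the best observed click rate
        label = max(labels, key=lambda l: clicks[l] / shows[l])
    shows[label] += 1
    clicks[label] += rng.random() < true_click_rate[label]

for l in labels:
    print(f"{l}: shown {shows[l]:5d} times, observed CTR {clicks[l] / shows[l]:.3f}")
# The stereotyped label typically ends up with the vast majority of impressions:
# a modest difference in clicks becomes a large skew in which jobs people see.
```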

Machine bias is human bias

In an ideal world, intelligent systems and their algorithms would be objective. Unfortunately, these systems are built by us and, as a result, end up reflecting our biases. By understanding the bias themselves and the source of the problems, we can actively design systems to avoid them.

Perhaps we will never be able to create systems and tools that are perfectly objective, but at least they will be less biased than we are. Then perhaps elections wouldn’t blindside us, currencies wouldn’t crash and we could find ourselves communicating with people outside of our personalized news bubbles.

Featured Image: Bryce Durbin/TechCrunch
