Discover more content...

Discover more content...

Enter some keywords in the search box above, we will do our best to offer you relevant results.


We're sorry!

Sorry about that!

We couldn't find any results for your search. Please try again with another keywords.


一. What is Machine Learning?


  • 1956 年,开发了西洋跳棋 AI 程序的 Arthur Samuel 在标志着人工智能学科诞生的达特茅斯会议上定义了 “机器学习” 这个词,定义为,“在没有明确设置的情况下,使计算机具有学习能力的研究领域”。

  • 1997 年,Tom Mitchell 提供了一个更现代的定义:“如果用 P 来测量程序在任务 T 中性能。若一个程序通过利用经验 E 在 T 任务中获得了性能改善,则我们就说关于任务 T 和 性能测量 P ,该程序对经验 E 进行了学习。”


E = 玩很多盘跳棋游戏的经验

T = 玩跳棋的任务。

P = 程序将赢得下一场比赛的概率。

二. Classify


有监督学习 supervised learning 和无监督学习 unsupervised learning。


1. supervised learning

在监督式学习中,首先有一个数据集,并且已知正确的输出是什么,且输入和输出存在关联。 监督学习问题分为“回归 Regression”和“分类 Classification”问题。



2. unsupervised learning

无监督学习使我们能够很少或根本不知道我们的结果应该是什么样子。我们可以从数据中得出结构,我们不一定知道变量的影响。 我们可以通过基于数据中变量之间的关系对数据进行聚类来推导出这种结构。 在无监督学习的情况下,没有基于预测结果的反馈。无监督学习可以分为“聚类”和“非聚类”。


非聚类:“鸡尾酒会算法”,允许您在混乱的环境中查找结果。 (即在鸡尾酒会上识别来自声音网格的个人声音和音乐)。

三. Review

Question 1

A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. Suppose we feed a learning algorithm a lot of historical weather data, and have it learn to predict weather. In this setting, what is E?

The process of the algorithm examining a large amount of historical weather data.

T := The weather prediction task.
P := The probability of it correctly predicting a future date's weather.
E := The process of the algorithm examining a large amount of historical weather data.

Question 2

Suppose you are working on weather prediction, and you would like to predict whether or not it will be raining at 5pm tomorrow. You want to use a learning algorithm for this. Would you treat this as a classification or a regression problem?



Question 3

Suppose you are working on stock market prediction, and you would like to predict the price of a particular stock tomorrow (measured in dollars). You want to use a learning algorithm for this. Would you treat this as a classification or a regression problem?



Question 4

Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.


  1. Take a collection of 1000 essays written on the US Economy, and find a way to automatically group these essays into a small number of groups of essays that are somehow "similar" or "related".


  1. Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatements.


  1. Given genetic (DNA) data from a person, predict the odds of him/her developing diabetes over the next 10 years.


  1. Given 50 articles written by male authors, and 50 articles written by female authors, learn to predict the gender of a new manuscript's author (when the identity of this author is unknown).


  1. In farming, given data on crop yields over the last 50 years, learn to predict next year's crop yields.


  1. Examine a large collection of emails that are known to be spam email, to discover if there are sub-types of spam mail.


  1. Examine a web page, and classify whether the content on the web page should be considered "child friendly" (e.g., non-pornographic, etc.) or "adult."


  1. Examine the statistics of two football teams, and predicting which team will win tomorrow's match (given historical data of teams' wins/losses to learn from).


Question 5

Which of these is a reasonable definition of machine learning?

Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.

这是开发了西洋跳棋 AI 程序的 Arthur Samuel 给出的定义。

GitHub Repo:Halfrost-Field

Follow: halfrost · GitHub