如何用php实现k近邻算法
K近邻算法是一种简单且常用的机器学习算法,广泛应用于分类和回归问题。它的基本原理是通过计算待分类样本与已知样本之间的距离,将待分类样本归为距离最近的K个已知样本所属的类别。在本文中,我们将介绍如何用php实现k近邻算法,并提供代码示例。
- 数据准备
首先,我们需要准备已知样本数据和待分类样本数据。已知样本数据包含类别和特征值,待分类样本数据只有特征值。为了简化示例,我们假设已知样本数据和待分类样本数据均以数组的形式表示。以下是一个示例数据:
已知样本数据:
$knownSamples = array(
array('class' => 'A', 'features' => array(2, 3)),
array('class' => 'A', 'features' => array(4, 5)),
array('class' => 'B', 'features' => array(1, 1)),
array('class' => 'B', 'features' => array(3, 2)),);
待分类样本数据:
$unknownSample = array('features' => array(2, 2));
立即学习“PHP免费学习笔记(深入)”;
- 计算距离
接下来,我们需要编写一个函数,用于计算待分类样本与已知样本之间的距离。常用的距离度量方法有欧氏距离、曼哈顿距离等。以下是一个计算欧氏距离的示例:
function euclideanDistance($sample1, $sample2) {
$sum = 0;
for ($i = 0; $i < count($sample1); $i++) {
$sum += pow($sample1[$i] - $sample2[$i], 2);
}
return sqrt($sum);}
- 寻找K个最近邻居
在这一步,我们需要编写一个函数,用于寻找距离待分类样本最近的K个已知样本。以下是一个示例函数:
function findNeighbors($knownSamples, $unknownSample, $k) {
$distances = array();
foreach ($knownSamples as $knownSample) {
$distance = euclideanDistance($knownSample['features'], $unknownSample['features']);
$distances[] = array('class' => $knownSample['class'], 'distance' => $distance);
}
usort($distances, function ($a, $b) {
return $a['distance'] - $b['distance'];
});
return array_slice($distances, 0, $k);}
- 进行分类
最后,我们需要编写一个函数,根据K个最近邻居的类别进行分类。以下是一个示例函数:
function classify($neighbors) {
$classes = array();
foreach ($neighbors as $neighbor) {
$classes[] = $neighbor['class'];
}
$classCounts = array_count_values($classes);
arsort($classCounts);
return key($classCounts);}
- 完整示例
以下是一个完整的示例代码:
function euclideanDistance($sample1, $sample2) {
$sum = 0;
for ($i = 0; $i < count($sample1); $i++) {
$sum += pow($sample1[$i] - $sample2[$i], 2);
}
return sqrt($sum);
}
function findNeighbors($knownSamples, $unknownSample, $k) {
$distances = array();
foreach ($knownSamples as $knownSample) {
$distance = euclideanDistance($knownSample['features'], $unknownSample['features']);
$distances[] = array('class' => $knownSample['class'], 'distance' => $distance);
}
usort($distances, function ($a, $b) {
return $a['distance'] - $b['distance'];
});
return array_slice($distances, 0, $k);
}
function classify($neighbors) {
$classes = array();
foreach ($neighbors as $neighbor) {
$classes[] = $neighbor['class'];
}
$classCounts = array_count_values($classes);
arsort($classCounts);
return key($classCounts);
}
$knownSamples = array(
array('class' => 'A', 'features' => array(2, 3)),
array('class' => 'A', 'features' => array(4, 5)),
array('class' => 'B', 'features' => array(1, 1)),
array('class' => 'B', 'features' => array(3, 2)),
);
$unknownSample = array('features' => array(2, 2));
$neighbors = findNeighbors($knownSamples, $unknownSample, 3);
$class = classify($neighbors);
echo "待分类样本的类别为:" . $class;
以上代码将输出待分类样本的类别。
总结:
本文介绍了如何用php实现k近邻算法。通过计算待分类样本与已知样本之间的距离,找到K个最近邻居,然后根据这些最近邻居的类别进行分类。K近邻算法是一种简单且常用的算法,适用于很多分类和回归问题。使用PHP实现K近邻算法相对简单,只需编写几个函数即可完成。希望本文能帮助读者理解和应用K近邻算法。











