PHP: removing duplicate Chinese and English characters from a string
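Personally, I think it comes down to two optimizations:
1. Chunking: once the string is split into groups, looking up a character no longer has to start positioning from the beginning of the whole string every time.
2. Fast lookup: the "have I seen this character" check can be turned into an integer check.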
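To make point 2 concrete, here is a minimal sketch of turning a character into an integer that can serve as an array key. The helper name `char_key` is hypothetical (it does not appear in the original post), and the sketch assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): read the bytes of one
// UTF-8 character as a hexadecimal number and return it as an integer.
function char_key($char)
{
    return (int) hexdec(bin2hex($char));
}

$seen = array();
foreach (array('a', '中', 'a') as $char) {
    $key = char_key($char);
    if (isset($seen[$key])) {
        // The second 'a' maps to the same integer, so it is detected here.
        echo "duplicate: {$char}\n";
        continue;
    }
    $seen[$key] = 1;
}
```

With UTF-8 input, `char_key('a')` is 97 and `char_key('中')` is 14989485 (the bytes E4 B8 AD read as one number), so `isset()` on the integer key is enough to detect a repeated character.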
// Test data: a short sample repeated to fill 1 MB (str_pad() counts bytes)
$str = "中國人abc蓅氓澮娬朮國人ac娬朮";
$str = str_pad('', 1024*1024, $str);
mb_internal_encoding("UTF-8");
$time = time();
$len = mb_strlen($str);
// 500 characters per group
$group_size = 500;
$group_total = ceil($len / $group_size);
$chars = array();   // integer keys of the characters seen so far
$result = '';
for($i = 0; $i < $group_total; $i++) {
    // Take the next group from the front of the string
    $tmp = mb_substr($str, 0, $group_size);
    // Once a group has been taken, drop it, so the next group starts at offset 0 again
    $str = mb_substr($str, $group_size);
    if($i % 50 == 49) {
        printf("process %d groups, total %d groups, run time %d\n", $i + 1, $group_total, time()-$time);
    }
    // Process the characters of the current group
    $tmp_len = mb_strlen($tmp);
    for($j = 0; $j < $tmp_len; $j++) {
        // Read from the current group, not from the remaining string
        $char = mb_substr($tmp, $j, 1);
        // Turn the character's bytes into an integer and use it as the array key
        $num = hexdec(bin2hex($char));
        if(isset($chars[$num])) {
            continue;
        }
        $chars[$num] = 1;
        $result .= $char;
    }
}
var_dump($result);
Output (as recorded by the author; run time in seconds)
process 50 groups, total 870 groups, run time 4
process 100 groups, total 870 groups, run time 9
process 150 groups, total 870 groups, run time 13
process 200 groups, total 870 groups, run time 18
process 250 groups, total 870 groups, run time 22
process 300 groups, total 870 groups, run time 27
process 350 groups, total 870 groups, run time 31
process 400 groups, total 870 groups, run time 35
process 450 groups, total 870 groups, run time 40
process 500 groups, total 870 groups, run time 44
process 550 groups, total 870 groups, run time 48
process 600 groups, total 870 groups, run time 52
process 650 groups, total 870 groups, run time 56
process 700 groups, total 870 groups, run time 60
process 750 groups, total 870 groups, run time 65
process 800 groups, total 870 groups, run time 69
process 850 groups, total 870 groups, run time 73
string(27) "氓娬蓅cab人國朮中澮"
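As a point of comparison, the same de-duplication can be sketched as a single pass that splits the string once and never calls mb_substr() inside the loop. This is a different technique from the chunked approach above, not the author's method: it uses `preg_split()` with the `u` modifier and uses the character itself as the array key. The function name `unique_chars` is hypothetical, and the sketch assumes the input is valid UTF-8. Note that `str_pad()` pads by bytes, so a padded test string may end in an incomplete character and would need trimming first.

```php
<?php
// Hypothetical helper (not from the original post): keep the first
// occurrence of every character in a UTF-8 string.
function unique_chars($str)
{
    // With the "u" modifier an empty pattern matches between characters,
    // so this yields an array of single characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() returns false when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $seen = array();
    $result = '';
    foreach ($chars as $char) {
        if (isset($seen[$char])) {
            continue;
        }
        $seen[$char] = true;
        $result .= $char;
    }
    return $result;
}

echo unique_chars("中國人abc蓅氓澮娬朮國人ac娬朮"), "\n";
// prints: 中國人abc蓅氓澮娬朮
```

Here the characters come out in order of first appearance. Whether this is faster than the chunked loop on a given PHP build would have to be measured; the sketch only shows the shape of the alternative.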
