牵手丶若相惜 posted on 2019-10-8 19:56

How do I sort a MapReduce word count and output the top 5?

File:
REPORT ON THE WORK OF THE GOVERNMENT
Delivered at the Fifth Session of the 12th National People's Congress of the People's Republic of China on March 5, 2017
Li Keqiang
Premier of the State Council
Esteemed Deputies,
On behalf of the State Council, I will now report to you on the work of the government and ask for your deliberation and approval.
I also wish to have comments on my report from the members of the National Committee of the Chinese People's Political Consultative Conference (CPPCC).
Review of our work in 2016
In the past year, China's development has faced

Requirement: output the 5 words that occur most often
Delimiter: space

annybaby posted on 2019-10-8 22:01

Split, loop, count, output.
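The four steps above (split, loop, count, output) can be sketched in plain Java, without Hadoop, to show the logic on a single machine. This is only an illustration: the class name TopWords and the method topN are hypothetical, and breaking ties alphabetically is an assumption that the question does not specify.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not from the thread: counts words and returns the n most frequent.
final class TopWords {

    static List<Map.Entry<String, Integer>> topN(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();

        // Split on whitespace (the question names space as the delimiter;
        // "\\s+" also tolerates line breaks and repeated spaces).
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields one empty token
            }
            // Count: add 1 to the running total for this word.
            counts.merge(word, 1, Integer::sum);
        }

        // Sort by count, highest first; ties are broken alphabetically (an assumption).
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }
}
```

Calling TopWords.topN(text, 5) on the contents of the file returns the five most frequent words with their counts. In a MapReduce job the split step would live in the mapper and the count step in the reducer. The sort and limit would need a further step, since reducers emit keys in word order rather than count order.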

吸水雨衣 posted on 2019-10-9 21:39

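import org.apache.spark.{SparkConf, SparkContext}

// Prints the 3 largest numbers in the input file.
// Assumes every line of top.txt holds a single integer; this is a top-N example, not a word count.
object Top3 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Top3Scala").setMaster("local")
    val context = new SparkContext(conf)
    val linesRDD = context.textFile("E:\\testdata\\wordcount\\input\\top.txt")

    // Key each line by its numeric value so the RDD can be sorted by key.
    val pairs = linesRDD.map(line => (line.toInt, line))
    // Sort in descending order, then keep only the original line text.
    val sorted = pairs.sortByKey(ascending = false)
    val result = sorted.map(pair => pair._2)
    val strings = result.take(3)
    for (string <- strings) println("string = " + string)

    context.stop()
  }
}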

牵手丶若相惜 posted on 2019-10-10 15:28

吸水雨衣 posted on 2019-10-9 21:39
object Top3 {
def main(args: Array): Unit = {
    val conf = new ...

Is there a Java version of this code?
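As a hedged sketch of what a Java version of the top-N step might look like: the snippet below keeps only the n largest counts in a bounded min-heap, which avoids sorting every word. It is plain Java with no Spark or Hadoop dependency; the class name TopNHeap and its method are hypothetical, and it assumes the word counts have already been computed into a Map. In a Hadoop job, one common approach is to run this kind of logic in a single reducer, but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not from the thread: selects the n entries with the largest counts.
final class TopNHeap {

    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        // Min-heap ordered by count: the smallest of the current candidates is at the head.
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>(Map.Entry.<String, Integer>comparingByValue());

        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so at most n entries are kept
            }
        }

        // Copy the survivors out and order them from highest count to lowest.
        List<Map.Entry<String, Integer>> result = new ArrayList<>(heap);
        result.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return result;
    }
}
```

With n = 5 this matches the requirement in the question. The order among words with equal counts is not defined in this sketch.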