【笔记】 Hadoop学习笔记MapReduce数学建模流量统计
数据原表:##数据源来自全国数学建模竞赛专科组E 赛题
需求:统计各个水表用量
需求分析:Map:k:水表名,v:用量,Reduce:对v求和。
编写代码:
public class WcMapper extends Mapper<LongWritable,Text,Text, DoubleWritable> {
privateText k=new Text();
private DoubleWritable v =new DoubleWritable();
@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
// String lin=value.toString();// 指定编码格式,不然会乱码
String lin = new String(value.getBytes(),0,value.getLength(),"GBK");
String[] words= lin.split(",");
String name=words[0];
Double flow= Double.parseDouble(wordslength-1]);
k.set(name);
v.set(flow);
context.write(k,v);
}
}
这里直接copy Wordcount的reduce
public class WcReducer extends Reducer<Text, DoubleWritable,Text,DoubleWritable> {
double sum ;
DoubleWritable v=new DoubleWritable();
@Override
protected void reduce(Text key, Iterable<DoubleWritable> values, Context context) throws IOException, InterruptedException {
sum =0;
for (DoubleWritable count:values)
{
sum +=count.get();
}
v.set(sum);
context.write(key,v);
}
}
public class WcDriver {
public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
// 1 获取配置信息以及封装任务
Configuration configuration = new Configuration();
Job job = Job.getInstance(configuration);
// 2 设置jar加载路径
job.setJarByClass(WcDriver.class);
// 3 设置map和reduce类
job.setMapperClass(WcMapper.class);
job.setReducerClass(WcReducer.class);
// 4 设置map输出
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(DoubleWritable.class);
// 5 设置最终输出kv类型
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(DoubleWritable.class);
// 6 设置输入和输出路径
FileInputFormat.setInputPaths(job,new Path(args[0]));
FileOutputFormat.setOutputPath(job,new Path(args[1]));
// 7 提交
boolean result = job.waitForCompletion(true);
System.exit(result ? 0 : 1);
}
}
打包上传Hadoop集群运行
查看统计结果:
成功{:301_987:}
这个代码是对wordcound的修改实现,第一次尝试自己编写MapReduce模型。
页:
[1]