Serialization using ArrayWritable seems to work in a funny way

Problem description

I was working with ArrayWritable and at some point needed to check how Hadoop serializes it. This is what I got after setting job.setNumReduceTasks(0):

0  IntArrayWritable@10f11b8
3  IntArrayWritable@544ec1
6  IntArrayWritable@fe748f
8  IntArrayWritable@1968e23
11  IntArrayWritable@14da8f4
14  IntArrayWritable@18f6235

This is the test mapper I was using:

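import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class MyMapper extends Mapper<LongWritable, Text, LongWritable, IntArrayWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each input line holds a single integer.
        int red = Integer.parseInt(value.toString());
        IntWritable[] a = new IntWritable[100];

        // Fill the array with the 100 consecutive values starting at red.
        for (int i = 0; i < a.length; i++) {
            a[i] = new IntWritable(red + i);
        }

        // Wrap the array and emit it under the original input key.
        IntArrayWritable aw = new IntArrayWritable();
        aw.set(a);
        context.write(key, aw);
    }
}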

IntArrayWritable is taken from the example given in the ArrayWritable javadoc:

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;

public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }
}

I checked the Hadoop source code and this makes no sense to me. ArrayWritable should not serialize the class name, and there is no way an array of 100 IntWritable values can be serialized in 6 or 7 hexadecimal digits. The application actually seems to work just fine, and the reducer deserializes the right values. What is happening? What am I missing?

Recommended answer

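The output you are getting from your MapReduce job is not the serialized version of that data. It is the data translated into a printable string. The IntArrayWritable@10f11b8 values have the shape of Java's default Object.toString() result (class name, "@", hash code in hexadecimal), which is what you get from a class that does not override toString().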
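To make the difference between the two representations concrete, here is a minimal sketch that needs no Hadoop dependency. IntArrayStandIn is a hypothetical stand-in, not the Hadoop class; its write method assumes the layout ArrayWritable uses for IntWritable elements (a 4-byte element count followed by one 4-byte int per element). The class deliberately does not override toString().

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SerializationVsToString {

    // Hypothetical stand-in for IntArrayWritable (NOT the Hadoop class).
    // It has no toString() override, so Object.toString() is used.
    static class IntArrayStandIn {
        private final int[] values;

        IntArrayStandIn(int[] values) {
            this.values = values.clone();
        }

        // Assumed layout: element count first, then each element as a 4-byte int.
        void write(DataOutputStream out) throws IOException {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }
    }

    // Serializes the stand-in into an in-memory byte array.
    static byte[] serialize(IntArrayStandIn array) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            array.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        int[] values = new int[100];
        for (int i = 0; i < values.length; i++) {
            values[i] = i;
        }
        IntArrayStandIn array = new IntArrayStandIn(values);

        // Default toString(): class name, '@', hash code in hex (the hash varies per run).
        System.out.println(array);

        // Serialized size: 4 bytes for the count + 100 * 4 bytes for the elements.
        System.out.println(serialize(array).length); // prints 404
    }
}
```

The first line printed has the same shape as the values in the job output, while the serialized form is 404 bytes long, so the short hexadecimal suffix cannot be the serialized array.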

When you set the number of reducers to zero, the map output is passed straight through the job's output format, which formats your data, most likely by converting it to a readable string. It is not dumped in the serialized form that a reducer would pick up.
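As a rough illustration of that formatting step, the sketch below assumes a text-style output format that writes each record as the key's toString(), a tab, and the value's toString(). ReadableIntArray and formatRecord are hypothetical names used only for this example and are not part of Hadoop; in a real job the equivalent change would presumably be a toString() override on IntArrayWritable.

```java
import java.util.Arrays;

public class TextOutputSketch {

    // Hypothetical value type that overrides toString() to print its contents.
    static class ReadableIntArray {
        private final int[] values;

        ReadableIntArray(int[] values) {
            this.values = values.clone();
        }

        @Override
        public String toString() {
            return Arrays.toString(values);
        }
    }

    // Assumed behavior of a text-style record writer: key, tab, value,
    // each converted with toString().
    static String formatRecord(Object key, Object value) {
        return key + "\t" + value;
    }

    public static void main(String[] args) {
        ReadableIntArray value = new ReadableIntArray(new int[] {5, 6, 7});
        // Prints the key 0, a tab, then [5, 6, 7]
        System.out.println(formatRecord(0L, value));
    }
}
```

With a toString() override in place, the map-only output shows the array contents instead of a class name and hash code. This only changes what is printed; the serialized form that a reducer reads is unaffected.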
