Hadoop's own serialization format is any class that implements the **Writable** interface, which defines two methods:
(1) `write(DataOutput out)` — writes the object's fields to a binary output stream
(2) `readFields(DataInput in)` — reads the object's fields back from a binary input stream
Take a traffic-statistics project as an example:
(1) Sample data
```
13726238888	2481	24681
13560436666	1116	954
13726230503	2481	24681
13826544101	264	0
13926435656	132	1512
13926251106	240	0
18211575961	1527	2106
```
(2) Field descriptions

| Field description | Field name | Data type |
|-------------------|------------|-----------|
| Phone number | phone | String |
| Upstream traffic | upflow | Long |
| Downstream traffic | downflow | Long |
(3) Project requirement 1
For each user (phone number), compute the total upstream traffic, total downstream traffic, and total traffic.
Expected output format:
```
13480253104	2494800	2494800	4989600
```
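Before looking at the MapReduce code, the aggregation the job performs can be sketched in plain Java. This is a hypothetical, dependency-free illustration (the class name `FlowTotals` and the `aggregate` helper are not part of the project): it groups tab-separated records by phone number and sums the upstream, downstream, and total traffic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of what requirement 1 computes, without Hadoop:
// group records by phone number and sum up/down/total traffic.
public class FlowTotals {

    // Each input line: phone \t upflow \t downflow
    static Map<String, long[]> aggregate(String[] lines) {
        Map<String, long[]> totals = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.split("\t");
            long up = Long.parseLong(f[1]);
            long down = Long.parseLong(f[2]);
            long[] t = totals.computeIfAbsent(f[0], k -> new long[3]);
            t[0] += up;          // total upstream traffic
            t[1] += down;        // total downstream traffic
            t[2] += up + down;   // total traffic
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] sample = {
            "13726238888\t2481\t24681",
            "13726238888\t1116\t954",
        };
        for (Map.Entry<String, long[]> e : aggregate(sample).entrySet()) {
            long[] t = e.getValue();
            System.out.println(e.getKey() + "\t" + t[0] + "\t" + t[1] + "\t" + t[2]);
        }
        // prints: 13726238888	3597	25635	29232
    }
}
```

In the actual job, this grouping is done by the shuffle (phone number as the key) and the summing in the reducer; the sketch only shows the arithmetic.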
Below is the FlowBean class with serialization and deserialization implemented:
```java
package com.xiaowang.sum;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class FlowBean implements Writable {

    private long upFlow;
    private long downFlow;
    private long sumFlow;

    // The serialization framework calls the no-arg constructor when it
    // creates an instance during deserialization
    public FlowBean() {
        super();
    }

    // Parameterized constructor for convenient initialization
    public FlowBean(long upFlow, long downFlow) {
        super();
        this.upFlow = upFlow;
        this.downFlow = downFlow;
        this.sumFlow = upFlow + downFlow;
    }

    public long getUpFlow() {
        return upFlow;
    }

    public void setUpFlow(long upFlow) {
        this.upFlow = upFlow;
    }

    public long getDownFlow() {
        return downFlow;
    }

    public void setDownFlow(long downFlow) {
        this.downFlow = downFlow;
    }

    public long getSumFlow() {
        return sumFlow;
    }

    public void setSumFlow(long sumFlow) {
        this.sumFlow = sumFlow;
    }

    // Serialization method
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(upFlow);
        out.writeLong(downFlow);
        out.writeLong(sumFlow);
    }

    // Deserialization method
    // Note: the fields must be read back in the same order, and with the
    // same types, as they were written
    @Override
    public void readFields(DataInput in) throws IOException {
        this.upFlow = in.readLong();
        this.downFlow = in.readLong();
        this.sumFlow = in.readLong();
    }

    @Override
    public String toString() {
        return upFlow + "\t" + downFlow + "\t" + sumFlow;
    }
}
```
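The `write`/`readFields` pair can be exercised without the Hadoop framework, since `DataOutput` and `DataInput` come from `java.io`. The sketch below (class and helper names are hypothetical, not from the project) performs the same three `writeLong`/`readLong` calls as FlowBean against in-memory streams, showing that field order alone carries the data: no field names or type tags go on the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Dependency-free round trip of FlowBean's wire format: three raw longs,
// written and read in a fixed order.
public class FlowBeanRoundTrip {

    static byte[] write(long upFlow, long downFlow, long sumFlow) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeLong(upFlow);    // same order as FlowBean.write()
        out.writeLong(downFlow);
        out.writeLong(sumFlow);
        return buf.toByteArray();
    }

    static long[] readFields(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        // Fields must be read back in exactly the order they were written
        return new long[] { in.readLong(), in.readLong(), in.readLong() };
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = write(2481L, 24681L, 27162L);
        long[] fields = readFields(bytes);
        System.out.println(fields[0] + "\t" + fields[1] + "\t" + fields[2]);
        System.out.println(bytes.length);  // three longs -> 24 bytes
    }
}
```

Swapping any two `readLong()` calls would silently assign the wrong values to the wrong fields, which is why the comment in `readFields` insists on matching the write order.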