Flume 1.4 Study Notes: Problems Encountered - Alibaba Cloud Developer Community

2017-11-22 1656

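flume-1.4.0 was installed on Red Hat 9 and the environment variables were configured as follows:
vi /etc/profile
FLUME_HOME=/usr/flume-1.4.0
PATH=$PATH:$FLUME_HOME/bin
export FLUME_HOME PATH
Change into the flume-1.4.0 directory and run:
bin/flume-ng agent -n a1 -c conf -f conf/agent1.conf -Dflume.root.logger=INFO,console
bin/flume-ng then fails with several errors. The error messages and their fixes are listed below: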

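1. Line 82
/usr/flume-1.4.0/bin/flume-ng: line 82: conditional binary operator expected
/usr/flume-1.4.0/bin/flume-ng: line 82: syntax error near `=~'
/usr/flume-1.4.0/bin/flume-ng: line 82: `if [[ $line =~ ^java.library.path=(.)$ ]]; then'
Fix:
Change: if [[ $line =~ ^java.library.path=(.)$ ]]; then to: if [[$line =~ "^java.library.path=(.)$" ]]; then
Note: there must be no space between [[ and $line, and ^java.library.path=(.)$ must be enclosed in "".
Caveat: "conditional binary operator expected" means this bash does not recognize =~, which was only added in bash 3.0; the bash shipped with Red Hat 9 is presumably older. The change makes the script parse, but it does not make the regex test work, so running flume-ng under bash 3.0 or later is the cleaner fix. The pattern in the shipped script probably reads ^java\.library\.path=(.*)$; the \ and * characters seem to have been dropped from the text shown here.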

2. Line 102
bin/flume-ng: line 102: syntax error near unexpected token `('
bin/flume-ng: line 102: `if [[$PIECE =~ slf4j-(api|log4j12)..jar ]]; then'
Fix:
Change: if [[$PIECE =~ slf4j-(api|log4j12)..jar ]]; then to: if [[$PIECE =~ "slf4j-(api|log4j12)..jar" ]]; then
3. Line 131: same as line 82.
4. Line 151
bin/flume-ng: line 151: syntax error near unexpected token `('
bin/flume-ng: line 151: `if [[$PIECE =~ slf4j-(api|log4j12)..jar ]]; then'
Fix:
Change: if [[$PIECE =~ slf4j-(api|log4j12)..jar ]]; then to: if [[$PIECE =~ "slf4j-(api|log4j12)..jar" ]]; then
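
If a newer bash is not available, tests like the ones on lines 82, 102, 131 and 151 can be expressed with case patterns, which very old bash versions also understand. The following is only a sketch, not code from the Flume script: the function names library_path_value and is_slf4j_jar are made up for illustration, and the patterns assume the script wants to match lines of the form java.library.path=<value> and jar names containing slf4j-api or slf4j-log4j12.

```shell
#!/bin/bash
# Sketch only: regex-free stand-ins for the [[ ... =~ ... ]] tests in
# bin/flume-ng. Both function names are hypothetical, not part of Flume.

# Print the value of a "java.library.path=..." line; return 1 for other lines.
library_path_value() {
  case "$1" in
    java.library.path=*)
      # Strip the key and the '=' using prefix removal.
      printf '%s\n' "${1#java.library.path=}"
      ;;
    *)
      return 1
      ;;
  esac
}

# Succeed if the classpath entry names an slf4j-api or slf4j-log4j12 jar.
is_slf4j_jar() {
  case "$1" in
    *slf4j-api*.jar*|*slf4j-log4j12*.jar*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the script, a call such as if is_slf4j_jar "$PIECE"; then ... would then take the place of the =~ test.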

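5. After flume-ng starts, the console shows: bin/flume-ng: line 102: [[/usr/hadoop-0.21.0: No such file or directory. The agent still starts, so the message can be ignored. It appears because [[$PIECE, written without a space, is no longer the [[ keyword: bash expands it to the single word [[/usr/hadoop-0.21.0 and tries to run that as a command. The test therefore always fails, and the slf4j jars it was meant to exclude are presumably not filtered out.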

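6. With agent1.sources.r1.type = netcat
Error: Caused by: java.net.BindException: Address already in use
This means the port set in the configuration file is already in use. Disconnecting the SSH session and reconnecting resolved it, presumably because an earlier agent started from that session was still holding the port and was terminated when the session closed.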
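
Rather than reconnecting, it can help to see which process holds the port. The sketch below filters netstat -tlnp style output for one port. The helper name listeners_on_port is hypothetical, and port 44444 in the usage line is only an assumed example, since the port actually configured is not given here.

```shell
#!/bin/bash
# Sketch: print the lines of `netstat -tlnp` style output (read from stdin)
# whose local address ends in the given port. Hypothetical helper.
listeners_on_port() {
  local port="$1"
  # Column 4 is the local address, e.g. 0.0.0.0:44444 or :::44444.
  awk -v port="$port" '$4 ~ (":" port "$") { print }'
}
```

Usage: netstat -tlnp | listeners_on_port 44444. The last column shows PID/program name, so a leftover Flume java process can be stopped before the agent is started again.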

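7. Error:
org.apache.flume.ChannelException: Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight

Fix: set agent1.channels.<channel_name>.keep-alive = 30. keep-alive is the timeout, in seconds, that the channel waits when adding or removing an event before giving up.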
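
As a rough illustration, the setting sits next to the other channel properties. The fragment below assumes a memory channel named c1 (a hypothetical name), and the capacity values are placeholders rather than recommendations. Raising capacity addresses the "buffer size is too tight" half of the message, while keep-alive gives slow sinks more time.

```properties
# Hypothetical channel name c1; the numbers are illustrative only.
agent1.channels = c1
agent1.channels.c1.type = memory
# Maximum number of events held in the channel
agent1.channels.c1.capacity = 10000
# Maximum number of events per put/take transaction
agent1.channels.c1.transactionCapacity = 1000
# Seconds to wait for free space before the put fails
agent1.channels.c1.keep-alive = 30
```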

8. org.apache.flume.channel.file.BadCheckpointException: Configured capacity is 100000 but the checkpoint file capacity is 1000. See FileChannel documentation on how to

Flume has not been running very stably lately. This time Hadoop could not be written to, which caused Flume to report: Configured capacity is 100000000 but the checkpoint file capacity is 1000000. The problem persisted after restarting Flume.
1) The full error is as follows:
22 Jan 2013 11:07:42,568 INFO [pool-7-thread-1] (org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed:209) - Connection to /10.4.203.176:60322 disconnected.
22 Jan 2013 11:07:44,617 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
java.lang.IllegalStateException: Channel closed [channel=file_chan_1]. Due to java.lang.IllegalStateException: Configured capacity is 100000000 but the checkpoint file capacity is 1000000. See FileChannel documentation on how to change a channels capacity.
at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:321)
at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:385)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.IllegalStateException: Configured capacity is 100000000 but the checkpoint file capacity is 1000000. See FileChannel documentation on how to change a channels capacity.
at org.apache.flume.channel.file.EventQueueBackingStoreFile.<init>(EventQueueBackingStoreFile.java:80)
at org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>(EventQueueBackingStoreFileV3.java:42)
at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:67)
at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:36)
at org.apache.flume.channel.file.Log.replay(Log.java:339)
at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:271)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
... 1 more

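2) Cause and fix:
FileChannel uses a fixed-size checkpoint file, so a checkpoint written with the old capacity no longer matches the newly configured capacity. A simple way to change the channel capacity is:
1. Shut down the agent.
2. Delete, or more safely move away, the files in the file channel's checkpoint directory, but leave the data directories alone. By default both live under the current user's home directory, in ~/.flume/file-channel/checkpoint and ~/.flume/file-channel/data (for root, that is /root/.flume/...).
3. Restart the Flume agent. The restart triggers a full replay, so startup can take a while if the channel holds many events. To avoid this, stop the source before shutting the agent down, let the sink drain the file channel completely, wait 1-2 minutes until the data files have been deleted, and only then restart.
The procedure follows the reference quoted below. I did not fully understand the passage, but the fix works.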

The FileChannel actually uses a fixed size checkpoint file -- so it is not possible to set
it to unlimited size (the checkpoint file is mmap-ed to a fixed size buffer). To change the
capacity of the channel, the easiest way off the top of my head is:

Shutdown the agent.
Delete all files in the file channel's checkpoint directory. (not the data directories.
Also you might want to move them out, rather than delete to be safe)
Change your configuration to increase the capacity of the channel.
Restart the agent - this will cause full replay, so the agent might take sometime to start
up if there are a lot of events in the channel (to avoid this - shutdown the source before
shutting the agent down - so the sink can drain out the channel completely, wait for about
1-2 mins after the channel is empty so that the data files get deleted (this happens only
immediately after a checkpoint - you can verify this by making sure each data dir has only
2 files each), since all events have been sent out - so during restart the channel will be
quite empty, with very little to replay).
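
The "move them out rather than delete" advice can be sketched as a small shell helper. This is an illustration under stated assumptions, not an official Flume tool: the function name move_checkpoint_aside is made up, the agent is assumed to be stopped already, and the path in the usage line assumes the default checkpoint location.

```shell
#!/bin/bash
# Sketch: move a file channel checkpoint directory aside before a restart.
# Hypothetical helper; run it only while the agent is stopped.
# The data directories are deliberately left untouched.
move_checkpoint_aside() {
  local checkpoint_dir="$1"
  local backup_dir="${checkpoint_dir}.bak.$(date +%Y%m%d%H%M%S)"
  if [ ! -d "$checkpoint_dir" ]; then
    echo "no checkpoint directory at $checkpoint_dir" >&2
    return 1
  fi
  mv "$checkpoint_dir" "$backup_dir" || return 1
  mkdir -p "$checkpoint_dir"   # leave an empty directory for the agent
  printf '%s\n' "$backup_dir"  # report where the old checkpoint went
}
```

Usage, assuming the default location: move_checkpoint_aside ~/.flume/file-channel/checkpoint. Once the agent has replayed the data files and is running normally, the backup directory can be removed.
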
9. 14/05/20 11:44:27 ERROR source.SpoolDirectorySource: Uncaught exception in Runnable
java.lang.IllegalStateException: Serializer has been closed
at org.apache.flume.serialization.LineDeserializer.ensureOpen(LineDeserializer.java:124)
at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:88)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

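Looking at the files in the spool directory, I found that one file had already been processed (the .COMPLETED suffix had been added) while another file with the same base name was still in the directory. In other words, when both 123.log.COMPLETED and 123.log exist in the directory, the error above is reported. The spooling directory source expects every file placed in the directory to have a name that is never reused.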
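
A quick way to spot such collisions is to look for pending files that already have a .COMPLETED counterpart. The sketch below assumes the .COMPLETED suffix seen above; the helper name find_reused_names is hypothetical.

```shell
#!/bin/bash
# Sketch: list files in a spool directory whose name collides with an
# already processed <name>.COMPLETED file. Hypothetical helper.
find_reused_names() {
  local dir="$1" done_file pending
  for done_file in "$dir"/*.COMPLETED; do
    [ -e "$done_file" ] || continue     # the glob matched nothing
    pending="${done_file%.COMPLETED}"   # 123.log.COMPLETED -> 123.log
    if [ -e "$pending" ]; then
      printf '%s\n' "$pending"
    fi
  done
  return 0
}
```

Any file it prints should be renamed to a unique name (for example by adding a timestamp) before Flume is restarted.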

10. Flume NG error: File should not roll when commit is outstanding
While running Flume NG, the log shows:
java.lang.IllegalStateException: File should not roll when commit is outstanding.
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:204)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

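This happens when the monitored directory (spooldir) contains a zero-length file. Delete it, or rename it with the .COMPLETED suffix, and then restart Flume (sometimes a plain restart is enough). It appears to be a bug: https://issues.apache.org/jira/browse/FLUME-1934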
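
Zero-length files can be located before restarting with a short find invocation. This is a sketch: the helper name find_empty_spool_files is hypothetical, and it skips files already marked .COMPLETED.

```shell
#!/bin/bash
# Sketch: list zero-length files still waiting in a spool directory.
# Hypothetical helper; files already marked .COMPLETED are skipped.
find_empty_spool_files() {
  find "$1" -maxdepth 1 -type f -size 0 ! -name '*.COMPLETED'
}
```

The printed files can then be deleted or renamed by hand, as described above.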

This article is reposted from the 51CTO blog of 于学康. Original link: http://blog.51cto.com/blxueyuan/2067694. To repost it, please contact the original author.
