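On Linux, the NTP (Network Time Protocol) service runs in the background and periodically synchronizes the system time.
Even when NTP (the ntpd daemon) is running, if it cannot reach an external TimeServer, the server's clock gradually falls behind.

When a container is launched on the Hadoop cluster (by YARN, in the log below), the request is checked against an allowed time window. As the clocks on the servers drift further apart over time, jobs start to fail.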


Related error
2016-07-22 17:04:44,437 ERROR [Thread-712]: SessionState (SessionState.java:printError(833)) - Status: Failed
2016-07-22 17:04:44,438 ERROR [Thread-712]: SessionState (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 6, vertexId=vertex_1468916542684_0021_1_00, diagnostics=[Task failed, taskId=task_1468916542684_0021_1_00_000007, diagnostics=[TaskAttempt 0 failed, info=[Container launch failed for container_e54_1468916542684_0021_02_000002 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1469175137466 found 1469175128093
Note: System times on machines may be out of sync. Check system time and time zones.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:168)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:380)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Container launch failed for container_e54_1468916542684_0021_02_000015 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1469175138445 found 1469175128918
Note: System times on machines may be out of sync. Check system time and time zones.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:168)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:380)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 2 failed, info=[Container launch failed for container_e54_1468916542684_0021_02_000017 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1469175139258 found 1469175129970
Note: System times on machines may be out of sync. Check system time and time zones.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:168)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:380)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 3 failed, info=[Container launch failed for container_e54_1468916542684_0021_02_000020 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1469175140315 found 1469175131018
Note: System times on machines may be out of sync. Check system time and time zones.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:168)
at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:380)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1468916542684_0021_1_00 [Map 6] killed/failed due to:null]
2016-07-22 17:04:44,438 ERROR [Thread-712]: SessionState (SessionState.java:printError(833)) - Vertex killed, vertexName=Reducer 3, vertexId=vertex_1468916542684_0021_1_04, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1468916542684_0021_1_04 [Reducer 3] killed/failed due to:null]
2016-07-22 17:04:44,439 ERROR [Thread-712]: SessionState (SessionState.java:printError(833)) - DAG failed due to vertex failure. failedVertices:1 killedVertices:5
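
To see how large the gap is, the two timestamps in each "This token is expired" line can be subtracted. The following is a minimal sketch, assuming the log is available as a text file; the function name `token_expiry_gap_ms` is hypothetical, and reading "found" as the token's expiry timestamp is an interpretation of the message, not something the log states.

```shell
# Read log lines on stdin and, for each "current time is X found Y" message,
# print X - Y in milliseconds (how far the reported current time is past
# the "found" timestamp).
token_expiry_gap_ms() {
  sed -n 's/.*current time is \([0-9][0-9]*\) found \([0-9][0-9]*\).*/\1 \2/p' |
  while read -r current found; do
    echo $(( current - found ))
  done
}

# Usage (the file name is an assumption):
#   token_expiry_gap_ms < hive.log
```

For the first attempt in the log above, 1469175137466 - 1469175128093 gives 9373 ms.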


Check the time on each server, and correct it if it is off.
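
One way to compare the servers is to collect `date +%s` from each of them and look at the spread. This is only a sketch: the host names in the usage comment are placeholders, `clock_spread_s` is a hypothetical helper, and collecting over ssh one host at a time adds its own delay, so small differences are not meaningful.

```shell
# Read "host epoch_seconds" lines on stdin and print the difference
# between the largest and smallest epoch value, in seconds.
clock_spread_s() {
  awk 'NR == 1 { min = $2; max = $2 }
       $2 < min { min = $2 }
       $2 > max { max = $2 }
       END { if (NR > 0) print max - min }'
}

# Usage (hdfs01..hdfs03 are placeholder host names):
#   for h in hdfs01 hdfs02 hdfs03; do
#     echo "$h $(ssh "$h" date +%s)"
#   done | clock_spread_s
```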
 

# Setting the system time manually

> Check the system time
[root@hdfs ~]# date
2016. 07. 25. (월) 21:57:52 KST
 
> Set the hours, minutes, and seconds (24-hour format)
[root@hdfs ~]# date -s 23:43:21
 
> Change the date and the time together
[root@hdfs ~]# date -s '2016-7-26 11:21:21'


If an NTP server is set up inside the network and every server in the HDFS cluster can reach it,
that internal NTP server can be used in place of the TimeServer (time.bora.net) in the settings below.

# Checking NTP information

> Check whether the ntpd service is running
[root@hdfs ~]# service ntpd status
ntpd is stopped
 
> Check the status of the NTP service with the ntpstat command
[root@hdfs ~]# ntpstat
synchronised to NTP server (211.233.40.78) at stratum 3
   time correct to within 163 ms
   polling server every 1024 s
 
> Output when the daemon is not running
[root@www ~]# ntpstat
Unable to talk to NTP daemon. Is it running?
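
The two ntpstat results above can also be told apart in a script by the exit status. The sketch below assumes the exit codes described in the ntpstat man page (0 synchronised, 1 not synchronised, 2 indeterminate, for example when ntpd cannot be contacted); `describe_ntpstat_status` is a hypothetical helper name.

```shell
# Map an ntpstat exit status to a short description.
describe_ntpstat_status() {
  case "$1" in
    0) echo "synchronised" ;;
    1) echo "not synchronised" ;;
    2) echo "indeterminate (is ntpd running?)" ;;
    *) echo "unexpected exit status: $1" ;;
  esac
}

# Usage on a cluster node:
#   ntpstat >/dev/null 2>&1; describe_ntpstat_status $?
```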
 
> Query the current time from the Time Server
[root@hdfs ~]# rdate -p time.bora.net
rdate: [time.bora.net]  Mon Jul 25 21:10:57 2016
 

# Synchronizing the system time

> Sync with the Time Server (one-time)
[root@hdfs ~]# rdate -s time.bora.net
[root@hdfs ~]# date
2016. 07. 25. (월) 21:44:19 KST
 
> Sync with the Time Server periodically (every day at midnight)
[root@hdfs ~]# crontab -e
0 0 * * * rdate -s time.bora.net
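
If the time server differs per environment (for example an internal NTP server instead of time.bora.net), the cron line and the sync command can be parameterized. This is a rough sketch, not a tested production script: `cron_entry` and `sync_once` are hypothetical names, and the syslog tag `timesync` is an arbitrary choice.

```shell
# Print the crontab line for a given time server (default: time.bora.net).
cron_entry() {
  printf '0 0 * * * rdate -s %s\n' "${1:-time.bora.net}"
}

# Run one sync against the given server (needs root); on failure,
# leave a note in syslog so a silent cron failure can be noticed later.
sync_once() {
  server="${1:-time.bora.net}"
  if rdate -s "$server"; then
    return 0
  fi
  logger -t timesync "rdate -s $server failed"
  return 1
}

# Usage:
#   cron_entry ntp.internal.example   # placeholder host name
#   sync_once time.bora.net
```

With a once-a-day schedule, any drift that builds up between runs stays until the next midnight, so a shorter interval may be worth considering if the errors return.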


SeoHW