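I. Background knowledge
Overall architecture
The HDFS cluster will later be deployed alongside the Solr cluster.
HA
See the official documentation: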
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
HDFS Federation
Again, see the official documentation:
http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/Federation.html
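II. Setting up the Federation + HA cluster
Basic runtime setup
Users and groups
- adduser hadoop  # add the hadoop user
- passwd hadoop  # set the hadoop user's password
- groupadd hadoop  # add the hadoop group (skip if adduser already created it)
- usermod -a -G hadoop hadoop  # add the hadoop user to the hadoop group; -a appends instead of replacing its other supplementary groups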
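Configure hosts
If there is a shared DNS server, or the SA has already configured name resolution, skip this step. The example below is something I set up myself earlier and is shown only to illustrate the steps; in the company environment the SA has already taken care of it.
- vi /etc/hosts  # edit the hosts file
- vi /etc/sysconfig/network  # edit the network configuration
- reboot  # reboot
After rebooting and logging back in, the hostname has changed.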
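Configure SSH
- ssh-keygen -t rsa  # generate the SSH key pair; default location: /home/hadoop/.ssh/id_rsa
- cd /home/hadoop/.ssh  # change to the directory that holds the key pair
- ll  # list the public and private key files
- cp id_rsa.pub authorized_keys  # write the public key into authorized_keys (or run: cat id_rsa.pub >> authorized_keys)
- chmod 644 authorized_keys  # set the permissions on authorized_keys so that only the hadoop user can write to it
- Run these steps on every node in the cluster, then merge the contents of all the authorized_keys files into one file and store that file on every node. Once every node knows the public key of every other node, login between nodes no longer prompts for a password.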
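The merge step above can be sketched roughly as follows. This is only an illustration under my own assumptions: the function name merge_keys, the scratch directory, and the fake key strings are made up, and on a real cluster the .pub files would first be collected from each node.

```shell
# merge_keys OUT_FILE PUB_FILE...
# Concatenate the given public key files into OUT_FILE, drop duplicate
# lines, and apply the same 644 permissions used in the steps above.
merge_keys() {
    out=$1
    shift
    cat "$@" | sort -u > "$out"
    chmod 644 "$out"
}

# Demo in a scratch directory with placeholder keys (not real key material).
work=$(mktemp -d)
printf '%s\n' "ssh-rsa AAAA-fake-001 hadoop@node001" > "$work/node001.pub"
printf '%s\n' "ssh-rsa AAAA-fake-002 hadoop@node002" > "$work/node002.pub"
printf '%s\n' "ssh-rsa AAAA-fake-003 hadoop@node003" > "$work/node003.pub"

merge_keys "$work/authorized_keys" \
    "$work/node001.pub" "$work/node002.pub" "$work/node003.pub"
cat "$work/authorized_keys"
```

The merged file would then be copied to /home/hadoop/.ssh/authorized_keys on every node.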
State on each node:
Install the JDK
The servers already had a JDK, but I installed a separate one; the details are omitted.
Install Hadoop
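Convention: directory layout on all nodes
Installation directory:
/home/deploy/hadoop
--| hadoop-2.0.6-solr
--| jdk
Data directory:
/data/hadoop/
--| dfs
----| data     # DataNode data directory
----| journal  # shared edits directory for the NNs
----| name     # NN data directory
--| temp       # Hadoop temporary directory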
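Download Hadoop from the official site; the version used in this environment is 2.0.6.
Get Hadoop
- Transfer it with scp; the details depend on your own environment.
- Configure the Hadoop environment variables
Configure the system environment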
PATH=$PATH:$HOME/bin
export HADOOP_HOME=/home/deploy/hadoop/hadoop-2.0.6-solr
#export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:$HADOOP_HOME/sbin
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export JAVA_HOME=/home/deploy/hadoop/jdk/jdk1.7.0_51
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH
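Putting Hadoop's bin and sbin directories on PATH is optional; append them to PATH if you want them. I still prefer to change to $HADOOP_HOME and run the scripts from there.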
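Before moving on, it can help to confirm that the variables are actually set in the current shell. The helper below is a minimal sketch of my own (check_env is a hypothetical name, not part of Hadoop); it only checks that each named variable is non-empty, not that the path it points to exists.

```shell
# check_env VAR...
# Print "missing: NAME" for every listed variable that is unset or empty.
# Returns 0 only when all of them are set.
check_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (names are trusted input).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Demo in a subshell so the caller's environment is left untouched.
(
    HADOOP_HOME=/home/deploy/hadoop/hadoop-2.0.6-solr
    HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    JAVA_HOME=/home/deploy/hadoop/jdk/jdk1.7.0_51
    check_env HADOOP_HOME HADOOP_CONF_DIR JAVA_HOME && echo "environment looks complete"
)
```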
Hadoop configuration
Change to $HADOOP_HOME/etc/hadoop, which is where Hadoop keeps its configuration files; this differs from version 1.2.1, which used $HADOOP_HOME/conf.
- vi hadoop-env.sh  # Hadoop environment settings; set JAVA_HOME here
# The java implementation to use.
export JAVA_HOME=/home/deploy/hadoop/jdk/jdk1.7.0_51
# The jsvc implementation to use. Jsvc is required to run secure datanodes.
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/deploy/hadoop/hadoop-2.0.6-solr/lib/native
export HADOOP_OPTS="-Djava.library.path=/home/deploy/hadoop/hadoop-2.0.6-solr/lib"
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done
# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""
# Extra Java runtime options. Empty by default.
# $HADOOP_OPTS is kept at the front so the java.library.path setting above is not overwritten.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx128m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"
# On secure datanodes, user to run the datanode as after dropping privileges
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER
The JVM parameters are still at their defaults; no tuning has been done yet.
- vi core-site.xml  # configure core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs:///</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.default.link./user</name>
    <value>hdfs://ns1/user</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/hadoop/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.native.lib</name>
    <value>false</value>
    <description>Should native hadoop libraries, if present, be used.</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>server-zk-001.m6:2184,server-zk-002.m6:2184,server-zk-003.m6:2184</value>
    <description>List of ZooKeeper servers used for HA</description>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>5000</value>
    <description>ZooKeeper session timeout, in milliseconds</description>
  </property>
</configuration>
- vi hdfs-site.xml  # edit hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>server-solr-001.m6.server.com:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>server-solr-001.m6.server.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>server-zk-002.m6.server.com:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>server-zk-002.m6.server.com:50070</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
    <value>qjournal://server-solr-001.m6.server.com:8485;server-solr-002.m6.server.com:8485;server-solr-003.m6.server.com:8485/ns1</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
    <value>qjournal://server-solr-001.m6.server.com:8485;server-solr-002.m6.server.com:8485;server-solr-003.m6.server.com:8485/ns1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/hadoop/dfs/journal</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <!--
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/home/deploy/hadoop/hadoop-2.0.6-solr/etc/hadoop/dfs-hosts-exclude</value>
  </property>
  <property>
    <name>dfs.hosts</name>
    <value>/home/deploy/hadoop/hadoop-2.0.6-solr/etc/hadoop/dfs-hosts</value>
  </property>
  -->
</configuration>
- vi slaves  # edit the slaves file, one host per line
server-solr-001.m6.server.com
server-solr-002.m6.server.com
server-solr-003.m6.server.com
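The shared edits value in hdfs-site.xml is easy to mistype, so here is a small sketch of how the qjournal URI is put together: JournalNode host:port pairs separated by semicolons, followed by the journal ID (ns1 here). The function name build_qjournal_uri is my own; port 8485 is the one used in the configuration above.

```shell
# build_qjournal_uri JOURNAL_ID HOST...
# Print a qjournal:// URI listing every JournalNode host on port 8485.
build_qjournal_uri() {
    journal_id=$1
    shift
    port=8485
    uri=""
    for host in "$@"; do
        # Separate host:port pairs with semicolons.
        if [ -n "$uri" ]; then
            uri="$uri;"
        fi
        uri="$uri$host:$port"
    done
    echo "qjournal://$uri/$journal_id"
}

build_qjournal_uri ns1 \
    server-solr-001.m6.server.com \
    server-solr-002.m6.server.com \
    server-solr-003.m6.server.com
```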
Create the directories referenced in the configuration files
- mkdir -p /data/hadoop/dfs/name
- mkdir -p /data/hadoop/dfs/data
- mkdir -p /data/hadoop/dfs/journal
- mkdir -p /data/hadoop/temp
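Since every node needs the same four directories, the mkdir commands above could be wrapped in a small function. This is just a sketch: create_hadoop_dirs is a name I made up, and the demo runs against a temporary directory rather than /data/hadoop.

```shell
# create_hadoop_dirs ROOT
# Create the name, data, journal and temp directories under ROOT.
create_hadoop_dirs() {
    root=$1
    for sub in dfs/name dfs/data dfs/journal temp; do
        mkdir -p "$root/$sub" || return 1
    done
}

# Demo against a scratch root; on a real node the argument would be /data/hadoop.
demo_root=$(mktemp -d)
create_hadoop_dirs "$demo_root"
find "$demo_root" -type d | sort
```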
Distribute Hadoop to the nodes
Copy the configured hadoop directory to every node; the details are omitted.
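One possible way to do the distribution is to loop over the hosts in the slaves file. The sketch below only prints the scp commands instead of running them, so it is safe to try anywhere; the function name and the demo slaves file are my own, and it assumes the passwordless SSH configured earlier.

```shell
# print_distribution_commands SLAVES_FILE SRC_DIR DEST_PARENT
# Print one "scp -r" command per host listed in SLAVES_FILE (dry run only).
print_distribution_commands() {
    slaves_file=$1
    src=$2
    dest_parent=$3
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r host || [ -n "$host" ]; do
        # Skip blank lines.
        if [ -z "$host" ]; then
            continue
        fi
        echo "scp -r $src $host:$dest_parent/"
    done < "$slaves_file"
}

# Demo with a temporary copy of the slaves file from above.
slaves_demo=$(mktemp)
printf '%s\n' \
    server-solr-001.m6.server.com \
    server-solr-002.m6.server.com \
    server-solr-003.m6.server.com > "$slaves_demo"

print_distribution_commands "$slaves_demo" \
    /home/deploy/hadoop/hadoop-2.0.6-solr /home/deploy/hadoop
```

Once the printed commands look right, they can be run by hand or piped to sh.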
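Start the Hadoop cluster
Start the JournalNodes (the shared data between the NNs)
- Run the following script from HADOOP_HOME; all later scripts are also run from HADOOP_HOME
./sbin/hadoop-daemons.sh start journalnode
All three journal nodes can then be verified: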
Format the NN
- Run the following script
./bin/hdfs namenode -format -clusterId hadoop-solr-cluster
The log output confirms that the format succeeded.
Start
- Start the namenode on nn1
./sbin/hadoop-daemon.sh start namenode
- Bootstrap nn2 as the standby
Run on the nn2 machine:
./bin/hdfs namenode -bootstrapStandby
- Start the namenode on nn2
Run on the nn2 machine:
./sbin/hadoop-daemon.sh start namenode
At this point both nn1 and nn2 are in the standby state.
- Format ZK
./bin/hdfs zkfc -formatZK
The log and the contents of ZooKeeper afterwards show that the ns1 information is now stored in ZooKeeper.
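- Start zkfc on nn1 and nn2
./sbin/hadoop-daemon.sh start zkfc
- Check that nn1 is now active
./bin/hdfs haadmin -getServiceState nn1
Here zkfc uses ZooKeeper to switch automatically between the active and standby NNs. Without zkfc the two have to be switched manually, with a command like the following (-ns takes the nameservice ID, ns1 in this setup):
./bin/hdfs haadmin -ns ns1 -transitionToActive nn1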
- Start the datanodes
Run on the active namenode:
./sbin/hadoop-daemons.sh start datanode
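The order of the startup steps matters, so here is the whole sequence from this section collected into one dry-run listing. It only prints the commands; nothing is executed. The commands are the ones used above, while the "where to run" labels are my own reading of the steps (for example, the notes do not say which node the journalnode and formatZK commands were run from).

```shell
# startup_steps: one "where|command" pair per line, in the order used above.
startup_steps() {
    cat <<'EOF'
any node|./sbin/hadoop-daemons.sh start journalnode
nn1|./bin/hdfs namenode -format -clusterId hadoop-solr-cluster
nn1|./sbin/hadoop-daemon.sh start namenode
nn2|./bin/hdfs namenode -bootstrapStandby
nn2|./sbin/hadoop-daemon.sh start namenode
one NN|./bin/hdfs zkfc -formatZK
nn1 and nn2|./sbin/hadoop-daemon.sh start zkfc
any NN|./bin/hdfs haadmin -getServiceState nn1
active NN|./sbin/hadoop-daemons.sh start datanode
EOF
}

# print_plan: number the steps and show where each command is meant to run.
print_plan() {
    n=0
    startup_steps | while IFS='|' read -r where cmd; do
        n=$((n + 1))
        printf '%d. [%s] %s\n' "$n" "$where" "$cmd"
    done
}

print_plan
```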
Verify the cluster startup
- Check the processes on nodes 001, 002 and 003
- Check via the web UI
With an SSH proxy in place, open http://localhost:50070/dfshealth.jsp in a browser.
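For the SSH proxy, one option that matches the localhost URL above is a local port forward. The sketch below only builds and prints the ssh command. It assumes the proxy is a plain -L tunnel to the NameNode web port; gateway.example.com is a placeholder for whatever jump host you actually use, and tunnel_command is my own helper name.

```shell
# tunnel_command GATEWAY NN_HOST [PORT]
# Print an ssh command that forwards local PORT to NN_HOST:PORT via GATEWAY.
# -N: do not run a remote command; -L: local port forwarding.
tunnel_command() {
    gateway=$1
    nn_host=$2
    port=${3:-50070}
    echo "ssh -N -L $port:$nn_host:$port $gateway"
}

# gateway.example.com is a hypothetical jump host.
tunnel_command hadoop@gateway.example.com server-solr-001.m6.server.com
```

While the printed command is running in another terminal, http://localhost:50070/dfshealth.jsp reaches the web UI of nn1.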