
Hadoop 2.3.0 Detailed Installation Guide

Introduction:
      Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant, is designed to run on low-cost hardware, and provides high-throughput access to application data, which makes it a good fit for applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream (streaming access).

      The two core pieces of the Hadoop framework are HDFS and MapReduce: HDFS provides storage for massive data sets, and MapReduce provides the computation over them.

1. System architecture
Cluster roles:
Hostname   IP address         Role
name01     192.168.52.128     NameNode, ResourceManager (JobTracker)
data01     192.168.52.129     DataNode, NodeManager (TaskTracker)
data02     192.168.52.130     DataNode, NodeManager (TaskTracker)

System environment:
CentOS 6.5 x64, VMware virtual machine
Disk: 30 GB
Memory: 1 GB

Hadoop version: hadoop-2.3.0

2. Environment preparation
2.1 System settings
Disable iptables:
                /sbin/service iptables stop
                /sbin/chkconfig iptables off
Disable SELinux:
                setenforce 0
                sed -i "s@^SELINUX=enforcing@SELINUX=disabled@g" /etc/sysconfig/selinux
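The substitution above can be rehearsed against a scratch copy of the config before touching the real /etc/sysconfig/selinux; a minimal sketch (the temp file stands in for the real config):

```shell
# Scratch copy standing in for /etc/sysconfig/selinux.
tmpcfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmpcfg"

# Same in-place edit as above; -i is required, otherwise sed only
# prints the result and leaves the file unchanged.
sed -i 's@^SELINUX=enforcing@SELINUX=disabled@g' "$tmpcfg"

grep '^SELINUX=' "$tmpcfg"   # SELINUX=disabled
rm -f "$tmpcfg"
```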

Set the node names.
On every node, append the cluster entries to /etc/hosts:
/bin/cat <<EOF >> /etc/hosts
192.168.52.128    name01
192.168.52.129    data01
192.168.52.130    data02
EOF

Then set the hostname on each node (name01 shown here; use data01 / data02 on the other two):
hostname name01
sed -i "s@HOSTNAME=localhost.localdomain@HOSTNAME=name01@g" /etc/sysconfig/network
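Since the same three entries have to land in /etc/hosts on every node, it can help to generate them from a single authoritative list; a sketch (the node list variable and the temp file are illustrative, the temp file standing in for /etc/hosts):

```shell
# One authoritative "ip hostname" list for the whole cluster.
nodes='192.168.52.128 name01
192.168.52.129 data01
192.168.52.130 data02'

hostsfile=$(mktemp)   # stands in for /etc/hosts in this sketch

# Append each pair as a formatted hosts entry.
printf '%s\n' "$nodes" | while read -r ip name; do
    printf '%-18s%s\n' "$ip" "$name" >> "$hostsfile"
done

cat "$hostsfile"
rm -f "$hostsfile"
```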

2.2 Creating the user and directories
Create the account Hadoop will run as:
Log in to every machine as root and create a hadoop user on each:
useradd hadoop   # create the hadoop user
passwd hadoop
#sudo useradd -s /bin/bash -d /home/hadoop -m hadoop -g hadoop -G admin   # alternatively: add a hadoop user in the hadoop group with admin privileges
#su hadoop   # switch to the hadoop user

Create the Hadoop-related directories:
Define the paths for data, and for code and tools:
mkdir -p /home/hadoop/src
mkdir -p /home/hadoop/tools
chown -R hadoop.hadoop /home/hadoop/*

Put the data-node storage under /data/hadoop in the root file system; this is where HDFS data will live, so it needs enough free space:
mkdir -p /data/hadoop/hdfs
mkdir -p /data/hadoop/tmp
mkdir -p /var/logs/hadoop

Make them writable:
chmod -R 777 /data/hadoop
chown -R hadoop.hadoop /data/hadoop/*
chown -R hadoop.hadoop /var/logs/hadoop

Define the Java installation path:
mkdir -p /usr/lib/jvm/
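The whole layout above can be wrapped in one small script, parameterized on a prefix so it can be dry-run in a scratch directory before being executed for real with an empty prefix (the PREFIX variable is an assumption of this sketch, not part of the original guide):

```shell
# PREFIX="" reproduces the real layout; a temp dir lets you rehearse safely.
PREFIX=${PREFIX:-$(mktemp -d)}

for d in \
    "$PREFIX/home/hadoop/src" \
    "$PREFIX/home/hadoop/tools" \
    "$PREFIX/data/hadoop/hdfs" \
    "$PREFIX/data/hadoop/tmp" \
    "$PREFIX/var/logs/hadoop"
do
    mkdir -p "$d"
done

# List what was created.
find "$PREFIX" -type d | sort
```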

2.3 Configuring passwordless SSH login
Reference: http://blog.csdn.net/ab198604/article/details/8250461
SSH generates a public/private key pair (via RSA or DSA) and encrypts data in transit to keep it safe and reliable. The public key may be handed to any node on the network; the private key never leaves the local machine and is what proves its owner's identity. In short, this is asymmetric cryptography, and it is very hard to break. The nodes of a Hadoop cluster constantly access one another, and the node being accessed must verify the node making the request; Hadoop relies on SSH key authentication plus encrypted transport for secure remote login. If every such access required an interactive password, efficiency would drop sharply, which is why we configure passwordless SSH so that nodes can log in to each other directly.
The NameNode must be able to log in to the other nodes without a password. Every node generates its own key pair (id_dsa.pub is the public key, id_dsa the private key), and the public key is then appended into an authorized_keys file. This step is mandatory. The process:

2.3.1 Generate a key pair on each node
# Tips:
(1) The .ssh directory needs mode 755 and authorized_keys needs mode 644;
(2) If the Linux firewall is running, open the ports Hadoop needs, or turn the firewall off;
(3) If a data node cannot connect to the master, it may be because hostnames were used; IP addresses are more reliable.

On the master, name01 (192.168.52.128):
Create the server login key pair for the hadoop account on the NameNode:
mkdir -p /home/hadoop/.ssh
chown hadoop.hadoop -R /home/hadoop/.ssh
chmod 755 /home/hadoop/.ssh 
su - hadoop
cd /home/hadoop/.ssh
ssh-keygen -t dsa -P '' -f id_dsa
[hadoop@name01 .ssh]$ ssh-keygen -t dsa -P '' -f id_dsa
Generating public/private dsa key pair.
open id_dsa failed: Permission denied.
Saving the key failed: id_dsa.
[hadoop@name01 .ssh]$
The command errors out; the fix is setenforce 0:
[root@name01 .ssh]# setenforce 0
su - hadoop 
[hadoop@name01 .ssh]$ ssh-keygen -t dsa -P '' -f id_dsa
Generating public/private dsa key pair.
Your identification has been saved in id_dsa.
Your public key has been saved in id_dsa.pub.
The key fingerprint is:
52:69:9a:ff:07:f4:fc:28:1e:48:18:fe:93:ca:ff:1d hadoop@name01
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
|         .       |
|      . +        |
|     . B  .      |
|      * S. o     |
|       = o. o    |
|        * ..Eo   |
|     . . o.oo..  |
|      o..o+o.    |
+-----------------+
[hadoop@name01 .ssh]$ ll
total 12
-rw-------. 1 hadoop hadoop  668 Aug 20 23:58 id_dsa
-rw-r--r--. 1 hadoop hadoop  603 Aug 20 23:58 id_dsa.pub
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 20 23:48 touch
[hadoop@name01 .ssh]$ 
id_dsa.pub is the public key and id_dsa the private key. Next, append the public key into an authorized_keys file; this step is mandatory:
[hadoop@name01 .ssh]$ cat id_dsa.pub >> authorized_keys
[hadoop@name01 .ssh]$ ll
total 16
-rw-rw-r--. 1 hadoop hadoop  603 Aug 21 00:00 authorized_keys
-rw-------. 1 hadoop hadoop  668 Aug 20 23:58 id_dsa
-rw-r--r--. 1 hadoop hadoop  603 Aug 20 23:58 id_dsa.pub
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 20 23:48 touch
[hadoop@name01 .ssh]$
Repeat the same procedure on the remaining two nodes.

2.3.2 On data01 (192.168.52.129), run:
useradd hadoop   # create the hadoop user
passwd hadoop    # set the hadoop password to hadoop
setenforce 0
su - hadoop 
mkdir -p /home/hadoop/.ssh
cd /home/hadoop/.ssh
ssh-keygen -t dsa -P '' -f id_dsa
cat id_dsa.pub >> authorized_keys

2.3.3 On data02 (192.168.52.130), run:
useradd hadoop   # create the hadoop user
passwd hadoop    # set the hadoop password to hadoop
setenforce 0
su - hadoop 
mkdir -p /home/hadoop/.ssh
cd /home/hadoop/.ssh
ssh-keygen -t dsa -P '' -f id_dsa
cat id_dsa.pub >> authorized_keys

2.3.4 Build one shared authorized_keys for all three nodes
On name01 (192.168.52.128):
su - hadoop
cd /home/hadoop/.ssh
scp hadoop@data01:/home/hadoop/.ssh/id_dsa.pub ./id_dsa.pub.data01
scp hadoop@data02:/home/hadoop/.ssh/id_dsa.pub ./id_dsa.pub.data02
cat id_dsa.pub.data01 >> authorized_keys
cat id_dsa.pub.data02 >> authorized_keys

As shown below:
[hadoop@name01 .ssh]$ scp hadoop@data01:/home/hadoop/.ssh/id_dsa.pub ./id_dsa.pub.data01
The authenticity of host 'data01 (192.168.52.129)' can't be established.
RSA key fingerprint is 5b:22:7b:dc:0c:b8:bf:5c:92:aa:ff:93:3c:59:bd:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'data01,192.168.52.129' (RSA) to the list of known hosts.
hadoop@data01's password: 
Permission denied, please try again.
hadoop@data01's password: 
id_dsa.pub                                                                                                                                                   100%  603     0.6KB/s   00:00    
[hadoop@name01 .ssh]$
[hadoop@name01 .ssh]$ scp hadoop@data02:/home/hadoop/.ssh/id_dsa.pub ./id_dsa.pub.data02
The authenticity of host 'data02 (192.168.52.130)' can't be established.
RSA key fingerprint is 5b:22:7b:dc:0c:b8:bf:5c:92:aa:ff:93:3c:59:bd:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'data02,192.168.52.130' (RSA) to the list of known hosts.
hadoop@data02's password: 
id_dsa.pub                                                                                                                                                   100%  603     0.6KB/s   00:00    
[hadoop@name01 .ssh]$
[hadoop@name01 .ssh]$ cat id_dsa.pub.data01 >> authorized_keys
[hadoop@name01 .ssh]$ cat id_dsa.pub.data02 >> authorized_keys
[hadoop@name01 .ssh]$ cat authorized_keys
ssh-dss AAAAB3NzaC1kc3MAAACBAI2jwEdOWNFFcpys/qB4OercYLY5o5XvBn8a5iy9K/WqYcaz35SimzxQxGtVxWq6AKoKaO0nfjE3m1muTP0grVd5i+HLzysRcpomdFc6z2PXnh4b8pA4QbFyYjxEAp5HszypYChEGGEgpBKoeOei5aA1+ufF1S6b8yEozskITUi7AAAAFQDff2ntRh50nYROstA6eV3IN2ql9QAAAIAtOFp2bEt9QGvWkkeiUV3dcdGd5WWYSHYP0jUULU4Wz0gvCmbpL6uyEDCAiF88GBNKbtDJKE0QN1/U9NtxL3RpO7brTOV7fH0insZ90cnDed6qmZTK4zXITlPACLafPzVB2y/ltH3z0gtctQWTydn0IzppS2U5oe39hWDmBBcYEwAAAIBbV8VEEwx9AUrv8ltbcZ3eUUScFansiNch9QNKZ0LeUEd4pjkvMAbuEAcJpdSqhgLBHsQxpxo3jXpM17vy+AiCds+bINggkvayE6ixRTBvQMcY6j1Bu7tyRmsGlC998HYBXbv/XyC9slCmzbPhvvTk4tAwHvlLkozP3sWt0lDtsw== hadoop@name01
ssh-dss AAAAB3NzaC1kc3MAAACBAJsVCOGZbKkL5gRMapCObhd1ndv1UHUCp3ZC89BGQEHJPKOz8DRM9wQYFLK7pWeCzr4Vt5ne8iNBVJ6LdXFt703b6dYZqp5zpV41R0wdh2wBAhfjO/FI8wUskAGDpnuqer+5XvbDFZgbkVlI/hdrOpKHoekY7hzX2lPO5gFNeU/dAAAAFQDhSINPQqNMjSnyZm5Zrx66+OEgKwAAAIBPQb5qza7EKbGnOL3QuP/ozLX73/7R6kxtrgfskqb8ejegJbeKXs4cZTdlhNfIeBew1wKQaASiklQRqYjYQJV5x5MaPHTvVwoWuSck/3oRdmvKVKBASElhTiiGLQL3Szor+eTbLU76xS+ydILwbeVh/MGyDfXdXRXfRFzSsOvCsAAAAIAeCGgfT8xjAO2M+VIRTbTA51ml1TqLRHjHoBYZmg65oz1/rnYfReeM0OidMcN0yEjUcuc99iBIE5e8DUVWPsqdDdtRAne5oXK2kWVu3PYIjx9l9f9A825gNicC4GAPRg0OOL54vaOgr8LFDJ9smpqK/p7gojCiSyzXltGqfajkpg== hadoop@data01
ssh-dss AAAAB3NzaC1kc3MAAACBAOpxZmz7oWUnhAiis2TiVWrBPaEtMZoxUYf8lmKKxtP+hM/lTDQyIK10bfKekJa52wNCR6q3lVxbFK0xHP04WHeb4Z0WjqNLNiE/U7h0gYCVG2M10sEFycy782jmBDwdc0R8MEy+nLRPmU5oPqcWBARxj0obg01PAj3wkfV+28zDAAAAFQC6a4yeCNX+lzIaGTd0nnxszMHhvwAAAIAevFSuiPi8Axa2ePP+rG/VS8QWcwmGcFZoR+K9TUdFJ4ZnfdKb4lqu78f9n68up2oJtajqXYuAzD08PerjWhPcLJAs/2qdGO1Ipiw/cXN2TyfHrnMcDr3+aEf7cUGHfWhwW4+1JrijHQ4Z9UeHNeEP6nU4I38FmS7gf9/f9MOVVwAAAIBlL1NsDXZUoEUXOws7tpMFfIaL7cXs7p5R+qk0BLdfllwUIwms++rKI9Ymf35l1U000pvaI8pz8s7I8Eo/dcCbWrpIZD1FqBMIqWhdG6sFP1qr9Nn4RZ00DxCz34ft4M8g+0CIn4Bg3pp4ZZES435R40F+jlrsnbLaXI+ixCzpqw== hadoop@data02
[hadoop@name01 .ssh]$

The authorized_keys file now contains three entries: the public keys for accessing name01, data01, and data02. Copy this authorized_keys file into the same directory on data01 and data02:

scp authorized_keys hadoop@data01:/home/hadoop/.ssh/
scp authorized_keys hadoop@data02:/home/hadoop/.ssh/

After that, the hadoop user can SSH between name01, data01, and data02 without a password. Then, on each of name01, data01, and data02, set the permissions as the hadoop user (set the directory and the file separately; a recursive chmod 700 would clobber the 600 on authorized_keys):
su - hadoop
chmod 700 /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys
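The assemble-and-tighten steps (concatenate the three public keys, then restrict the modes sshd insists on) can be rehearsed in a scratch directory with dummy key material; a sketch:

```shell
# Scratch stand-in for /home/hadoop/.ssh; the key strings are dummies.
sshdir=$(mktemp -d)
printf 'ssh-dss AAAA...key1 hadoop@name01\n' > "$sshdir/id_dsa.pub.name01"
printf 'ssh-dss AAAA...key2 hadoop@data01\n' > "$sshdir/id_dsa.pub.data01"
printf 'ssh-dss AAAA...key3 hadoop@data02\n' > "$sshdir/id_dsa.pub.data02"

# One entry per node, exactly as in the guide.
cat "$sshdir"/id_dsa.pub.* > "$sshdir/authorized_keys"

# sshd rejects keys whose files are too permissive:
# 700 on the .ssh directory itself, 600 on authorized_keys.
chmod 700 "$sshdir"
chmod 600 "$sshdir/authorized_keys"

wc -l < "$sshdir/authorized_keys"   # 3
rm -rf "$sshdir"
```

Setting the directory and the file in two separate chmod calls matters: a recursive `chmod 700 -R` on the whole directory would reset authorized_keys to 700 as well.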


Test passwordless SSH. On the first connection you must type yes to accept the host key; after that, ssh logs in directly without a password.
[hadoop@name01 .ssh]$ ssh hadoop@data01
Last login: Thu Aug 21 01:53:24 2014 from name01
[hadoop@data01 ~]$ ssh hadoop@data02
The authenticity of host 'data02 (192.168.52.130)' can't be established.
RSA key fingerprint is 5b:22:7b:dc:0c:b8:bf:5c:92:aa:ff:93:3c:59:bd:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'data02,192.168.52.130' (RSA) to the list of known hosts.
[hadoop@data02 ~]$ ssh hadoop@name01
The authenticity of host 'name01 (::1)' can't be established.
RSA key fingerprint is 5b:22:7b:dc:0c:b8:bf:5c:92:aa:ff:93:3c:59:bd:d3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'name01' (RSA) to the list of known hosts.
Last login: Thu Aug 21 01:56:12 2014 from data01
[hadoop@data02 ~]$ ssh hadoop@name01
Last login: Thu Aug 21 01:56:22 2014 from localhost.localdomain
[hadoop@data02 ~]$ 
Here the problem shows itself: ssh from data01 and data02 to name01 does not really reach name01 (note the '::1' above, i.e. localhost). Where is the issue?


2.3.5 解决ssh name01失败的问题
[hadoop@data01 ~]$ ssh name01
Last login: Thu Aug 21 02:25:28 2014 from localhost.localdomain
[hadoop@data01 ~]$ 
It indeed did not succeed (the login came from localhost.localdomain). Exit and check the /etc/hosts settings:
[hadoop@data01 ~]$ exit
logout
[root@data01 ~]#
[root@data01 ~]# vim /etc/hosts
#127.0.0.1      localhost.localdomain   localhost.localdomain   localhost4      localhost4.localdomain4 localhost       name01
#::1    localhost.localdomain   localhost.localdomain   localhost6      localhost6.localdomain6 localhost       name01
localhost.localdomain=data01
192.168.52.128    name01
192.168.52.129    data01