Tutorial: Building the Lustre Master Branch from Git on RHEL 7.3/CentOS 7.3

⚠️ This tutorial is aimed at developers who want to explore the leading edge of Lustre. If you are evaluating Lustre for production use, you should choose a stable Lustre release instead.

Purpose

This tutorial describes the steps required to build and test a Lustre system (MGS, MDT, MDS, OSS, OST, client) from the master branch on x86_64 RHEL/CentOS 7.3.

Prerequisites

  • A machine with a fresh install of RHEL/CentOS 7.3 x86_64 and a working connection to the Internet.
  • The EPEL repository: this is a convenient source for git.
  • Note: at least 1GB of memory is recommended on the machine used for building.
  • Note: make sure SELinux is disabled (see the quick check below).
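
To see the current state quickly, the stock RHEL utilities below are enough; this is a minimal check (setenforce 0 only switches the running system to Permissive, while permanently disabling SELinux is covered in the "Disabling SELinux" section near the end of this page):

# getenforce
# setenforce 0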

Overview

Pre-built RPMs are available

You no longer need to patch and compile a kernel for the Lustre servers yourself. If you need such a kernel, it is available from Whamcloud, and there is a separate page, [Tutorial: deploying Lustre with pre-built RPMs|how to set up Lustre using these pre-built RPMs], that walks you through using them. This document is for those who want to build a Lustre system from source. Note that if you are not modifying the kernel patches for the servers, you can also use the pre-built Lustre server kernel RPMs and build only the Lustre code itself. Lustre clients do not require a patched kernel at all.

The patches can be found in the Git source repository, and the Lustre source includes a test suite. This document walks through patching the kernel, building Lustre, and running the basic tests on the complete system.

The build process

The process requires setting up an operating system for development; this system includes the Lustre source, the kernel source, and the build tools. Once it is set up, a new kernel can be patched, compiled, run, and tested. For more details on building RHEL RPM-based kernels, refer to the CentOS website and similar resources.

Preparing the machine and installing dependencies

Having newly installed RHEL 7.3 on the machine rhel6-master, log in as the user root.

1. Install the kernel development tools:

# yum -y groupinstall "Development Tools"

安装"开发工具"的问题

If the Development Tools group is unavailable for some reason, you can run the following to install the individual packages that are needed:

# yum -y install automake xmlto asciidoc elfutils-libelf-devel zlib-devel binutils-devel newt-devel python-devel hmaccalc perl-ExtUtils-Embed rpm-build make gcc redhat-rpm-config patchutils git

2. Install additional dependencies:

# yum -y install xmlto asciidoc elfutils-libelf-devel zlib-devel binutils-devel newt-devel python-devel hmaccalc perl-ExtUtils-Embed bison elfutils-devel audit-libs-devel

3. Install EPEL 7:

# rpm -ivh http://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm

4. Install the remaining packages:

# yum -y install pesign numactl-devel pciutils-devel ncurses-devel libselinux-devel

Preparing the Lustre source

1. Create a user named build, with home directory /home/build:

# useradd -m build

2. Switch to the build user and change to build's $HOME directory:

# su build
$ cd $HOME

3. Get the master branch from git:

$ git clone git://git.whamcloud.com/fs/lustre-release.git
$ cd lustre-release

4. Run sh ./autogen.sh.

5. Resolve any outstanding dependencies until autogen.sh completes successfully. A successful run looks like the following (if it fails instead, see the note after this output):

$ sh ./autogen.sh
configure.ac:10: installing 'config/config.guess'
configure.ac:10: installing 'config/config.sub'
configure.ac:12: installing 'config/install-sh'
configure.ac:12: installing 'config/missing'
libcfs/libcfs/autoMakefile.am: installing 'config/depcomp'
$
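
If autogen.sh stops with an error, it normally names the missing tool directly. As a hedged example (the package list is an assumption; install whatever autogen.sh actually reports), missing GNU autotools can usually be resolved as root with:

# yum -y install autoconf automake libtool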

Preparing a patched kernel for Lustre

There are different ways to prepare a patched kernel for Lustre. The easiest is to download the pre-built RPM packages from the releases page; you want the packages whose names begin with "kernel-". After downloading the new kernel packages you can skip the next few steps and go straight to the section Installing the Lustre kernel and rebooting.

If you want more of a challenge, you can also patch the kernel yourself; in that case, follow the steps below.

Preparing the kernel source

In this tutorial the kernel is built with rpmbuild, a tool specific to RPM-based distributions.

1. Get the kernel source. First create the directory structure, then obtain the source from the source RPM. Create a .rpmmacros file so that the kernel source is installed into our user directory:

$ cd $HOME
$ mkdir -p kernel/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
$ cd kernel
$ echo '%_topdir %(echo $HOME)/kernel/rpmbuild' > ~/.rpmmacros

2. Install the kernel source:

$ rpm -ivh http://vault.centos.org/7.3.1611/updates/Source/SPackages/kernel-3.10.0-514.2.2.el7.src.rpm

3. Prepare the source using rpmbuild:

$ cd ~/kernel/rpmbuild
$ rpmbuild -bp --target=`uname -m` ./SPECS/kernel.spec

The end of the output will look like this:

...
+ make ARCH=x86_64 oldnoconfig
scripts/kconfig/conf --olddefconfig Kconfig
#
# configuration written to .config
#
+ echo '# x86_64'
+ cat .config
+ find . '(' -name '*.orig' -o -name '*~' ')' -exec rm -f '{}' ';'
+ find . -name .gitignore -exec rm -f '{}' ';'
+ cd ..
+ exit 0

At this point we have the kernel source, with all of the RHEL/CentOS patches applied, in ~/kernel/rpmbuild/BUILD/kernel-3.10.0-514.2.2.el7/linux-3.10.0-514.2.2.el7.x86_64/.
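
Optionally, confirm that the prepared tree is in place by listing that directory; it should contain an ordinary top-level kernel source tree:

$ ls ~/kernel/rpmbuild/BUILD/kernel-3.10.0-514.2.2.el7/linux-3.10.0-514.2.2.el7.x86_64/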

Patching the kernel source with the Lustre code

1. Gather all the patches from the Lustre tree into a single file:

$ cd ~
$ rm -f ~/lustre-kernel-x86_64-lustre.patch
$ cd ~/lustre-release/lustre/kernel_patches/series
$ for patch in $(<"3.10-rhel7.series"); do \
      patch_file="$HOME/lustre-release/lustre/kernel_patches/patches/${patch}"; \
      cat "${patch_file}" >> "$HOME/lustre-kernel-x86_64-lustre.patch"; \
  done
$

2. Copy the kernel patch into the RPM build tree:

$ cp ~/lustre-kernel-x86_64-lustre.patch ~/kernel/rpmbuild/SOURCES/patch-3.10.0-lustre.patch

3. Edit the kernel spec file ~/kernel/rpmbuild/SPECS/kernel.spec:

Find the line containing "find $RPM_BUILD_ROOT/lib/modules/$KernelVer" and insert the following two lines below it:

cp -a fs/ext3/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext3 
cp -a fs/ext4/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext4

Find the line "# empty final patch to facilitate testing of kernel patches" and insert the following two lines below it:

# adds Lustre patches
Patch99995: patch-%{version}-lustre.patch

Find the line "ApplyOptionalPatch linux-kernel-test.patch" and insert the following two lines below it:

# lustre patch
ApplyOptionalPatch patch-%{version}-lustre.patch

Find the line "%define listnewconfig_fail 1" and change the 1 to 0. Save and close the spec file. (If you would rather script all four edits, see the sketch below.)
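
The following sketch performs the same four spec-file changes with GNU sed. It is an illustration only: each append command adds its lines after every line that matches the pattern, so if your kernel.spec contains a pattern more than once, edit by hand instead, and in any case review the result with diff before building:

$ SPEC=~/kernel/rpmbuild/SPECS/kernel.spec
$ cp "$SPEC" "$SPEC.orig"
$ sed -i '/find \$RPM_BUILD_ROOT\/lib\/modules\/\$KernelVer/a\
cp -a fs/ext3/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext3\
cp -a fs/ext4/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext4' "$SPEC"
$ sed -i '/# empty final patch to facilitate testing of kernel patches/a\
# adds Lustre patches\
Patch99995: patch-%{version}-lustre.patch' "$SPEC"
$ sed -i '/ApplyOptionalPatch linux-kernel-test.patch/a\
# lustre patch\
ApplyOptionalPatch patch-%{version}-lustre.patch' "$SPEC"
$ sed -i 's/%define listnewconfig_fail 1/%define listnewconfig_fail 0/' "$SPEC"
$ diff -u "$SPEC.orig" "$SPEC"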

4. Overwrite the kernel config file with ~/lustre-release/lustre/kernel_patches/kernel_configs/kernel-3.10.0-3.10-rhel7-x86_64.config:

echo '# x86_64' > ~/kernel/rpmbuild/SOURCES/kernel-3.10.0-x86_64.config
cat ~/lustre-release/lustre/kernel_patches/kernel_configs/kernel-3.10.0-3.10-rhel7-x86_64.config >> ~/kernel/rpmbuild/SOURCES/kernel-3.10.0-x86_64.config

Building the new kernel into RPMs

1. Start building the kernel with rpmbuild:

$ cd ~/kernel/rpmbuild
$ buildid="_lustre" # Note: change this to any string that identifies your work
$ rpmbuild -ba --with firmware --target x86_64 --with baseonly \
           --define "buildid ${buildid}" \
           ~/kernel/rpmbuild/SPECS/kernel.spec

2. A successful build ends with output like the following:

...
...
Wrote: /mnt/home/build/kernel/rpmbuild/SRPMS/kernel-3.10.0-514.2.2.el7_lustre.src.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-headers-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-debuginfo-common-x86_64-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/perf-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/perf-debuginfo-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/python-perf-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/python-perf-debuginfo-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-tools-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-tools-libs-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-tools-libs-devel-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-tools-debuginfo-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-devel-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Wrote: /mnt/home/build/kernel/rpmbuild/RPMS/x86_64/kernel-debuginfo-3.10.0-514.2.2.el7_lustre.x86_64.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.F7X9cL
+ umask 022
+ cd /mnt/home//build/kernel/rpmbuild/BUILD
+ cd kernel-3.10.0-514.2.2.el7
+ rm -rf /mnt/home/build/kernel/rpmbuild/BUILDROOT/kernel-3.10.0-514.2.2.el7_lustre.x86_64
+ exit 0

If you receive a request to generate more entropy, you need to trigger some disk or keyboard I/O. In another terminal you can type randomly, or run a command such as the following to generate entropy:

# grep -Ri 'entropy' /usr

At this point you should have new kernel RPMs at ~/kernel/rpmbuild/RPMS/x86_64/kernel-[devel-]3.10.0-514.2.2.el7_lustre.x86_64.rpm.

Installing the Lustre kernel and rebooting

1. As root, install the kernel and kernel-devel packages:

# rpm -ivh $PKG_PATH/kernel-3.10.0-514.2.2.el7_lustre.x86_64.rpm $PKG_PATH/kernel-devel-3.10.0-514.2.2.el7_lustre.x86_64.rpm

Depending on how you obtained the kernel packages, PKG_PATH will be ~build/kernel/rpmbuild/RPMS/x86_64 if you built the packages yourself, or whatever other directory you downloaded the pre-built packages into.
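
For example, if you built the packages yourself in the previous section (adjust the path if your RPMs came from somewhere else), as root:

# PKG_PATH=~build/kernel/rpmbuild/RPMS/x86_64
# ls $PKG_PATH/kernel-*_lustre.x86_64.rpm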

2. Reboot the system.

3. After the reboot, log in and verify the running kernel:

# uname -r
3.10.0-514.2.2.el7_lustre.x86_64

You are now running a Lustre-patched kernel.

Configuring and building Lustre

1. Configure the Lustre source:

$ cd ~/lustre-release/
$ ./configure
...
...
CC:            gcc
LD:            /bin/ld -m elf_x86_64
CPPFLAGS:      -include /mnt/home/build/lustre-release/undef.h -include /mnt/home/build/lustre-release/config.h -I/mnt/home/build/lustre-release/libcfs/include -I/mnt/home/build/lustre-release/lnet/include -I/mnt/home/build/lustre-release/lustre/include
CFLAGS:        -g -O2 -Wall -Werror
EXTRA_KCFLAGS: -include /mnt/home/build/lustre-release/undef.h -include /mnt/home/build/lustre-release/config.h  -g -I/mnt/home/build/lustre-release/libcfs/include -I/mnt/home/build/lustre-release/lnet/include -I/mnt/home/build/lustre-release/lustre/include
 
Type 'make' to build Lustre.

2. Build the RPMs:

$ make rpms
...
...
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/lustre-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/kmod-lustre-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/kmod-lustre-osd-ldiskfs-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/lustre-osd-ldiskfs-mount-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/lustre-tests-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/kmod-lustre-tests-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/lustre-iokit-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Wrote: /tmp/rpmbuild-lustre-build-JZiW94sq/RPMS/x86_64/lustre-debuginfo-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
Executing(%clean): /bin/sh -e /tmp/rpmbuild-lustre-build-JZiW94sq/TMP/rpm-tmp.SxgoFt
+ umask 022
+ cd /tmp/rpmbuild-lustre-build-JZiW94sq/BUILD
+ cd lustre-2.9.51_35_ge240fb5
+ rm -rf /tmp/rpmbuild-lustre-build-JZiW94sq/BUILDROOT/lustre-2.9.51_35_ge240fb5-1.x86_64
+ rm -rf /tmp/rpmbuild-lustre-build-JZiW94sq/TMP/kmp
+ exit 0
Executing(--clean): /bin/sh -e /tmp/rpmbuild-lustre-build-JZiW94sq/TMP/rpm-tmp.vYmwdb
+ umask 022
+ cd /tmp/rpmbuild-lustre-build-JZiW94sq/BUILD
+ rm -rf lustre-2.9.51_35_ge240fb5
+ exit 0

3. You should now have built RPMs with names similar to the following:

$ ls *.rpm
kmod-lustre-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
kmod-lustre-osd-ldiskfs-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
kmod-lustre-tests-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
lustre-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
lustre-2.9.51_35_ge240fb5-1.src.rpm
lustre-debuginfo-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
lustre-iokit-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
lustre-osd-ldiskfs-mount-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm
lustre-tests-2.9.51_35_ge240fb5-1.el7.centos.x86_64.rpm

Installing e2fsprogs

e2fsprogs is required in order to run the test suite.

1. Download the e2fsprogs packages from https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/RPMS/x86_64/ and install e2fsprogs, e2fsprogs-libs, libcom_err, and libss.

2. Or, preferably, install them with yum (a quick verification follows the commands):

# cat <<EOF > /etc/yum.repos.d/e2fsprogs.repo
[e2fsprogs-el7-x86_64]
name=e2fsprogs-el7-x86_64
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
enabled=1
priority=1
EOF
  
# yum update e2fsprogs
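
Either way, you can check afterwards that the expected packages are present and see which versions were pulled in:

# rpm -q e2fsprogs e2fsprogs-libs libcom_err libss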

Installing Lustre

Switch to the root user and change to the ~build/lustre-release/ directory:

# yum localinstall *.x86_64.rpm
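
Optionally, you can check at this point that the freshly built modules load against the patched kernel; a minimal sketch, using the lustre_rmmod helper shipped with Lustre to unload them again so that the test scripts below can manage the modules themselves:

# modprobe -v lustre
# lsmod | grep lustre
# lustre_rmmod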

Disabling SELinux (Lustre servers)

SELinux, which is enabled by default on RHEL/CentOS, prevents the commands that format the various Lustre targets from completing. You must therefore disable it or adjust its settings; the instructions below explain how to disable it.

1. Run getenforce to see whether SELinux is enabled. It should return "Enforcing" or "Disabled".

2. To disable it, edit /etc/selinux/config and change "SELINUX=enforcing" to "SELINUX=disabled".

3. Finally, reboot your system.

# vi /etc/selinux/config
 
----
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
---
# shutdown -r now

Testing

1. Run /usr/lib64/lustre/tests/llmount.sh:

# /usr/lib64/lustre/tests/llmount.sh
Stopping clients: onyx-21vm8.onyx.whamcloud.com /mnt/lustre (opts:)
Stopping clients: onyx-21vm8.onyx.whamcloud.com /mnt/lustre2 (opts:)
Loading modules from /usr/lib64/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=vfstrace rpctrace dlmtrace neterror ha config                   ioctl super lfsck
subsystem_debug=all
gss/krb5 is not supported
Formatting mgs, mds, osts
Format mds1: /tmp/lustre-mdt1
Format ost1: /tmp/lustre-ost1
Format ost2: /tmp/lustre-ost2
Checking servers environments
Checking clients onyx-21vm8.onyx.whamcloud.com environments
Loading modules from /usr/lib64/lustre/tests/..
detected 1 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
debug=vfstrace rpctrace dlmtrace neterror ha config                   ioctl super lfsck
subsystem_debug=all
gss/krb5 is not supported
Setup mgs, mdt, osts
Starting mds1:   -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1
Commit the device label on /tmp/lustre-mdt1
Started lustre-MDT0000
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Commit the device label on /tmp/lustre-ost1
Started lustre-OST0000
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Commit the device label on /tmp/lustre-ost2
Started lustre-OST0001
Starting client: onyx-21vm8.onyx.whamcloud.com:  -o user_xattr,flock onyx-21vm8.onyx.whamcloud.com@tcp:/lustre /mnt/lustre
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID       125368        1736      114272   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID       350360       13492      309396   4% /mnt/lustre[OST:0]
lustre-OST0001_UUID       350360       13492      309396   4% /mnt/lustre[OST:1]
 
filesystem_summary:       700720       26984      618792   4% /mnt/lustre
 
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 7s: wanted 'procname_uid' got 'procname_uid'
disable quota as required

2. You now have a Lustre file system mounted at /mnt/lustre.
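
As an optional sanity check (the paths below are the llmount.sh defaults used above), you can write a small file, inspect its layout, and then tear the test file system down again with the companion cleanup script:

# lfs df -h /mnt/lustre
# dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=10
# lfs getstripe /mnt/lustre/testfile
# /usr/lib64/lustre/tests/llmountcleanup.sh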