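I have been running Nginx + PHP FastCGI as a web server for several months, and the recurring annoyance is Nginx returning 502 Bad Gateway. I tried many of the suggested fixes, but the 502 errors kept coming back. Today I was about to reinstall PHP FastCGI when I looked in the PHP install directory and found that the PHP log file had grown to tens of megabytes. Opening it, I found it consisted almost entirely of messages like the following: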

Jan 11 08:54:01.164292 [NOTICE] fpm_children_make(), line 352: child 10088 (pool default) started
Jan 11 08:54:01.164325 [WARNING] fpm_children_bury(), line 215: child 7985 (pool default) exited on signal 15 SIGTERM after 63.778601 seconds from start
Jan 11 08:54:01.165485 [NOTICE] fpm_children_make(), line 352: child 10089 (pool default) started
Jan 11 08:54:01.165514 [WARNING] fpm_children_bury(), line 215: child 7999 (pool default) exited on signal 15 SIGTERM after 60.297326 seconds from start
Jan 11 08:54:01.166696 [NOTICE] fpm_children_make(), line 352: child 10090 (pool default) started
Jan 11 08:54:01.166727 [WARNING] fpm_children_bury(), line 215: child 8000 (pool default) exited on signal 15 SIGTERM after 60.296946 seconds from start
Jan 11 08:54:01.167855 [NOTICE] fpm_children_make(), line 352: child 10091 (pool default) started
Jan 12 04:00:50.443884 [NOTICE] fpm_children_make(), line 352: child 10127 (pool default) started
Jan 12 04:00:50.443917 [NOTICE] fpm_event_loop(), line 107: libevent: entering main loop
Jan 12 12:05:08.425141 [WARNING] fpm_request_check_timed_out(), line 158: child 10120, script '/home/htdocs/www/index.php' (pool default) execution timed out (30.051306 sec), terminating
Jan 12 12:05:08.929741 [NOTICE] fpm_got_signal(), line 48: received SIGCHLD
Jan 12 12:05:09.137341 [WARNING] fpm_children_bury(), line 215: child 10120 (pool default) exited on signal 15 SIGTERM after 29058.697774 seconds from start
Jan 13 01:16:43.058020 [NOTICE] fpm_pctl_exit(), line 81: exiting, bye-bye!
Jan 13 01:16:46.236418 [NOTICE] fpm_unix_init_main(), line 284: getrlimit(nofile): max:52000, cur:52000
Jan 13 01:16:46.236655 [NOTICE] fpm_event_init_main(), line 88: libevent: using epoll
Jan 13 01:16:46.610883 [NOTICE] fpm_init(), line 52: fpm is running, pid 14095
Jan 13 01:16:46.612247 [NOTICE] fpm_children_make(), line 352: child 14103 (pool default) started
.........................
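To see which kind of message dominates a log this large, a rough tally can help. The following is only a sketch: the function name summarize_fpm_log is hypothetical, the log path is whatever your php-fpm installation uses, and the search patterns are copied from the excerpt above.

```shell
# Tally the php-fpm log messages seen in the excerpt above.
# Usage: summarize_fpm_log /path/to/php-fpm.log   (the path is an assumption)
summarize_fpm_log() {
    local log_file="$1"
    local timed_out sigterm started
    # grep -c prints the number of matching lines; "|| true" hides its
    # non-zero exit status when there are no matches.
    timed_out=$(grep -c 'execution timed out' "$log_file" || true)
    sigterm=$(grep -c 'exited on signal 15 SIGTERM' "$log_file" || true)
    started=$(grep -c 'fpm_children_make().*started' "$log_file" || true)
    echo "timed_out=$timed_out sigterm=$sigterm started=$started"
}
```

A high timed_out count relative to sigterm would suggest slow scripts rather than exhausted file handles, so it is worth checking before changing any limits.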

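From what I found online, the consensus is that this error is caused by the limit on how many file handles the PHP processes can open. The fix is as follows: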

1. Raise the server's open file handle limit

# vi /etc/security/limits.conf and add:
* soft nofile 51200
* hard nofile 51200

2. Raise the open-file limit for the nginx worker processes

nginx.conf : worker_rlimit_nofile 51200;

3. Edit php-fpm.conf; two settings need to change.

Run ulimit -n to see the current open-file limit, and make sure the rlimit_files option in php-fpm.conf matches that value.

<value name="max_requests">10240</value>

<value name="rlimit_files">51200</value>
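The comparison between ulimit -n and rlimit_files can be scripted. This is a minimal sketch that assumes the XML-style php-fpm.conf shown above; the function names get_rlimit_files and check_rlimit_files are hypothetical, and the config path depends on your installation.

```shell
# Extract the rlimit_files value from an XML-style php-fpm.conf.
get_rlimit_files() {
    sed -n 's|.*<value name="rlimit_files">\([0-9][0-9]*\)</value>.*|\1|p' "$1" | head -n 1
}

# Compare it with the soft open-file limit of the current shell.
check_rlimit_files() {
    local configured current
    configured=$(get_rlimit_files "$1")
    current=$(ulimit -S -n)
    if [ "$configured" = "$current" ]; then
        echo "match: $configured"
    else
        echo "mismatch: php-fpm.conf=${configured:-unset} ulimit=$current"
    fi
}
```

Run it from the same account and environment that starts php-fpm, for example check_rlimit_files /path/to/php-fpm.conf, because ulimit reports the limit of the shell it runs in.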

4. # vi /etc/sysctl.conf

Append at the bottom:

fs.file-max=51200

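Notes:

The relationship and difference between file-max and ulimit

1. What file-max means

man proc gives the description of file-max:

fs.file-max: this file specifies the maximum number of file handles that can be allocated. fs.file-max is sized as 512 times the number of processes (for example, 65536 for 128 processes);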

/proc/sys/fs/file-max

              This  file defines a system-wide limit on the number of open files for all processes.  (See
              also setrlimit(2),  which  can  be  used  by  a  process  to  set  the  per-process  limit,
              RLIMIT_NOFILE,  on  the  number  of  files it may open.)  If you get lots of error messages
              about running out of file handles, try increasing this value:

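In other words, file-max sets the total number of files that all processes on the system can have open. A program can additionally call setrlimit to set its own per-process limit. If you see many errors about running out of file handles, this is the value to increase.

That is, this parameter is system-wide.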

echo 6553560 > /proc/sys/fs/file-max

Or edit /etc/sysctl.conf and add

fs.file-max = 6553560 

This takes effect after a reboot; running sysctl -p applies it without one.
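Before raising file-max it helps to know how close the system actually is to the limit. On Linux, /proc/sys/fs/file-nr holds three numbers: allocated handles, allocated but unused handles, and the maximum. The sketch below turns them into a percentage; the function name file_handle_usage is hypothetical.

```shell
# Given the three fields of /proc/sys/fs/file-nr, print how much of
# file-max is in use (integer percentage, rounded down).
file_handle_usage() {
    local allocated="$1" unused="$2" max="$3"
    local in_use=$((allocated - unused))
    echo "in_use=$in_use max=$max percent=$((in_use * 100 / max))"
}

# On a Linux host (left commented out so the function can be sourced anywhere):
# file_handle_usage $(cat /proc/sys/fs/file-nr)
```

If the percentage stays low while the 502 errors continue, the system-wide limit is probably not the bottleneck.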

2. What ulimit means

Provides control over the resources available to the shell and to processes started by it, on systems that allow such control.

That is, it sets resource limits for the current shell and for the processes it starts.
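The inheritance can be seen directly. The following sketch lowers the soft limit inside a subshell and asks a child process what it sees; the value 256 is arbitrary, and it assumes the hard limit of the current shell is at least 256.

```shell
# The command substitution runs in a subshell, so the lowered limit does
# not leak into the calling shell. The child sh inherits the new limit.
child_limit=$(ulimit -S -n 256; sh -c 'ulimit -n')
echo "child sees: $child_limit"
echo "calling shell still has: $(ulimit -S -n)"
```

This is also why a daemon such as php-fpm keeps its old limit until it is restarted from an environment where the new limit is in effect.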

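Clearly, a server needs both file-max and ulimit set, otherwise it can run out of file descriptors. To make the settings survive a reboot, I strongly recommend the following configuration so that the file-max and ulimit values are always correct:

1. Edit /etc/sysctl.conf and add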

fs.file-max = 6553560

2. The system default ulimit for open files is 1024. Edit /etc/security/limits.conf and add the following to make the change permanent

* soft nofile 65535 
* hard nofile 65535

After making these changes, reboot for them to take effect.
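After the reboot it is worth confirming that the running processes really received the new limit. On Linux, /proc/PID/limits shows the limits of a live process; the sketch below reads the soft "Max open files" value. The function name proc_nofile is hypothetical, and the process name passed to pgrep in the usage line is an assumption that depends on how PHP FastCGI was installed.

```shell
# Print the soft "Max open files" limit of a running process (Linux only).
# In /proc/PID/limits the line looks like:
#   Max open files            1024                 4096                 files
proc_nofile() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Example (commented out; the process name is an assumption):
# proc_nofile "$(pgrep -o php-fpm)"
```

Calling proc_nofile $$ reports the limit of the current shell and should agree with ulimit -S -n.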