Last month, an engineer at one of our customers mistakenly changed the file permissions across an entire RAC database installation, including the GI and DB homes.
As a result, the clusterware could not be started again, and I was called in to fix the issue.
On the Oracle support website, the following two notes cover fixing GI file permission issues:
How to check and fix file permissions on Grid Infrastructure environment (Doc ID 1931142.1)
File Permission Is Set Incorrectly After Executing "rootcrs.sh -init" To Restore Grid Infrastructure Home File Permissions. (Doc ID 2346618.1)
The first applies to Oracle 12.1 and 12.2; the second applies only to 12.1.
Doc ID 1931142.1 describes three methods, and I ran all of them to make sure the file permissions were fixed thoroughly.
#run as root (GRID_HOME is just for convenience):
export GRID_HOME=/home/app/12.2.0.1/grid
export PATH=$GRID_HOME/perl/bin:$PATH:$GRID_HOME/OPatch
$GRID_HOME/bin/crsctl stop crs
$GRID_HOME/crs/install/rootcrs.sh -unlock
$GRID_HOME/crs/install/rootcrs.sh -lock
#$GRID_HOME/crs/install/rootcrs.sh -patch  --> run this on 12.1 only; 12.2 does not support this option
#The two options above (-unlock and -lock) are mostly used for RAC database patching; the effect is similar to:
$GRID_HOME/crs/install/rootcrs.sh -init
#For single-instance GI (Oracle Restart):
$GRID_HOME/crs/install/roothas.sh -init
#For 11.2, replace rootcrs.sh/roothas.sh with rootcrs.pl/roothas.pl.
#The system's default Perl may not be compatible with the Oracle Perl scripts, so the recommended way to run them is:
$GRID_HOME/perl/bin/perl rootcrs.pl
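The version-dependent choices above can be summarized in a small helper. This is only a sketch based on the notes in this post: the function name `root_script_cmd` and its arguments are hypothetical, and it merely prints the command line instead of running anything.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the root script invocation for a
# given GI version and install type, following the notes in this post.
# usage: root_script_cmd <11.2|12.1|12.2> <crs|has> <option>
root_script_cmd() {
    local version=$1 type=$2 option=$3
    # Kept literal on purpose so the printed command can be reviewed first
    local dir='$GRID_HOME/crs/install'

    # Per the note above, 12.2 does not support the -patch option
    if [ "$version" = "12.2" ] && [ "$option" = "-patch" ]; then
        echo "-patch is not supported on 12.2" >&2
        return 1
    fi

    case "$version" in
        11.2)
            # 11.2 uses the .pl scripts, run with the Perl shipped in the GI home
            printf '%s\n' "\$GRID_HOME/perl/bin/perl $dir/root${type}.pl $option"
            ;;
        12.1|12.2)
            printf '%s\n' "$dir/root${type}.sh $option"
            ;;
        *)
            echo "unsupported version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, `root_script_cmd 12.2 crs -init` prints the rootcrs.sh command for a cluster, while `root_script_cmd 11.2 has -init` prints the roothas.pl form for Oracle Restart.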
Oracle 12.1 has a known bug, so the following steps may need to be run separately:
Add execute permission to the following files under the GI home:
# chmod +x $GRID_HOME/bin/crs*
# chmod +x $GRID_HOME/crs/install/rootcrs.sh
Then run rootcrs.sh:
# cd $GRID_HOME/crs/install
# ./rootcrs.sh -patch
After these fixes the clusterware could be started, but the cluvfy command still reported many errors:
#run as grid user:
cluvfy comp software -n $(hostname) -verbose
Next, I fixed the remaining issues using the reference permission files kept under $GRID_HOME/crs/utl/$(hostname):
cat crsconfig_dirs|grep -E '(^all|^unix)'|grep -v "$GRID_HOME/racg/usrco"|while read unused fname owner group permission; do
chown $owner:$group $fname || echo failed on $fname
chmod $permission $fname || echo failed on $fname
done
chmod 755 $GRID_HOME/racg/usrco
cat crsconfig_fileperms|grep -E '(^all|^unix)'|while read unused fname owner group permission; do
chown $owner:$group $fname || echo failed on $fname
chmod $permission $fname || echo failed on $fname
done
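Because these loops change ownership and modes as root, it can help to preview the commands first. The sketch below is one way to do that and is not part of the Oracle notes: the function name `preview_perm_fixes` is made up for illustration, and it assumes the same whitespace-separated layout the loops above rely on (platform tag, path, owner, group, mode).

```shell
#!/bin/bash
# Hypothetical dry run: print the chown/chmod commands that the loops
# above would execute, without changing anything on disk.
# Assumes each line is: <platform> <path> <owner> <group> <mode>
preview_perm_fixes() {
    local permfile=$1
    grep -E '^(all|unix)' "$permfile" |
    while read -r unused fname owner group permission; do
        # Skip incomplete lines instead of printing a broken command
        [ -n "$permission" ] || continue
        printf 'chown %s:%s %s\n' "$owner" "$group" "$fname"
        printf 'chmod %s %s\n' "$permission" "$fname"
    done
}
```

Run from $GRID_HOME/crs/utl/$(hostname), something like `preview_perm_fixes crsconfig_fileperms | less` lets you review the list before running the real loops.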
I then verified the permissions again but still got some errors, so I fixed the remaining files based on the cluvfy output:
#run as grid user
cluvfy comp software -n $(hostname) -verbose |grep 'PRVG-2033.*did not match the expected'|awk -F\" '{print $2" "$6}' > /tmp/grid.perm
#run as root user:
cat /tmp/grid.perm|while read fname permission; do
chmod $permission $fname || echo failed on $fname
done
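Since /tmp/grid.perm is produced by scraping text and then consumed by root, a quick sanity check before the chmod loop is cheap insurance. The sketch below is a suggestion rather than part of the original procedure: `check_perm_list` is a hypothetical name, and it accepts only lines made of an absolute path followed by a 3- or 4-digit octal mode.

```shell
#!/bin/bash
# Hypothetical sanity check for a "path mode" list such as /tmp/grid.perm.
# Returns 0 if every line looks safe to feed to chmod, 1 otherwise.
check_perm_list() {
    local listfile=$1 problems=0 fname permission
    while read -r fname permission; do
        # Paths must be absolute
        case "$fname" in
            /*) ;;
            *) echo "suspicious path: $fname" >&2; problems=1 ;;
        esac
        # Modes must be plain octal; this also catches paths containing spaces,
        # because the leftover words end up in $permission
        if ! printf '%s\n' "$permission" | grep -Eq '^[0-7]{3,4}$'; then
            echo "suspicious mode for $fname: $permission" >&2
            problems=1
        fi
    done < "$listfile"
    return "$problems"
}
```

A possible use is `check_perm_list /tmp/grid.perm && echo "list looks sane"` right before switching to root.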
This time the cluvfy command ran successfully.
I also double-checked one critical file:
ls -l $GRID_HOME/bin/oracle
-rwsr-s--x 1 grid oinstall 373913824 Dec 24 2019 /home/app/12.2.0.1/grid/bin/oracle
#If the result is different, then correct it using root user:
chown grid:oinstall $GRID_HOME/bin/oracle
chmod 6751 $GRID_HOME/bin/oracle
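The same check can be scripted so it is easy to repeat on every node. This is a minimal sketch, assuming GNU stat (the same tool used later in this post); the function name `check_owner_mode` is hypothetical.

```shell
#!/bin/bash
# Hypothetical check: compare a file's owner, group and octal mode
# against an expected "owner:group mode" string.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
check_owner_mode() {
    local file=$1 expected=$2 actual
    actual=$(stat -c '%U:%G %a' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file ($actual)"
    else
        echo "MISMATCH: $file is '$actual', expected '$expected'"
        return 1
    fi
}
```

For the GI binary above, the call would be `check_owner_mode $GRID_HOME/bin/oracle 'grid:oinstall 6751'`, since -rwsr-s--x corresponds to octal 6751.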
Next I turned to the Oracle database home, where I could not find a similar way to fix the file permissions directly.
Oracle provides a Perl script that captures the permissions of a healthy Oracle home and applies them to the target directory:
Script to capture and restore file permission in a directory (for eg. ORACLE_HOME) (Doc ID 1515018.1)
The following article is also a useful reference:
Recovery method after the permissions and owners of the Oracle 11gR2 GI and DB installation directories have been changed (original title: Oracle 11gR2 GI和DB安装目录权限属主被修改后的恢复方法)
However, the Perl script did not work in the customer environment, so I accomplished the same thing with the commands below:
#On a healthy database server (same version, intact permissions), run as root:
cd $ORACLE_HOME
find . -type d -exec stat -c "%n %U %G %a" {} \; > /tmp/orahome.dir
find . -type f -exec stat -c "%n %U %G %a" {} \; > /tmp/orahome.file
#On the target database, run as root:
cd $ORACLE_HOME
cat /tmp/orahome.dir|while read fname owner group permission; do
[[ -d $fname ]] && { chown $owner:$group $fname || echo failed on $fname; }
[[ -d $fname ]] && { chmod $permission $fname || echo failed on $fname; }
done
cat /tmp/orahome.file|while read fname owner group permission; do
[[ -f $fname ]] && { chown $owner:$group $fname || echo failed on $fname; }
[[ -f $fname ]] && { chmod $permission $fname || echo failed on $fname; }
done
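Before applying the captured lists, it can be useful to see which entries actually differ on the target. The following is a rough sketch and not something from the Oracle note: `report_perm_diffs` is a hypothetical helper that reads a list in the same "name owner group mode" layout and prints only the paths whose current state differs. Like the loops above, it assumes GNU stat and paths without whitespace.

```shell
#!/bin/bash
# Hypothetical read-only report: list entries whose current owner, group
# or mode differs from the captured reference list.
# Run it from the same directory the list was captured in (e.g. $ORACLE_HOME).
report_perm_diffs() {
    local listfile=$1 fname owner group permission current
    while read -r fname owner group permission; do
        # Ignore entries that do not exist on this host
        [ -e "$fname" ] || continue
        current=$(stat -c '%U %G %a' "$fname")
        if [ "$current" != "$owner $group $permission" ]; then
            echo "$fname: have '$current', want '$owner $group $permission'"
        fi
    done < "$listfile"
}
```

On the target server, `cd $ORACLE_HOME && report_perm_diffs /tmp/orahome.file | wc -l` gives a rough idea of how much the restore loops will change, and running it again afterwards should report nothing.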
I also double-checked the following critical file:
ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle asmadmin 409357968 May 27 07:18 /home/app/oracle/product/12.2.0.1/db1/bin/oracle
#If it is different, then correct it using root user
chown oracle:asmadmin $ORACLE_HOME/bin/oracle
chmod 6751 $ORACLE_HOME/bin/oracle
After that, the whole cluster worked well.