Troubleshooting

I/OTest script to check if the disk I/O is causing slow performance

Slow VM Performacne, use IOTest to see if the disk IO is the culprit

This script will test the disk IO by copying 500Mb of data using the same block size as eDir uses and with the same api eDir uses “fdatasync”.
This writes 500 Mb of data each iteration to the iotest.log in the dib directory, usually the /var/opt/novell/eDirectory/data/dib/
It will overwrite the previous data in the iotest.log each time it runs.  Anything under 100 MB/s is a concern and will cause slowness for eDirectory and possible memory build up.  IO causes a bottleneck for events to be written to disk.  A build up of memory by ndsd can cause a ndsd to take all available memory (both virtual and resident) causing ndsd to core.

If slow IO writes are seen with the iotest script begin the process of adding hard drives and reducing the… Continue reading

SCA Appliance

Ever wonder what happens when you run a supportconfig -ur SR#?  The support config gets uploaded and analyzed by a Support Config Analysis server that runs potentially over 900 support patterns to analyze the support configs contents.  The report is then posted to the SR listing critical issues than when fixed have been found to fix roughly 50% of the issues an SR was created for.

The Support Config Analysis server is available for download as an appliance than can be ran on premises.  The appliance stores analysis results in a MariaDB database and uses PHP to read the database and generate the report.  It has a FTP server allowing for support configs to be uploaded, archived, processed, and analyzed.  With this it is possible to modify the supportconfig script to gather more information for other applications running on the server and… Continue reading

DNS CASA Repair Script

A common reason for  Novell DNS to fail to update records or even fail to load is do to CASA credentials for the DNS proxy user.

When troubleshooting Novell DNS issues start with the /var/opt/novell/log/named/named.run log.
If novell-named fails to start or update records and CASA Error has occured, error:No credential is retrived from CASA is seen in the log, it is almost a guarantee the reason is the dns-ldap key is missing, the password is incorrect for the proxy user, or the user name is incorrect.
Below is a sample of a named.run log demonstrating what is seen when CASA credentials in invalid or missing.
Look for the starting of named and the CASA Error

19-Nov-2013 15:30:13.489 general: main: notice: starting BIND 9.3.2 -u named
19-Nov-2013 15:30:13.490 general: server: info: found 4 CPUs, using 4 worker threads
19-Nov-2013 15:30:13.500 general: dns/message: error: Credential Not found
19-Nov-2013… Continue reading

New DSfW Monitor Script

I previously created two scripts, dsfw_processcheck.sh and dsfw_portchk.sh, one to monitor pids and one to monitor ports.  With the two script they are helpful to ensure the DSfW services are up.  A new script combines the two and adds additional options.  The script not only checks for pids and ports, but it can be used to create a cron job to run the script every 10 minutes by adding the “add” switch.  To remove the cron job use the “rm” switch.

If a DSfW server running DNS (or not) has a DSfW specific process stop or crash a quick stop gap measure is to monitor the DSfW processes and restart them if one or more of the DSfW processes stop.

If the DSfW server is an Additional Domain Controller (ADC) DNS might not be configured on the server.  If DNS is not running on the… Continue reading

NDSD Health Check Script

I’ve received a great deal of feed back on the DSfW Health Check Script and applied some changes. One of the suggestions was to do only a ndsd (eDirectory) script. The DSfW Health Check Script works for both DSfW and eDirectory servers, but if all you want to do is check eDirectory health on a DSfW server or want a script only for ndsd that is smaller and simple this is an option.

I am always looking for suggestions. I’ve created a video for the ndsd_heaclthchk script. Watch to to learn about configuring it for your specific needs.

For for NDSD Health Check in the download section.

The configuration options are as follows

# Set emailsetting to 1 to send e-mail log when finished. Set to 0 or remove the 1 to disable
emailsetting=0

# Set emailonerror to 1 to send e-mail log if an error is returned. Set to… Continue reading

Latest DSfW Health Check Script

I’ve received a great deal of feed back on the DSfW Health Check Script and applied some changes.
I am always looking for suggestions. I’ve created an updated video with the latest script. Watch to to learn about configuring it for your specific needs.

 

OES 11 SP1 eDirectory Install

Looking to install  eDirectory on OES 11 SP1?  Here is a video going through the install and giving some tips on doing a successful install.

 

Script to modify grouptype – modify_grouptype.sh

I did a post in January on grouptype and their impact on DSfW performance in TID 7011498 DSfW Slow Performance/Group Types.  The post shows how to find grouptype settings, goes over a couple of TIDs discussing grouptypes and a video giving even more info on the subject.

I have since written a script that will easily search for and modify grouptypes.  Here is how the script works.  Executing the script will display the following menu:

This script can report grouptype information and change grouptypes

To run the script enter 1, 2, or 3

1. Display groups with grouptypes of Universal
2. Create a ldif to change grouptype to Domain Local
3. Create and apply ldif to change grouptype to Domain Local

Options                           … Continue reading

Troubleshooting High Utilization – High Utilization Gstack tool

Some times ndsd or another process can cause a server to go into high utilization or to become unresponsive.  A great TID to follow for OES servers is TID 7007332 – Troubleshooting ndsd becoming unresponsive on OES Linux.  A TID specific for DSfW servers to start with is TID 7010462- Troubleshooting slow logins and unresponsive DSfW server.

When trouble shooting a process stuck in high utilization or causing a server to slow down or become unresponsive looking at a top output for a daemon like ndsd with individual threads shown and a correlating gstack can show us which thread is in high utilization and what that thread is doing.  In most cases it is best to take a number of gstacks every 10 seconds to 60 seconds depending on the situation.  We can see not only what that thread is doing but if the… Continue reading

DSfW and eDirectory Health Check

It is a good idea to periodically check the health of DSfW and eDirectory servers.

This video concentrates on a script I wrote that can be ran on both eDirectory and DSfW servers.

The script demonstrated in this video is called dsfw_edir_healthchk.sh.  To get the latest version of the script click on the DSfW Health Check link in the download section on DSfWDude.com.

A great TID to start off with for a eDirectory health check is TID 3564075.
On a DSfW server start off with an eDirectory health check as well as TID 7001884 which has DSfW specific commands to check the health and overall operation of a DSfW server.

The script does most of the suggestions in both TIDs mentioned above plus a few more checks.

For eDirectory there are 8 checks the script does and… Continue reading

Install error: ndsconfig error 74

Installs can be tricky especially when installing into an existing tree that has been around since NetWare 4.11, has multiple partitions, several locations, and dozens of servers.  If the tree is not healthy the install of DSfW has a greater chance of failure.  If communication with all servers is good, the tree is healthy, and the Preparing for Domain Services for Windows Install TID is followed then usually the install goes through with out any issues.

If there is a failure a common error is ndsconfig error 74.   This video goes over the error.  The troubleshooting of this error can be applied to a similar error “ndsconfig error 80”.

Troubleshooting DSfW Slow Performance/Duplicate Workstation Names

Slow logins and poor performance of a DSfW Domain Controller is often due to too many failed authentications to the domain. This video goes over the specific issue of multiple workstations with the same name increasingly queuing up logins until the server comes to a halt. The video covers TIDs 7010462 and TID 7006851.

This video concentrates on TID 7006851.
The command to display and sort Decrypt integrity check failed errors is:
grep -A1 -i ‘Decrypt integrity check failed’ /var/opt/novell/xad/log/kdc.log |grep -v ‘Decrypt integrity check failed’ |awk -F ‘)’ ‘{print $3}’ |grep -v ‘^$’ |awk -F ‘for’ ‘{print $1}’ |sort -n | uniq -c | sort -n

DSfW Install Error: No Such Partition

An error I have seen during installs is No Such Partition. The majority of the time it is easly solved by adding a replica of the name mapped partition to the DSfW server. This video will go through troubleshooting steps for this error.


Either run them on the eDirectory server specidifed for the install or use the -h [ip] if running on the DSfW server.
ndsstat -r
ndstat -p .o=novell.t=tree. -h 192.168.0.51 -n

Updated dsfw_processchk script 2.1.5

I updated the dsfw_processchk script to not only check all essential DSfW processes, but to handle multiple pids for the xadsd process.  The script is great to use if you are worried that a DSfW process will stop and you don’t want to receive several phone calls alerting you to the problem or the DSfW server has been unstable you you need to time track down the invalid requests hitting the DSfW server.

The script will report which processes are running or have stopped. It works by validating that a PID exists for each process. If a process is not running the script has the option to restart the services, send an e-mail that a process has stopped, and update the syslog.

Key configuration

# Set RESTART_DSFW to 1 to reload DSfW services if one or service is not running,
# Set RESTART_DSFW to 0 to leave the services… Continue reading

Script to check if ports are listening

If you are concerned about a DSfW service going down and or the port is not accessible, this script will help keep the services up or notify you of a service going down.  The script will check if each DSfW service is listening, then telnet to each port.  If it can not telnet, the script will log which port is not accessable in the /var/opt/novell/xad/log/dsfw_portchk.log.

The dsfw_portchk.sh script can be ran on PDC or ADC, running Novell DNS or not running Novell DNS.

The script can also e-mail and restart the services if desired.

It will detect if the server has IPv6 enabled so to properly detect the correct port Samba and NetBios is listening on.

The script detects if Novell DNS is configured to start.  Some times on ADC servers DNS is not configured or is not set to run.  The original script… Continue reading

Script to check DSfW Processes

I have a updated script to check all essential DSfW processes.  The name of the script is dsfw_processchk.  The script is great to use if you are worried that a DSfW process will stop and you don’t want to receive several phone calls alerting you to the problem or the DSfW server has been unstable you you need to time track down the invalid requests hitting the DSfW server.

The script will report which processes are running or have stopped.  It works by validating that a PID exists for each process.  If a process is not running the script has the option to restart the services, send an e-mail that a process has stopped, and update the syslog.

Key configuration

# Set RESTART_DSFW to 1 to reload DSfW services if one or service is not running,
# Set RESTART_DSFW to 0 to leave the services… Continue reading

Delete an attribute on all users with a script

Here is the bases of a script to delete an attribute on a user.

I come across issues where an attribute was populated on several users that shouldn’t be there or you want to create new objectsids or just remove the existing objectsids and replace them with a back up.

Most DSfW installs are a name mapped install meaning the install is mapped to an existing container in the tree.  If this is the case the domain name most likely will not patch to context in the tree and most likely the objectclass wit not be domain.  An example of a domain with the name of  novell.com mapped to a container with an objectclass of Organization (o=novell) and not domain (dc=novell).  Even it if is a dc most likely the fdn does not match the domain name.  Continuing with our example of novell.com that would… Continue reading

Categories