Troubleshooting

IOTest script to check if disk I/O is causing slow performance

Slow VM performance? Use IOTest to see whether disk I/O is the culprit.

This script tests disk I/O by writing 500 MB of data using the same block size that eDir uses and the same API call that eDir uses, “fdatasync”.
Each iteration writes 500 MB of data to iotest.log in the dib directory, usually /var/opt/novell/eDirectory/data/dib/.
Each run overwrites the previous data in iotest.log. Anything under 100 MB/s is a concern: it will slow eDirectory down and can lead to memory buildup. Slow I/O creates a bottleneck for events waiting to be written to disk. A buildup of memory in ndsd can cause ndsd to take all available memory (both virtual and resident), causing ndsd to core.
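The following is a minimal sketch of this kind of synced write test, not the original iotest script. It assumes GNU dd is installed. The function name io_write_test is hypothetical, and the 4096-byte block size is only an illustrative default, since the post does not state which block size eDir uses. This sketch calls fdatasync once at the end of the write, which may differ from how the original script behaves.

```shell
# Sketch of a synced sequential write test; NOT the original iotest script.
# Assumptions: GNU dd is available; BLOCK_SIZE (4096) is an illustrative
# value, because the post does not state the block size eDirectory uses.
BLOCK_SIZE=${BLOCK_SIZE:-4096}

# io_write_test DIR SIZE_MB
# Writes SIZE_MB of zeros to DIR/iotest.log, overwriting any previous file,
# and prints the summary line that dd reports (bytes, seconds, rate).
io_write_test() {
    dir=$1
    size_mb=$2
    count=$(( size_mb * 1024 * 1024 / BLOCK_SIZE ))
    # conv=fdatasync makes dd call fdatasync() before it exits, so the
    # reported rate includes the time needed to reach the disk.
    dd if=/dev/zero of="$dir/iotest.log" bs="$BLOCK_SIZE" count="$count" \
        conv=fdatasync 2>&1 | tail -n 1
}
```

For example, io_write_test /var/opt/novell/eDirectory/data/dib 500 would write 500 MB into the dib directory mentioned above; compare the rate dd prints against the 100 MB/s guideline.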

If the iotest script shows slow I/O writes, begin the process of adding hard drives and reducing the… Continue reading

New DSfW Monitor Script

I previously created two scripts, dsfw_processcheck.sh and dsfw_portchk.sh, one to monitor pids and one to monitor ports. Together, the two scripts help ensure the DSfW services are up. A new script combines the two and adds additional options. The script not only checks pids and ports, it can also create a cron job that runs the script every 10 minutes when given the “add” switch. To remove the cron job, use the “rm” switch.
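As a rough illustration of how “add” and “rm” style switches can manage a 10-minute cron entry, here is a sketch that is not taken from the script itself. The script path and the function names cron_add and cron_rm are hypothetical. Both functions work as filters on crontab text, so nothing is changed until the result is piped back into crontab.

```shell
# Sketch of add/rm handling for a cron entry; NOT the author's code.
# SCRIPT_PATH is a hypothetical install location.
SCRIPT_PATH=${SCRIPT_PATH:-/usr/local/bin/dsfw_monitor.sh}
CRON_LINE="*/10 * * * * $SCRIPT_PATH"

# cron_rm: read a crontab on stdin, print it without any line that
# mentions the script.
cron_rm() {
    # grep -v exits non-zero when nothing is left; that is not an error here.
    grep -vF "$SCRIPT_PATH" || true
}

# cron_add: read a crontab on stdin, print it with exactly one entry
# that runs the script every 10 minutes.
cron_add() {
    cron_rm
    printf '%s\n' "$CRON_LINE"
}
```

A possible way to apply it is crontab -l 2>/dev/null | cron_add | crontab - and, to remove the entry again, crontab -l | cron_rm | crontab -.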

If a DSfW server, whether or not it is running DNS, has a DSfW-specific process stop or crash, a quick stopgap measure is to monitor the DSfW processes and restart them if one or more of them stop.

If the DSfW server is an Additional Domain Controller (ADC), DNS might not be configured on the server. If DNS is not running on the… Continue reading

NDSD Health Check Script

I’ve received a great deal of feedback on the DSfW Health Check Script and applied some changes. One of the suggestions was to create an ndsd-only (eDirectory) script. The DSfW Health Check Script works for both DSfW and eDirectory servers, but if all you want to do is check eDirectory health on a DSfW server, or you want a smaller and simpler script just for ndsd, this is an option.

I am always looking for suggestions. I’ve created a video for the ndsd_healthchk script. Watch it to learn about configuring the script for your specific needs.

Look for NDSD Health Check in the download section.

The configuration options are as follows:

# Set emailsetting to 1 to send e-mail log when finished. Set to 0 or remove the 1 to disable
emailsetting=0

# Set emailonerror to 1 to send e-mail log if an error is returned. Set to… Continue reading

Latest DSfW Health Check Script

I’ve received a great deal of feedback on the DSfW Health Check Script and applied some changes.
I am always looking for suggestions. I’ve created an updated video with the latest script. Watch it to learn about configuring the script for your specific needs.

 

Troubleshooting High Utilization – High Utilization Gstack tool

Sometimes ndsd or another process can cause a server to go into high utilization or become unresponsive. A great TID to follow for OES servers is TID 7007332 – Troubleshooting ndsd becoming unresponsive on OES Linux. A TID specific to DSfW servers to start with is TID 7010462 – Troubleshooting slow logins and unresponsive DSfW server.

When troubleshooting a process that is stuck in high utilization or is causing a server to slow down or become unresponsive, look at top output for the daemon (ndsd, for example) with individual threads shown, together with a correlating gstack. The two together show which thread is in high utilization and what that thread is doing. In most cases it is best to take a number of gstacks, one every 10 to 60 seconds depending on the situation. We can see not only what that thread is doing but if the… Continue reading
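The capture loop described above might look something like the sketch below. This is not the tool from the post. The function name capture_stacks and the output file names are hypothetical, and the defaults assume the procps version of top and the gstack wrapper that ships with gdb. Both commands can be overridden through TOP_CMD and STACK_CMD.

```shell
# Sketch: capture per-thread top output plus a stack dump several times.
# NOT the tool from the post. The defaults assume procps top and gdb's
# gstack; both can be overridden through TOP_CMD and STACK_CMD.
TOP_CMD=${TOP_CMD:-top -b -H -n 1 -p}
STACK_CMD=${STACK_CMD:-gstack}

# capture_stacks PID COUNT INTERVAL_SECONDS OUTDIR
capture_stacks() {
    pid=$1; count=$2; interval=$3; outdir=$4
    i=1
    while [ "$i" -le "$count" ]; do
        stamp=$(date +%Y%m%d-%H%M%S)
        # top -H lists individual threads, so a busy thread ID can be
        # matched against the LWP numbers shown in the stack dump.
        $TOP_CMD "$pid" > "$outdir/top.$i.$stamp.txt" 2>&1
        $STACK_CMD "$pid" > "$outdir/gstack.$i.$stamp.txt" 2>&1
        # Wait between samples, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$(( i + 1 ))
    done
}
```

For instance, capture_stacks 1234 6 10 /tmp/stacks would take six samples ten seconds apart for the placeholder PID 1234, assuming /tmp/stacks already exists.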

DSfW and eDirectory Health Check

It is a good idea to periodically check the health of DSfW and eDirectory servers.

This video concentrates on a script I wrote that can be run on both eDirectory and DSfW servers.

The script demonstrated in this video is called dsfw_edir_healthchk.sh.  To get the latest version of the script click on the DSfW Health Check link in the download section on DSfWDude.com.

A great TID to start off with for an eDirectory health check is TID 3564075.
On a DSfW server, start off with an eDirectory health check as well as TID 7001884, which has DSfW-specific commands to check the health and overall operation of a DSfW server.

The script performs most of the checks suggested in both TIDs mentioned above, plus a few more.

For eDirectory there are 8 checks the script does and… Continue reading

Install error: ndsconfig error 74

Installs can be tricky, especially when installing into an existing tree that has been around since NetWare 4.11 and has multiple partitions, several locations, and dozens of servers. If the tree is not healthy, the install of DSfW has a greater chance of failure. If communication with all servers is good, the tree is healthy, and the Preparing for Domain Services for Windows Install TID is followed, then the install usually goes through without any issues.

If there is a failure, a common error is ndsconfig error 74. This video goes over the error. The troubleshooting for this error can also be applied to a similar error, “ndsconfig error 80”.

DSfW Slow Performance/Group Types

DSfW, like AD, has multiple group types. The type is stored in the grouptype attribute. TID 7004405 goes over the three group types.

Domain Local group: -2147483644
Global group: -2147483646
Universal group: -2147483640

The default group type is Universal group.   This group type can generate a lot of extra traffic causing the performance of the domain controller to suffer.

Global and Universal groups calculate a virtual attribute called tokenGroupsDomainLocal. This attribute is calculated for the group by the slapi layer. When a user is a member of several groups, login times can increase. ndsd utilization can also increase because of the tokenGroupsDomainLocal calculation when a large number of groups reside within the domain.

If ndsd utilization is high or login times need to be reduced, change groups to Domain Local groups to avoid the calculation of the tokenGroupsDomainLocal virtual attribute.
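To make the three values easier to work with, here is a small sketch that maps a grouptype value to its name and flags the types that, according to the description above, trigger the tokenGroupsDomainLocal calculation. The function names are hypothetical and nothing here queries the directory; the values are simply the ones listed in this post.

```shell
# Sketch: classify grouptype values listed in this post.

# group_type_name VALUE
# Prints the group type name for a grouptype value.
group_type_name() {
    case "$1" in
        -2147483644) echo "Domain Local" ;;
        -2147483646) echo "Global" ;;
        -2147483640) echo "Universal" ;;
        *)           echo "Unknown" ;;
    esac
}

# needs_token_calc VALUE
# Succeeds for the group types that the post says calculate the
# tokenGroupsDomainLocal virtual attribute (Global and Universal).
needs_token_calc() {
    case "$(group_type_name "$1")" in
        Global|Universal) return 0 ;;
        *)                return 1 ;;
    esac
}
```

A check such as needs_token_calc -2147483640 && echo "candidate for Domain Local" could then be used to pick out groups worth reviewing.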

Here is a… Continue reading

Diagnostic tool for DNS Records

The DSfW team has a great tool called check-dns.pl to help diagnose DNS issues with DSfW.

The tool validates essential records for forward and reverse lookups.  This tool can be found at Novell Coolsolutions.

The tool might incorrectly report PDC and DC records if there is more than one Domain Controller.  The Coolsolutions article will be updated with a new check-dns.pl to address this issue.

Until the Coolsolutions article is updated, you can download the tool from dsfwdude.com.

Download

Script to check if ports are listening

If you are concerned about a DSfW service going down or its port not being accessible, this script will help keep the services up or notify you when a service goes down. The script checks whether each DSfW service is listening, then telnets to each port. If it cannot telnet to a port, the script logs which port is not accessible in /var/opt/novell/xad/log/dsfw_portchk.log.
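A connection test of this kind can be sketched as follows. This is not dsfw_portchk.sh: it swaps telnet for bash's /dev/tcp redirection wrapped in the coreutils timeout command, and it assumes both are installed. The function names port_open and check_ports are hypothetical.

```shell
# Sketch of a TCP port check with logging; NOT dsfw_portchk.sh.
# Uses bash's /dev/tcp instead of telnet (an assumption of this sketch).

# port_open HOST PORT
# Succeeds if a TCP connection can be opened within 2 seconds.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# check_ports LOGFILE HOST PORT...
# Appends one line per unreachable port to LOGFILE and returns the
# number of ports that could not be reached.
check_ports() {
    logfile=$1; host=$2; shift 2
    failed=0
    for port in "$@"; do
        if ! port_open "$host" "$port"; then
            echo "$(date '+%F %T') port $port on $host is not accessible" >> "$logfile"
            failed=$(( failed + 1 ))
        fi
    done
    return "$failed"
}
```

As an illustration, check_ports /var/opt/novell/xad/log/dsfw_portchk.log 127.0.0.1 389 88 would test two ports; the port list here is only an example and is not the list the real script checks.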

The dsfw_portchk.sh script can be run on a PDC or an ADC, whether or not it is running Novell DNS.

The script can also e-mail and restart the services if desired.

It detects whether the server has IPv6 enabled so that it checks the correct ports that Samba and NetBIOS are listening on.

The script detects whether Novell DNS is configured to start. Sometimes on ADC servers DNS is not configured or is not set to run. The original script… Continue reading

Delete an attribute on all users with a script

Here is the basis of a script to delete an attribute on a user.

I come across issues where an attribute that shouldn’t be there was populated on several users, or where you want to create new objectsids, or where you just want to remove the existing objectsids and replace them with a backup.
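One common way to remove an attribute from many users is to generate LDIF modify records and feed them to ldapmodify. The sketch below only builds the LDIF; it is not the script from this post. The function name delete_attr_ldif is hypothetical, and the attribute name in the usage note is an assumption based on the objectsids mentioned above. DNs that contain non-ASCII characters would need base64 encoding, which this sketch does not handle.

```shell
# Sketch: build LDIF that deletes one attribute from a list of users.
# NOT the script from this post.

# delete_attr_ldif ATTRIBUTE
# Reads one DN per line on stdin and prints an LDIF modify record per DN
# that deletes every value of ATTRIBUTE. Blank input lines are skipped.
delete_attr_ldif() {
    attr=$1
    while IFS= read -r dn; do
        [ -n "$dn" ] || continue
        # A blank line terminates each LDIF record.
        printf 'dn: %s\nchangetype: modify\ndelete: %s\n\n' "$dn" "$attr"
    done
}
```

For example, printf 'cn=user1,o=novell\n' | delete_attr_ldif objectSid > delete.ldif writes a record for a hypothetical user, and the resulting file could then be reviewed and applied with ldapmodify -f delete.ldif plus whatever connection options the server requires.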

Most DSfW installs are name-mapped installs, meaning the domain is mapped to an existing container in the tree. In that case the domain name most likely will not match the context in the tree, and the objectclass most likely will not be domain. For example, a domain named novell.com might be mapped to a container with an objectclass of Organization (o=novell) rather than domain (dc=novell). Even if it is a dc, the FDN most likely does not match the domain name. Continuing with our example of novell.com that would… Continue reading

Script to monitor DSfW processes and restart services

If a DSfW server running DNS has a DSfW-specific process stop or crash, a quick stopgap measure is to monitor the DSfW processes and restart them if one or more of them stop. I created a simple script that checks that a pid exists for each process. The script is called dsfw_monitor.sh. It does not restart DSfW in every condition (for example, if a process continues to run but is not responding, or if a process crashes but its pid is never cleaned up), but it does work for most situations.
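A pid check along these lines can be sketched as below. This is not dsfw_monitor.sh, and it assumes each monitored service writes a pid file, which the post does not confirm. The function names and any pid file paths are hypothetical. The kill -0 test sends no signal; it only asks whether the process exists, so run it as a user allowed to signal the service (typically root).

```shell
# Sketch of a pid-file liveness check; NOT dsfw_monitor.sh.
# Assumes each monitored service writes a pid file (not confirmed by the post).

# pid_alive PIDFILE
# Succeeds if PIDFILE is readable and the PID it holds is a live process.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 delivers no signal; it only tests that the process exists.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# stopped_services PIDFILE...
# Prints each pid file whose process is not running, one per line.
stopped_services() {
    for f in "$@"; do
        pid_alive "$f" || echo "$f"
    done
}
```

If stopped_services prints anything, the monitor would then run whatever restart command is appropriate for that server; that step is left out here because it depends on the installation.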

Create a cron job to run the script every hour, 30 minutes, 10 minutes, or whatever interval you desire. My recommendation is to not go below 5 minutes, since eDirectory might take several minutes to stop and start again.

To create a cronjob use the crontab command with the -e… Continue reading
