Friday, June 12, 2009

Troubleshooting Researching or Tracking Down Failed SQL Server Logins

I found such a dearth of usable information on this when I just searched for it on Google, that I thought I would write a post detailing the process I figured out.
For starters, I'm getting "Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. [CLIENT: xxx.xxx.xxx.xxx]" repeatedly on one of my SQL Servers. It's logged in a variety of places, namely the Application Log in Event Viewer, with the following info:
Event Type: Failure Audit
Event Source: MSSQLSERVER

Event Category: (4)
Event ID: 18456
Date: 6/12/2009
Time: 10:17:21 AM
User: NT AUTHORITY\ANONYMOUS LOGON

Computer:
Description:
Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. [CLIENT: XXX.XXX.XXX.XXX]

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:


And in the SQL Server Log, with this info:
Date 6/12/2009 9:45:21 AM
Log SQL Server (Current - 6/12/2009 9:45:00 AM)

Source Logon

Message

Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'. [CLIENT: XXX.XXX.XXX.XXX]

And a paired row stating:
Date 6/12/2009 9:45:21 AM
Log SQL Server (Current - 6/12/2009 9:45:00 AM
)

Source Logon


Message

Error: 18456, Severity: 14, State: 11.


While there are a couple of different things you can glean from these error messages, such as why it's failing (do a lookup based on the State. In this particular case State 11 means that the login is good, but the connection failed). You can see the state descriptions below:
State Error description
1 Account is locked out
2 User id is not valid
5 User id is not valid
7 The login being used is disabled
8 Incorrect password
9 Invalid password
11-12 Login valid but server access failed
16 Login valid, but not permissioned to use the target database
18 Password expired
27 Initial database could not be found
38 Login valid but database unavailable (or login not permissioned)

But that's not really what I wanted. I wanted the login attempts to stop. So I started profiler with a connection to the server with the failed logins. I started with a blank template, and added a single event:

Use Audit Login Failed under Security Audit, with all the columns.
Then look for the entry for the failed login:

It's kind of hard to see in the screenshot, but basically the important info you're looking for is in the ApplicationName, ClientProcessID, HostName, LoginName and/or SessionLoginName, and TextData.
In my particular case there wasn't any data in the ApplicationName, which could go a long way toward helping you figure it out. And I've seen suggestions online to use profiler to find this. Hopefully you'll be able to solve your problem using this column. The next column is what was really key for me:
ClientProcessID
Use this to go to the Task Manager on the client machine (found in the TextData column). If you only have an IP, use nbtstat -a or whois from the command line to find the machine name. Add the PID to the columns displayed in the Task Manager by choosing View->Select Columns from the menu:

Then check PID from the list that appears:

Then sort the Task Manager list by PID by clicking on it, and scroll down to the PID that matches the ClientProcessID column from Profiler:

In my case it was RpScheduler. This will lead you right to the offending service or client app. In my case I just disabled the service on the client machine because I don't need it. In other instances you may need to change the login info for the service, or the client application. This is really about finding the app, not so much about troubleshooting what you find.

Hopefully this helps someone, and feel free to comment with suggestions!