Hi All,
I have a table with two columns that I want to match but am unsure of
how to.
The Columns are called "User_Name" and Managed_By" the user_name is
entered as "Fred Flintstone" while the Managed_By is entered are
"Flintstone, Fred".
To the human eye you can see that they are the same person but how can
i do that match in SQL?
I am using a SQL 2000 server
Many Thanks
MarkSELECT U1.user_name, U1.managed_by, U2.user_name
FROM Users AS U1
LEFT JOIN Users AS U2
ON U2.user_name =
SUBSTRING(U1.managed_by,CHARINDEX(',',U1.managed_by)+2,40)+' '+
LEFT(U1.managed_by,CHARINDEX(',',U1.managed_by)-1)
Hopefully this will help you fix the table by adding keys and
referential integrity. It's obviously not a sensible design as it
stands.
David Portas
SQL Server MVP
--|||David,
Thanks for this,
I changed the table users to match my table name, however I got an
error massage when trying to excute the statments.
Any Ideas?
Cheers
Mark
<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>
(1 row(s) affected)
Server: Msg 536, Level 16, State 3, Line 1
Invalid length parameter passed to the substring function.
<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>|||I'd guess that the Managed_by column isn't consistent and that some of
the rows don't have commas between the two names. That's hardly
surprising if there are no constraints used. Try the following query to
take a look at the inconsistent data. Perhaps you'll have to clean it
up manually.
SELECT managed_by
FROM Users
WHERE managed_by NOT LIKE '_%,%_'
David Portas
SQL Server MVP
--|||David
I ran that query and there are 2210 rows which match, when i distinct
it it comes down to 14 names that match.
I then decided to run the first query agains only one name that meets
the critira "surname, firstname" the statment returned 24500 matches
but there is only 196 entries in the table
Any ideas?
Thanks
Mark|||If you need more help please post some code that will reproduce the
problem so that I and others don't have to keep guessing: CREATE TABLE
statement for the table and some INSERT statements of a few rows of
sample data.
It now seems you have duplicate rows in your table. As the data quality
is so poor it may be easier for just 196 rows to create a new table
(with a key), insert the distinct names and then check and re-key the
manager references by hand.
David Portas
SQL Server MVP
--
No comments:
Post a Comment